At server startup, the Slony daemon (slon) crashed with the following error:
FATAL localListenThread: "select "_xxxx".cleanupNodelock(); insert into "_xxxx".sl_nodelock values ( 9151, 0,
"pg_catalog".pg_backend_pid()); " - ERROR: duplicate key value violates unique constraint "sl_nodelock-pkey"
DETAIL: Key (nl_nodeid, nl_conncnt)=(9151, 0) already exists.
2013-07-23 07:10:16 UTC FATAL Do you already have a slon running against this node?
2013-07-23 07:10:16 UTC FATAL Or perhaps a residual idle backend connection from a dead slon?
2013-07-23 07:10:16 UTC DEBUG2 slon_abort() from pid=1699
2013-07-23 07:10:16 UTC FATAL main: localListenThread did not start
2013-07-23 07:10:16 UTC CONFIG slon: child terminated signal: 9; pid: 1699, current worker pid: 1699
2013-07-23 07:10:16 UTC INFO slon: done
I know where the duplicate key error comes from, but I expected the watchdog to retry starting the daemon every 10 seconds.
In my case, the watchdog process seems to have died along with the child process.
The proposed solution is to add a parameter to the Slony configuration file that sets how many times the watchdog process will retry before it gives up.
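
To make the requested behaviour concrete, here is a minimal sketch of a watchdog loop with a bounded retry count. It is written in Python for brevity and is not the slon implementation. The names `run_watchdog`, `start_child`, `max_retries` and `retry_delay` are hypothetical and are not taken from the Slony sources or from any patch. The sketch assumes the child reports failure through a non-zero exit code and that the retry interval is the 10 seconds mentioned above.

```python
import time
from typing import Callable


def run_watchdog(start_child: Callable[[], int],
                 max_retries: int,
                 retry_delay: float = 10.0,
                 sleep: Callable[[float], None] = time.sleep) -> int:
    """Restart the child until it exits cleanly or the retry budget is spent.

    start_child runs the worker and returns its exit code (hypothetical hook).
    max_retries is the number of restarts allowed after the first failure.
    Returns the total number of times the child was started.
    """
    attempts = 0
    while True:
        attempts += 1
        exit_code = start_child()
        if exit_code == 0:
            return attempts      # clean shutdown, nothing to restart
        if attempts > max_retries:
            return attempts      # retry budget exhausted, watchdog gives up
        sleep(retry_delay)       # wait before the next restart attempt
```

With `max_retries=3` and a child that always fails, the child is started four times (one initial start plus three retries) with a 10 second pause between attempts. Passing `sleep` as a parameter is only a convenience so the loop can be exercised without real delays.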
Patch for feature implementation: https://github.com/wieck/slony1-engine/commit/e4285eba5740dfe535925af232086e0ab3d0077b
This looks fine to me.