Brian A. Seklecki lavalamp at spiritual-machines.org
Fri May 22 09:49:51 PDT 2009
All:

This problem with our slon(8) daemons continues to vex us.  During a
switchover, we see "no worker thread" warnings:

 2009 May 22 06:37:17 -04:00 bdb01 [slon][55352] [local2] [err] 
 slon[55352]: [12-1] [55352] CONFIG storeSet: set_id=1 set_origin=3
 set_comment='All CORES tables'
 2009 May 22 06:37:17 -04:00 bdb01 [slon][55352] [local2] [warning]
 slon[55352]: [13-1] [55352] WARN   remoteWorker_wakeup: node 3 - no
 worker thread

Followed by:


 2009 May 22 06:37:17 -04:00 bdb01 [slon][55352] [local2] [err]
 slon[55352]: [19-1] [55352] FATAL  localListenThread: "select 
 "_DBNAME".cleanupNodelock(); insert into
 2009 May 22 06:37:17 -04:00 bdb01 [slon][55352] [local2] [err]
 slon[55352]: [19-2]  "_DBNAME".sl_nodelock values (   
 2, 0, "pg_catalog".pg_backend_pid()); " - ERROR:  duplicate key value
 violates

The screwed-up thing is that, as far as we know, all three slon(8)
daemons on all three configurations are active, healthy, and responding
before we execute the switchover.

We know because we have Nagios watching SYNC events and checking that
sl_log table row counts are within acceptable ranges.
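
For illustration, a row-count check of this kind might look roughly like
the sketch below.  It is not our actual plugin: the function name and the
thresholds are hypothetical, and the row count is assumed to have been
fetched already (for example with a count(*) against one of the sl_log
tables).  Only the exit codes follow the standard Nagios plugin convention.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def check_row_count(rows, warn, crit):
    """Map an sl_log row count to a Nagios (exit_code, message) pair.

    `warn` and `crit` are hypothetical thresholds chosen by the operator;
    `rows` is assumed to come from a separate query against the database.
    """
    if rows >= crit:
        return CRITICAL, f"CRITICAL - sl_log rows={rows} (>= {crit})"
    if rows >= warn:
        return WARNING, f"WARNING - sl_log rows={rows} (>= {warn})"
    return OK, f"OK - sl_log rows={rows}"
```

A wrapper script would print the message and exit with the returned code
so that Nagios can pick up the state.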

Any advice on further troubleshooting this?  Maybe attach ktrace(8)
to the process and try to re-create the error?
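
While trying to re-create the error, it may help to pull the two
signatures quoted above out of the slon logs automatically.  The sketch
below is only an assumption about how that could be done: the function
name and event labels are made up, and the patterns are taken directly
from the log excerpts in this message, so they may need adjusting for
other log formats.

```python
import re

# Patterns derived from the log excerpts quoted above (not from Slony docs).
_SIGNATURES = [
    ("no_worker_thread",
     re.compile(r"remoteWorker_wakeup: node (\d+) - no worker thread")),
    ("nodelock_conflict",
     re.compile(r"FATAL localListenThread:[^;]*cleanupNodelock\(\)")),
]


def scan_slon_log(text):
    """Return (kind, node_id_or_None) events in the order they appear."""
    # Collapse all whitespace so messages wrapped across lines still match.
    flat = re.sub(r"\s+", " ", text)
    events = []
    for kind, pattern in _SIGNATURES:
        for m in pattern.finditer(flat):
            node = int(m.group(1)) if m.groups() else None
            events.append((m.start(), kind, node))
    events.sort(key=lambda e: e[0])
    return [(kind, node) for _, kind, node in events]
```

Running this over the log around each switchover attempt would show
whether the "no worker thread" warning always precedes the sl_nodelock
failure, and for which node.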

We're running the latest Slony/PostgreSQL (postgresql-server-8.3.7 +
slony1-1.2.15) on FreeBSD 6/amd64.

~BAS


