Glyn Astill glynastill at yahoo.co.uk
Tue Aug 6 08:33:58 PDT 2013
Hi Guys,

We're running Slony 2.1.3, and one of my slaves has failed.  The problem is that the failed node is also the provider for another downstream slave. Am I right in thinking I have to drop both the failed node and the downstream subscriber?

My setup basically looks like this, where subscriber2 has failed:


origin ---> subscriber1
       ---> subscriber2 ---> subscriber3
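
To spell out which nodes depend on the failed provider, here is a minimal Python sketch of the diagram above. It is only a model of the topology, not anything Slony provides; the `providers` mapping and the `downstream_of` helper are names made up for illustration, and the node names are the labels from the diagram rather than real Slony node IDs.

```python
# Hypothetical model of the cluster: each subscriber maps to the node it
# receives data from. Labels come from the diagram, not from Slony itself.
providers = {
    "subscriber1": "origin",
    "subscriber2": "origin",
    "subscriber3": "subscriber2",
}


def downstream_of(failed, providers):
    """Return every node fed directly or indirectly through `failed`."""
    affected = []
    frontier = [failed]
    while frontier:
        current = frontier.pop()
        for node, provider in providers.items():
            if provider == current:
                affected.append(node)
                frontier.append(node)  # follow cascaded subscribers too
    return sorted(affected)


print(downstream_of("subscriber2", providers))
```

With subscriber2 gone, the only node left without a working provider in this model is subscriber3, which is the one I'm trying to re-point at the origin.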

        


First I tried to reshape the subscription on subscriber3, but this didn't work:

SUBSCRIBE SET ( ID=@my_set, PROVIDER = @origin, RECEIVER = @subscriber3, FORWARD = YES);

This failed with the following message:

glyn@x:/usr/share/slonik$ slonik reshape_provider.scr
reshape_provider.scr:3: could not connect to server: Connection refused
        Is the server running on host "10.16.10.101" and accepting
        TCP/IP connections on port 5432?

Where 10.16.10.101 is the IP of subscriber2. So I tried to just drop the node:

DROP NODE ( ID = @subscriber2, EVENT NODE = @origin );

And the following happened:

glyn@x:/usr/share/slonik$ slonik drop_node.scr
drop_node.scr:3: could not connect to server: Connection refused
        Is the server running on host "10.16.10.101" and accepting
        TCP/IP connections on port 5432?
waiting for events  (7,5014269532) only at (7,5014260307) to be confirmed on node 5
waiting for events  (7,5014269532) only at (7,5014260307) to be confirmed on node 5
waiting for events  (7,5014269532) only at (7,5014260307) to be confirmed on node 5
waiting for events  (7,5014269532) only at (7,5014260307) to be confirmed on node 5
waiting for events  (7,5014269532) only at (7,5014260307) to be confirmed on node 5


Where "node 5" is subscriber3.
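
To put a number on how far behind node 5 is, the two event numbers in that message can be subtracted. The following is a rough Python sketch that parses the line exactly as slonik printed it above; the regular expression and the `confirmation_gap` helper are my own illustration, and I'm assuming the message means "waiting for the first event, but the node has only confirmed up to the second".

```python
import re

# Pattern for the slonik progress line quoted above (assumed wording; it is
# matched only against the output shown in this message).
WAIT_LINE = re.compile(
    r"waiting for events?\s+\((\d+),(\d+)\) only at \((\d+),(\d+)\) "
    r"to be confirmed on node (\d+)"
)


def confirmation_gap(line):
    """Return the event node, the waiting node and the unconfirmed gap."""
    match = WAIT_LINE.search(line)
    if match is None:
        return None
    event_node, wanted, _confirmed_node, confirmed, node = map(int, match.groups())
    return {
        "event_node": event_node,
        "waiting_node": node,
        "gap": wanted - confirmed,  # difference in event sequence numbers
    }


line = ("waiting for events  (7,5014269532) only at (7,5014260307) "
        "to be confirmed on node 5")
print(confirmation_gap(line))
```

For the line above the gap works out to 9225 event sequence numbers, and it never shrinks between repeats of the message.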

So now slonik is waiting for subscriber3 to catch up, but subscriber3 is still trying to sync from subscriber2, which has gone.  Here's the log from subscriber3:

2013-08-06_163034 BSTERROR  slon_connectdb: PQconnectdb("dbname=SEE host=10.16.10.101 user=slony") failed - could not connect to server: Connection refused
        Is the server running on host "10.16.10.101" and accepting
        TCP/IP connections on port 5432?
2013-08-06_163034 BSTWARN   remoteListenThread_4: DB connection failed - sleep 10 seconds
2013-08-06_163034 BSTDEBUG2 remoteWorkerThread_7: SYNC 5014260308 processing
2013-08-06_163034 BSTERROR  slon_connectdb: PQconnectdb("dbname=SEE host=10.16.10.101 user=slony") failed - could not connect to server: Connection refused
        Is the server running on host "10.16.10.101" and accepting
        TCP/IP connections on port 5432?
2013-08-06_163034 BSTERROR  remoteWorkerThread_7: cannot connect to data provider 4 on 'dbname=SEE host=10.16.10.101 user=slony'
2013-08-06_163034 BSTDEBUG2 remoteListenThread_7: queue event 7,5014270211 SYNC
2013-08-06_163034 BSTDEBUG2 remoteWorkerThread_8: forward confirm 7,5014270210 received by 8
2013-08-06_163036 BSTDEBUG2 syncThread: new sl_action_seq 1 - SYNC 5005139878
2013-08-06_163036 BSTDEBUG2 remoteListenThread_7: queue event 7,5014270212 SYNC
2013-08-06_163036 BSTDEBUG2 remoteListenThread_8: queue event 8,5013135166 SYNC
2013-08-06_163036 BSTDEBUG2 remoteWorkerThread_8: Received event #8 from 5013135166 type:SYNC
2013-08-06_163036 BSTDEBUG1 calc sync size - last time: 1 last length: 10069 ideal: 5 proposed size: 3
2013-08-06_163036 BSTDEBUG2 remoteWorkerThread_8: SYNC 5013135166 processing
2013-08-06_163036 BSTDEBUG1 remoteWorkerThread_8: no sets need syncing for this event
2013-08-06_163036 BSTDEBUG2 remoteWorkerThread_8: forward confirm 7,5014270211 received by 8
2013-08-06_163042 BSTDEBUG2 localListenThread: Received event 5,5005139878 SYNC
2013-08-06_163042 BSTDEBUG2 remoteListenThread_7: queue event 7,5014270213 SYNC
2013-08-06_163042 BSTDEBUG2 remoteListenThread_7: queue event 7,5014270214 SYNC
2013-08-06_163042 BSTDEBUG2 remoteListenThread_7: queue event 7,5014270215 SYNC
2013-08-06_163042 BSTDEBUG2 remoteWorkerThread_8: forward confirm 5,5005139878 received by 8
2013-08-06_163042 BSTDEBUG2 remoteWorkerThread_8: forward confirm 7,5014270214 received by 8



So what do I do?  I presume I'll be waiting forever, so do I kill slonik and drop subscriber3 too?