Fri Jan 29 16:30:11 PST 2010
- Previous message: [Slony1-general] drop node not working correctly
- Next message: [Slony1-general] Cleanup operation
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
> On Jan 14, 2010, at 9:06 AM, Christopher Browne wrote:
>>> Does the below line mean it is waiting for some kind of notification
>>> from somewhere? :
>>> DEBUG2 ACCEPT_SET - MOVE_SET or FAILOVER_SET not received yet -
>>> sleep
>> Yup, that indicates that node #3 hasn't completed the failover. It
>> hasn't fully accepted the new provider.
>> I'm not sure what to suggest on that.
> The problem lies in the fact that when FAILOVER_SET is issued, the event origin is the failed node (see slony1_funcs.sql:1318). Subsequently, when the DROP_NODE command is issued to drop the failed node, it deletes from sl_event any rows where ev_origin = the failed node's ID (see slony1_funcs.sql:1078). Because the FAILOVER_SET row has the failed node as its ev_origin, it is deleted along with the rest. Also, note that the ACCEPT_SET event is generated by the FAILOVER_SET command, under failoverSet_int (see slony1_funcs.sql:1391).

After the DROP_NODE event happens on the backup node (the new provider), subscribers looking for events on the backup node will find an ACCEPT_SET event and attempt to confirm and process it. They ultimately fail, because they can no longer find, confirm, or process the FAILOVER_SET event: it was removed by the DROP_NODE event. Thus the subscriber node enters an infinite sleep/retry/fail loop, as you have experienced.

Therefore, in order to successfully drop the failed node, one must wait for the FAILOVER_SET event to propagate to the subscribers (at least <sync_interval> milliseconds) before issuing the DROP_NODE command.

The question I have is: shouldn't the ev_origin for FAILOVER_SET be the node ID of the backup node? It doesn't seem to make sense for a failed node to create an event; the system should record the backup node as the one creating the event, right? I suppose this would be the remedy for the problem.

--Richard
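To make the ordering problem described above concrete, here is a small toy model in Python. It is not Slony-I code: the `Event` and `Node` classes and all function names are hypothetical, and the assumption that ACCEPT_SET carries the backup node as its origin is inferred only from the fact that it survives DROP_NODE in the description above. The sketch shows only that deleting rows by ev_origin before the subscriber has seen FAILOVER_SET leaves the subscriber with an ACCEPT_SET it cannot act on.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    origin: int   # stands in for ev_origin
    type: str     # stands in for the event type


@dataclass
class Node:
    node_id: int
    sl_event: list = field(default_factory=list)  # toy stand-in for the sl_event table


def failover_set(backup: Node, failed_id: int) -> None:
    # Per the description: FAILOVER_SET is recorded with the FAILED node as origin.
    backup.sl_event.append(Event(failed_id, "FAILOVER_SET"))
    # Assumption of this model: ACCEPT_SET is recorded with the backup node as origin.
    backup.sl_event.append(Event(backup.node_id, "ACCEPT_SET"))


def drop_node(node: Node, failed_id: int) -> None:
    # Per the description: delete every event whose origin is the dropped node.
    node.sl_event = [e for e in node.sl_event if e.origin != failed_id]


def propagate(provider: Node, subscriber: Node) -> None:
    # The subscriber copies any events it has not seen yet from its provider.
    for e in provider.sl_event:
        if e not in subscriber.sl_event:
            subscriber.sl_event.append(e)


def can_accept_set(subscriber: Node) -> bool:
    # ACCEPT_SET can only be processed once FAILOVER_SET has been received.
    types = {e.type for e in subscriber.sl_event}
    return "ACCEPT_SET" in types and "FAILOVER_SET" in types


def run(drop_before_propagation: bool) -> bool:
    backup, subscriber = Node(2), Node(3)
    failover_set(backup, failed_id=1)
    if drop_before_propagation:
        drop_node(backup, failed_id=1)   # FAILOVER_SET is deleted too early
        propagate(backup, subscriber)
        return can_accept_set(subscriber)
    propagate(backup, subscriber)        # subscriber sees FAILOVER_SET first
    accepted = can_accept_set(subscriber)
    drop_node(backup, failed_id=1)
    return accepted


if __name__ == "__main__":
    print("drop first:", run(True))
    print("wait, then drop:", run(False))
```

In this model, `run(True)` corresponds to the stuck subscriber that keeps sleeping on ACCEPT_SET, and `run(False)` corresponds to waiting for propagation before issuing DROP_NODE.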