Richard Yen dba at richyen.com
Fri Jan 29 16:30:11 PST 2010
> On Jan 14, 2010, at 9:06 AM, Christopher Browne wrote:
>>> Does the below line mean it is waiting for some kind of notification
>>> from somewhere? :
>>> DEBUG2 ACCEPT_SET - MOVE_SET or FAILOVER_SET not received yet - sleep
>> Yup, that indicates that node #3 hasn't completed the failover.  It
>> hasn't fully accepted the new provider.
>> I'm not sure what to suggest on that.
>

The problem is that when FAILOVER_SET is issued, the event origin is
the failed node (see slony1_funcs.sql:1318). When the DROP_NODE
command is subsequently issued to drop the failed node, it deletes
from sl_event every row whose ev_origin is the failed node's ID (see
slony1_funcs.sql:1078). That includes the FAILOVER_SET row, since its
ev_origin is the failed node's ID.

Also, note that the ACCEPT_SET event is generated by the FAILOVER_SET  
command, under failoverSet_int (see slony1_funcs.sql:1391).

After the DROP_NODE event happens on the backup node (new provider),
the subscribers looking for events on the backup node will find an
ACCEPT_SET event and attempt to confirm and process it, but they
ultimately fail because they can no longer find/confirm/process the
FAILOVER_SET event, which was removed by the DROP_NODE event. Thus,
the subscriber node enters an infinite sleep/retry/fail loop, as you
have experienced.
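
To make the sequence concrete, here is a toy Python model of it. This
is not Slony-I code: the node IDs, the function names, and the `seen`
set are all made up for illustration, and sl_event is just a list of
dicts. It only mirrors the behaviour described above (FAILOVER_SET
carries the failed node as ev_origin, DROP_NODE deletes by ev_origin,
ACCEPT_SET needs FAILOVER_SET first).

```python
# Toy model only: plain Python data standing in for sl_event.
FAILED, BACKUP = 1, 2  # hypothetical node IDs


def failover_set(sl_event):
    # FAILOVER_SET is recorded with the failed node as its origin;
    # the ACCEPT_SET it generates is found on the backup node.
    sl_event.append({"ev_origin": FAILED, "ev_type": "FAILOVER_SET"})
    sl_event.append({"ev_origin": BACKUP, "ev_type": "ACCEPT_SET"})


def drop_node(sl_event, node_id):
    # DROP_NODE removes every event whose origin is the dropped node.
    sl_event[:] = [e for e in sl_event if e["ev_origin"] != node_id]


def subscriber_accept_set(sl_event, seen):
    # 'seen' holds event types the subscriber already received.
    visible = {e["ev_type"] for e in sl_event} | seen
    if "ACCEPT_SET" in visible and "FAILOVER_SET" not in visible:
        return "sleep"  # MOVE_SET or FAILOVER_SET not received yet
    return "ok"


def run(drop_before_propagation):
    sl_event, seen = [], set()
    failover_set(sl_event)
    if not drop_before_propagation:
        seen.add("FAILOVER_SET")  # event reached the subscriber in time
    drop_node(sl_event, FAILED)
    return subscriber_accept_set(sl_event, seen)
```

In this model, run(True) leaves the subscriber stuck in the "sleep"
branch, while run(False) lets ACCEPT_SET go through.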

Therefore, in order to successfully drop the failed node, one must  
wait for the FAILOVER_SET event to propagate to the subscribers (you  
must wait at least <sync_interval> milliseconds) before issuing the  
DROP_NODE command.
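
If you are scripting this, the ordering could be enforced with
something like the sketch below. Both callables are hypothetical
placeholders that you would have to supply yourself: is_propagated()
should return True once every subscriber has seen FAILOVER_SET, and
drop_node() should issue the actual DROP_NODE. The poll interval and
timeout values are arbitrary.

```python
import time


def wait_then_drop(is_propagated, drop_node, poll_s=0.5, timeout_s=60.0):
    """Call drop_node() only after is_propagated() returns True."""
    deadline = time.monotonic() + timeout_s
    while not is_propagated():
        # Give up rather than drop the node while the event is in flight.
        if time.monotonic() >= deadline:
            raise TimeoutError("FAILOVER_SET not yet seen by all subscribers")
        time.sleep(poll_s)
    drop_node()
```

The point of raising on timeout instead of dropping anyway is that a
premature drop is exactly what triggers the loop described above.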

My question is: shouldn't the ev_origin for FAILOVER_SET be the
nodeID of the backup node? It doesn't seem to make sense for a failed
node to create an event; the system should record the backup node as
the one creating the event, right? I suppose that would remedy the
problem.

--Richard


