Richard Yen dba at richyen.com
Mon Jan 25 21:12:44 PST 2010
On Jan 14, 2010, at 9:06 AM, Christopher Browne wrote:
>> Does the below line mean it is waiting for some kind of notification
>> from somewhere? :
>>  DEBUG2 ACCEPT_SET - MOVE_SET or FAILOVER_SET not received yet - sleep
> Yup, that indicates that node #3 hasn't completed the failover.  It
> hasn't fully accepted the new provider.
> I'm not sure what to suggest on that.

I'm actually experimenting with this issue on Slony 2.0.3 rc3.  There
seems to be a race condition (I'm still trying to pinpoint exactly
where it is): if you call the DROP_NODE command before the
FAILOVER_SET command has completed, the ACCEPT_SET event never makes
it into sl_event.

However, I have found that if I put a sleep before the DROP_NODE
command (even 1 second works in my environment), DROP_NODE succeeds
and everything proceeds happily.
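
A fixed sleep is the simplest form of this workaround. A slightly more robust variant would be to poll until the ACCEPT_SET event shows up in sl_event and only then issue DROP_NODE. The following is a minimal sketch of that idea, not something tested against a real cluster. It assumes the usual Slony-I layout where the cluster schema is named "_<cluster>" and sl_event has an ev_type column; the helper names wait_until and accept_set_query are made up for illustration, and which node's sl_event to query is left to the caller.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool],
               timeout: float = 30.0,
               interval: float = 1.0) -> bool:
    """Poll predicate() until it returns True or `timeout` seconds elapse.

    Returns True if the predicate succeeded, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def accept_set_query(cluster: str) -> str:
    """Build a query counting ACCEPT_SET events for a cluster.

    Assumption: the cluster schema is "_<cluster>" and the event type is
    stored as the string 'ACCEPT_SET' in sl_event.ev_type.
    """
    return ('SELECT count(*) FROM "_' + cluster + '".sl_event '
            "WHERE ev_type = 'ACCEPT_SET'")
```

To use it, the predicate would run accept_set_query() through whatever database client is at hand and return True once the count is non-zero; DROP_NODE is issued only after wait_until() returns True. With interval=1.0 this degrades to roughly the 1-second sleep described above when the event arrives promptly.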

I don't know exactly where the race condition is, but it appears that  
if DROP_NODE is called too early, the failoverSet_int() trigger never  
gets fired (or perhaps something deletes from sl_event prematurely).

I'm continuing to debug this with gdb; if anyone can point me to
where to look, that would be much appreciated.

--Richard

