[Slony1-general] Working out who is master after failover

Wed May 4 09:13:45 PDT 2005

From: "Andrew Sullivan" <ajs at crankycanuck.ca>

> If I understand your question correctly, it translates to, "If node A
> doesn't know event E has happened, how does node A know E has
> happened?"  The answer, of course, is, "It doesn't."  By definition,
> it can't always know that it has failed from the point of view of
> those outside itself, because the definition of "failure" has to do
> with the state of those other cluster members.

So if I'm alone in my office, fall asleep, then wake up wondering whether 
I'm still boss (hey, this is an analogy, give me some poetic licence ;-)), I 
can't tell the difference between a situation where my colleagues have 
decided to leave me in peace to get some work done, and one where my 
colleagues have locked the door to my office and replaced me as boss because 
I was asleep -- unless of course I try the door, which breaks the paradigm 
by turning me into a door-checker.  Fair enough, but in the latter 
situation, it would have been polite for them to have sent me an email 
saying "you're fired".

My understanding of Slony's switch/failover (please correct it if it's 
wrong!) is that:

1) Switchover with MOVE SET seems to require the cooperation of the old 
origin at the time of the switchover.  If I try to do:

lock set (id = 1, origin = 1);
wait for event (origin = 1, confirmed = 2);
move set (id = 1, old origin = 1, new origin = 2);
wait for event (origin = 1, confirmed = 2);

then I need node 1 to stay alive until the transfer is complete.

2) Failover with FAILOVER makes no attempt to contact the old origin node 
because it's assumed that the old origin is incommunicado. It just grabs the 
set.
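
For concreteness, I believe the failover case would look something like the 
following, reusing the node numbers from the MOVE SET example (node 1 as the 
failed origin, node 2 as the backup).  This is only a sketch from my reading 
of the slonik documentation, with the usual preamble (cluster name, admin 
conninfo) left out as before:

failover (id = 1, backup node = 2);

Note that, if I read it correctly, id here is the failed node rather than 
the set, unlike the id in MOVE SET.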

Isn't there a useful middle ground?  A failover that includes an 
asynchronous message for the old origin to say "you're fired", or more 
precisely, "you are no longer the origin for set id = 1".  Perhaps I'm 
missing the point.  Perhaps MOVE SET does exactly that(?)

That won't fix a situation where a client can connect to node 1 and node 2 
but node 1 has no connectivity to node 2. So when node 1 wakes it starts 
serving the client before it gets the "you're fired" message.  But for 
situations where node 1 has simply been offline for a while, which I think 
are likely to be more common, it would make sense to be able to tell it of 
its situation when it wakes.

Or can I do that simply by preceding my FAILOVER instruction with a MOVE SET 
or even a DROP NODE, which node 1 will receive when it wakes?

Julian