Wed May 4 09:13:45 PDT 2005
- Previous message: [Slony1-general] Working out who is master after failover
- Next message: [Slony1-general] Working out who is master after failover
From: "Andrew Sullivan" <ajs at crankycanuck.ca>

> If I understand your question correctly, it translates to, "If node A
> doesn't know event E has happened, how does node A know E has
> happened?" The answer, of course, is, "It doesn't." By definition,
> it can't always know that it has failed from the point of view of
> those outside itself, because the definition of "failure" has to do
> with the state of those other cluster members.

So if I'm alone in my office, fall asleep, then wake up wondering whether I'm still boss (hey, this is an analogy, give me some poetic licence ;-)), I can't tell the difference between a situation where my colleagues have decided to leave me in peace to get some work done, and one where my colleagues have locked the door to my office and replaced me as boss because I was asleep -- unless of course I try the door, which breaks the paradigm by turning me into a door-checker.

Fair enough, but in the latter situation, it would have been polite for them to have sent me an email saying "you're fired".

My understanding of Slony's switchover/failover (please correct it if it's wrong!) is that:

1) Switchover with MOVE SET seems to require the cooperation of the old origin at the time of the switchover. If I try to do:

    lock set (id = 1, origin = 1);
    wait for event (origin = 1, confirmed = 2);
    move set (id = 1, old origin = 1, new origin = 2);
    wait for event (origin = 1, confirmed = 2);

then I need node 1 to stay alive until the transfer is complete.

2) Failover with FAILOVER makes no attempt to contact the old origin node because it's assumed that the old origin is incommunicado. It just grabs the set.

Isn't there a useful middle ground? A failover that includes an asynchronous message for the old origin to say "you're fired", or more precisely, "you are no longer the origin for set id = 1". Perhaps I'm missing the point. Perhaps MOVE SET does exactly that(?)

That won't fix a situation where a client can connect to node 1 and node 2 but node 1 has no connectivity to node 2. So when node 1 wakes, it starts serving the client before it gets the "you're fired" message. But for situations where node 1 has simply been offline for a while, which I think are likely to be more common, it would make sense to be able to tell it of its situation when it wakes.

Or can I do that simply by preceding my FAILOVER instruction with a MOVE SET or even a DROP NODE, which node 1 will receive when it wakes?

Julian
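
To make the "you're fired" idea concrete, here is a rough toy model in Python. It is only a sketch of the behaviour I'm asking about, not how Slony actually works: the names `Node`, `failover_with_notice`, `wake` and the `pending` queue are all made up for illustration. The new origin takes the set without waiting for the old one, a notice is queued, and the old origin only learns of it if it can reach its peers when it wakes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy replication node (hypothetical; not Slony's real structures)."""
    node_id: int
    # Ids of the sets this node currently believes it is the origin of.
    origin_of: set = field(default_factory=set)


def failover_with_notice(pending, set_id, old, new):
    """Hypothetical failover that also queues a notice for the old origin.

    The new origin grabs the set immediately, without contacting the old
    origin. The notice waits in `pending` until the old origin collects it.
    """
    new.origin_of.add(set_id)
    pending.setdefault(old.node_id, []).append((set_id, new.node_id))


def wake(pending, node, can_reach_peers):
    """Bring a node back; apply queued notices only if peers are reachable."""
    if not can_reach_peers:
        # Partitioned: the node keeps its stale belief and may serve clients.
        return
    for set_id, _new_origin in pending.pop(node.node_id, []):
        node.origin_of.discard(set_id)


# Usage: node 1 is the origin of set 1 and goes offline, node 2 takes over.
pending = {}
node1 = Node(1, {1})
node2 = Node(2)
failover_with_notice(pending, 1, node1, node2)

wake(pending, node1, can_reach_peers=False)  # node 1 still thinks it is origin
wake(pending, node1, can_reach_peers=True)   # node 1 reads the notice, steps down
```

The first `wake` call is the partition case that the notice cannot fix; the second is the "simply been offline for a while" case where it helps.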