Julian Scarfe julian
Thu May 5 20:06:44 PDT 2005
From: "Andrew Sullivan" <ajs at crankycanuck.ca>
>
> The problem is more like Descartes doing all of what you just said --
> you can't be sure that the evil demon isn't fooling with you, either.
> Because you don't know whether all your communication has been cut
> off (because "cut off" is defined by what _others_ see, not what you
> do), the interpretation of "I didn't get an email" is indeterminate:
> it may mean, "they didn't send one," or it might mean, "it was sent
> and I didn't get it."  And without also being a door-checker, network
> monitor, electricity monitor, gaps in space-time checker, and other
> such things, you have no way of finding out.

But the interpretation of "I *did* get an email" is unambiguous.  In other
words, I can never be 100% certain that I am still boss, but I can be 100%
certain that I'm not.

>> Isn't there a useful middle ground?  A failover that includes an
>> asynchronous message for the old origin to say "you're fired", or more
>> precisely, "you are no longer the origin for set id = 1".  Perhaps I'm
>> missing the point.  Perhaps MOVE SET does exactly that(?)
>
> It doesn't, no.  And while everyone would agree, I think, that a
> middle ground would be nice, there's no way to do it that will work,
> and that won't run the risk of the system coming "back online" in the
> disastrous way you're worried about.

Yes, I can see that, now (and my brain hurts...).  You can never be 100%
certain that when you plug the old master's network cable back in, the first 
thing that happens isn't the smiling happy face of a client wanting to do 
business, which is indistinguishable for the master from a "normal" 
transaction, and is therefore indistinguishable for the client.  So in an 
ideal world, we would always be careful to make sure that can't happen.

But even without a cast-iron guarantee, I still think a best-effort delivery
of a "you are no longer master" event would be a useful feature that would
keep data loss to a minimum, for example by locking the set -- failover is,
after all, a potentially lossy event anyway.
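
To make the idea concrete, here is a rough sketch in Python. It is purely illustrative and is not anything Slony-I provides: the names OldOrigin, on_demotion_notice, deliver_best_effort and SetLockedError are all made up for this example, and the "link_up" flag stands in for whatever transport would carry the notice. It shows the asymmetry discussed above: a node that has received the notice knows for certain it is demoted and locks the set, while a node that has heard nothing can only presume it is still the origin.

```python
from enum import Enum


class Role(Enum):
    # No notice seen. The node may in fact be stale; it cannot tell.
    PRESUMED_ORIGIN = "presumed origin"
    # A notice was received. This state is certain and never reverts here.
    DEMOTED = "demoted"


class SetLockedError(Exception):
    """Raised when a write reaches a node that knows it has been demoted."""


class OldOrigin:
    """Hypothetical model of the former origin of one replication set."""

    def __init__(self, set_id):
        self.set_id = set_id
        self.role = Role.PRESUMED_ORIGIN
        self.rows = []

    def on_demotion_notice(self, set_id):
        # "You are no longer the origin for set id = N".
        # Notices for other sets are ignored.
        if set_id == self.set_id:
            self.role = Role.DEMOTED

    def write(self, row):
        # Once demoted, the set is locked: refuse the write outright.
        if self.role is Role.DEMOTED:
            raise SetLockedError(f"set {self.set_id} is locked on this node")
        self.rows.append(row)


def deliver_best_effort(node, set_id, link_up):
    """Try once to deliver the notice; report whether it got through.

    Assumption: link_up models a network that may or may not reach the
    old origin. There is no retry and no guarantee.
    """
    if not link_up:
        return False
    node.on_demotion_notice(set_id)
    return True
```

If the notice does not get through, the old origin keeps accepting writes exactly as before, so this narrows the window for lost data without closing it. Once the notice arrives, every later write fails instead of being silently accepted.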

I'm trying very hard to avoid the need to have yet another pair of servers 
sitting between the clients and the database cluster handling nothing but 
the "who's master" issue.

Julian 




More information about the Slony1-general mailing list