[Slony1-general] Working out who is master after failover

Tue May 3 09:22:20 PDT 2005

>> However, after a failover, there's a problem.  Both nodes think
>> they're the master node according to the criteria above, because the
>> old master hasn't received the news of his demotion.  Is there a
>> simple way of working out on the "abandoned" node that it has indeed
>> been abandoned?

> Consider some scenarios:
>
> 1.  Node #1 is in Ottawa; other nodes are in Toronto.  Failover due to
> persistent network failure.
>
> The network falls over, and we decide that the Ottawa data source must
> be abandoned.  The database host and its database are undamaged, and
> since we had no way to tell node #1 that it got stepped on, it thinks
> it's running fine.
>
> Note that in this case, no data was ever corrupted in any way.
>
> Note also that since the network was dead, we had no way to tell node #1
> that it has been abandoned.
>
> Supposing there are some client machines in Ottawa, they might be able
> to talk to node #1 even after it is abandoned, as they were on the
> subnet there.  Could be trouble...

Yup, that's the scenario I'm worried about.

My application has a somewhat special view of what "corrupted" means.  We 
have a constant stream of write-once-read-many perishable data.  If I lose 
10 minutes of data, the entire database might as well be corrupted, because 
it's incomplete (which is in many ways worse than unavailable).

If I fail over from node 1 to node 2 as soon as I've detected that node 1 
has lost connectivity, node 2 can take over as master with no data loss.  I 
want my client applications that feed the data in to *realise* that node 2 
is now master and the data can be fed to it, not node 1.  I'd prefer that 
realisation to be stateless -- i.e. they don't have to remember who is 
currently master.

The nightmare is that node 1 suddenly comes back online, and the 
applications start feeding data to node 1 instead of node 2 because node 1 
then looks like a master node.  I then have *two* incomplete databases!

One workaround would be for the applications feeding the data to check that 
there is one and only one master by executing the query

select "_T1".getlocalnodeid('_T1') =
  (select set_origin from "_T1".sl_set where set_id = 1)

on *all* the hosts every time they want to write data.  If more than one 
claims to be master, something's wrong.  But that's a little inefficient 
when the first host they try is usually the master.
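
To make the workaround concrete, here is a rough sketch of the client-side 
check in Python.  It is only an illustration: find_sole_master, 
MasterAmbiguity and the claims_origin callback are names made up for this 
example, and claims_origin is assumed to run the query above on one host 
and return its boolean result (raising OSError if the host is unreachable).

```python
from typing import Callable, Iterable

# Origin check from the query above ("_T1" and set_id 1 are this cluster's
# values). A real claims_origin callback would execute this on each host.
ORIGIN_CHECK_SQL = (
    "select \"_T1\".getlocalnodeid('_T1') = "
    "(select set_origin from \"_T1\".sl_set where set_id = 1)"
)


class MasterAmbiguity(Exception):
    """Zero hosts, or more than one host, claim to be the origin."""


def find_sole_master(hosts: Iterable[str],
                     claims_origin: Callable[[str], bool]) -> str:
    """Return the single host that claims origin, or raise MasterAmbiguity.

    claims_origin(host) is a hypothetical caller-supplied function that runs
    ORIGIN_CHECK_SQL on that host and returns True or False. It is assumed
    to raise OSError when the host cannot be reached.
    """
    claimants = []
    for host in hosts:
        try:
            if claims_origin(host):
                claimants.append(host)
        except OSError:
            # Unreachable from this client, so we could not write to it
            # anyway; skip it and keep asking the remaining hosts.
            continue
    if len(claimants) != 1:
        raise MasterAmbiguity(f"expected exactly one master, got {claimants}")
    return claimants[0]
```

This still costs one query per host per write, which is the inefficiency 
I mentioned.  It also only helps clients that can reach every node: a 
client that can see nothing but the abandoned node would still find 
exactly one claimant.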

Julian