bugzilla-daemon at main.slony.info
Tue Dec 14 13:20:45 PST 2010
http://www.slony.info/bugzilla/show_bug.cgi?id=171

--- Comment #2 from Steve Singer <ssinger at ca.afilias.info> 2010-12-14 13:20:45 PST ---
A few initial comments:

You state "If they don't, then presumably I'm a failed node, and I should stop
with a fatal error"

I'm not sure we can presume that.  The problem might be with the provider node,
not the failed node.

Consider a cluster of the form

b<--a--->c

If something happens to node a and you want to fail over from a to b, then
node c might need to learn about the cluster changes from node b via slon.
You don't want node c exiting on startup when it could still talk to b.

Similarly, consider the case

a-->b-->c

The provider 'b' might be the one with the problem, not node c.  You can't
assume that node c (the local node) is gone.


Your patch also does not check the return code from the connect.  As I read
your patch, if slon has a connection failure to one of its providers at
startup then it will not start, though in the first comment you mention that
the feature should be resilient to this.

Maybe it would be better for the remote listener to check the remote database
to see if it has an associated sl_node entry for the local id.  If not, the
remote listener should do nothing except sleep and retry periodically.
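
To make the suggested behaviour concrete, here is a rough sketch in Python
(not slon's actual C code).  Every name in it is hypothetical: connect and
has_node_entry stand in for whatever the remote listener would use to open the
provider connection and to look up the local id in the provider's sl_node
table.  The point is only that a failed connect and a missing sl_node entry
both lead to "sleep and retry" rather than a fatal exit.

```python
import time
from typing import Callable, Optional

# Possible outcomes of one probe of a provider (hypothetical names).
CONNECT_FAILED = "connect_failed"
NODE_MISSING = "node_missing"
NODE_PRESENT = "node_present"


def probe_provider(connect: Callable[[], Optional[object]],
                   has_node_entry: Callable[[object, int], bool],
                   local_node_id: int) -> str:
    """Probe one provider once and classify the result."""
    conn = connect()
    # Check the result of the connect instead of assuming it worked.
    if conn is None:
        return CONNECT_FAILED
    # Assumption: has_node_entry runs something like
    #   select 1 from sl_node where no_id = <local_node_id>
    # against the provider's cluster schema.
    if has_node_entry(conn, local_node_id):
        return NODE_PRESENT
    return NODE_MISSING


def wait_until_known(connect: Callable[[], Optional[object]],
                     has_node_entry: Callable[[object, int], bool],
                     local_node_id: int,
                     retry_seconds: float = 10.0,
                     max_attempts: Optional[int] = None,
                     sleep: Callable[[float], None] = time.sleep) -> Optional[int]:
    """Retry until the provider lists the local node.

    Returns the number of attempts used, or None if max_attempts ran out.
    Neither a failed connect nor a missing entry is treated as fatal.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        outcome = probe_provider(connect, has_node_entry, local_node_id)
        if outcome == NODE_PRESENT:
            return attempts
        # CONNECT_FAILED or NODE_MISSING: do nothing, wait, try again.
        sleep(retry_seconds)
    return None
```

With max_attempts left as None the loop keeps retrying indefinitely, which is
the "do nothing, sleep and retry" behaviour described above; the other remote
listeners (for example the one talking to b) are unaffected.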
