bugzilla-daemon at main.slony.info
Fri Jun 21 09:03:13 PDT 2013
http://www.slony.info/bugzilla/show_bug.cgi?id=296

           Summary: FAILOVER doesn't work when non-origin nodes fail at
                    the same time
           Product: Slony-I
           Version: devel
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: enhancement
          Priority: low
         Component: slonik
        AssignedTo: slony1-bugs at lists.slony.info
        ReportedBy: ssinger at ca.afilias.info
                CC: slony1-bugs at lists.slony.info
   Estimated Hours: 0.0


This bug is present in 2.2.0 beta 4.

Consider a 6-node cluster with two sets, where the subscription network for
each set looks like this:

 set 1
    1 --------> 2
   /\          /\ 
  3  4        5  6


 set 2 
   3-----------> 5
  /\            /\
 1  4          2  6

If nodes 1, 3, and 4 all fail at the same time (because they are in the same
location), then


---
FAILOVER( node=(id=1,backup node=2), node=(id=3,backup node=5));
---

doesn't work, because slonik still thinks that node 4 is part of the cluster
and will wait for it.
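
To make the reasoning concrete, here is a minimal Python sketch of the topology above. It is only a model of the situation described in this report, not slonik's actual logic: the dictionaries and the helper functions `all_nodes` and `nodes_waited_on` are hypothetical names introduced for illustration, and "every node not named in the FAILOVER command is waited on" is a simplifying assumption.

```python
# Hypothetical model of the topology in this report; not slonik's actual code.
# Each set maps a provider node to the nodes that subscribe directly to it.
SET1 = {1: [2, 3, 4], 2: [5, 6]}   # set 1: origin is node 1
SET2 = {3: [5, 1, 4], 5: [2, 6]}   # set 2: origin is node 3


def all_nodes(*sets):
    """Collect every node id that appears in any subscription network."""
    nodes = set()
    for subscriptions in sets:
        for provider, subscribers in subscriptions.items():
            nodes.add(provider)
            nodes.update(subscribers)
    return nodes


def nodes_waited_on(cluster, failover_nodes):
    """Assumption: every node not named in FAILOVER is treated as alive."""
    return cluster - set(failover_nodes)


cluster = all_nodes(SET1, SET2)
dead = {1, 3, 4}                 # the nodes that failed together
named_in_failover = [1, 3]       # only the two origins appear in the command

waited_on = nodes_waited_on(cluster, named_in_failover)
blocked_on = sorted(waited_on & dead)   # failed nodes still expected to respond
print(blocked_on)
```

Under these assumptions the only failed node that is still waited on is node 4, which is a subscriber in both sets and an origin in neither, so the FAILOVER command never names it.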

Requiring the user to drop node 4 before the failover, for example with
---
drop node(id=4, event node = 2);
---

is a problem because slonik will still think that nodes 1 and 3 are valid
nodes, and if any ev_origin=4 events have reached some nodes but not others,
slonik will wait for those events to be replicated everywhere.
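
The same point can be sketched in Python. This is a hypothetical illustration only: the node ids come from this report, but the event numbers are made up, and the rule "wait until every node considered valid has the newest event" is an assumed simplification of what slonik does.

```python
# Hypothetical illustration; node ids come from the report, event numbers are made up.
dead = {1, 3, 4}

# After dropping node 4, slonik still treats nodes 1 and 3 as valid.
considered_valid = {1, 2, 3, 5, 6}

# Highest ev_origin=4 event each node has received (assumed values).
last_event_from_4 = {1: 10, 2: 12, 3: 10, 5: 12, 6: 11}

# Assumed rule: the wait ends once every valid node has the newest event.
target = max(last_event_from_4.values())
lagging = {n for n in considered_valid if last_event_from_4[n] < target}

can_catch_up = lagging - dead              # live nodes may still receive the event
never_catch_up = sorted(lagging & dead)    # dead nodes will never confirm it
print(never_catch_up)
```

In this model node 6 can still catch up, but nodes 1 and 3 never will, so a wait that includes them cannot complete.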
