kris_b78 at o2.pl kris_b78
Tue Jan 25 18:12:09 PST 2005
Dear Slony enthusiasts, 

I am trying to set up a replication system consisting of 3 nodes (I am using Slony-I 1.0.5 and PostgreSQL 7.4.5). I start the nodes one by one: first node 1, the master (init cluster, create set and so on - no problems there), then nodes 2 and 3, the slaves (just store node, store paths and listens, and subscribe set - no problems there either). When all three nodes are up and running, the sl_listen table on each node looks more or less as follows:

 li_origin | li_provider | li_receiver
-----------+-------------+-------------
         2 |           2 |           1
         1 |           1 |           2
         3 |           3 |           1
         1 |           1 |           3
         2 |           2 |           3
         3 |           3 |           2

(sl_path looks similar - separate paths between each pair of nodes, 6 paths altogether). As you can see, each node listens for events directly on every other node. (The order of the entries differs from node to node, but I assume that does not matter.) The sl_subscribe table:

 sub_set | sub_provider | sub_receiver | sub_forward | sub_active
---------+--------------+--------------+-------------+------------
       1 |            1 |            2 | t           | t
       1 |            1 |            3 | t           | t

- which means nodes 2 and 3 are direct receivers of set 1, which originates on node 1. The data is replicated fine, switchovers are performed smoothly, and everything seems to be OK.
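For completeness, the way I add each slave looks roughly like this (sketched here for node 3; the conninfo strings match the ones in my failover script below, and the comment text is just illustrative):

	cluster name = clusterix;

	node 1 admin conninfo = 'dbname=clusterix1 hostaddr=127.0.0.1 user=clusterix';
	node 3 admin conninfo = 'dbname=clusterix3 hostaddr=127.0.0.1 user=clusterix';

	# register node 3 and the paths/listens between it and node 1
	store node (id = 3, comment = 'Node 3');
	store path (server = 1, client = 3, conninfo = 'dbname=clusterix1 hostaddr=127.0.0.1 user=clusterix');
	store path (server = 3, client = 1, conninfo = 'dbname=clusterix3 hostaddr=127.0.0.1 user=clusterix');
	store listen (origin = 1, provider = 1, receiver = 3);
	store listen (origin = 3, provider = 3, receiver = 1);

	# subscribe node 3 to set 1, with forwarding enabled
	subscribe set (id = 1, provider = 1, receiver = 3, forward = yes);

(The paths and listens between nodes 2 and 3 are stored the same way.)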
The big problem is failover. I tried the simplest thing - a failover to node 3. My failover script is very simple:
	
	cluster name = clusterix;

	node 1 admin conninfo = 'dbname=clusterix1 hostaddr=127.0.0.1 user=clusterix';
	node 2 admin conninfo = 'dbname=clusterix2 hostaddr=127.0.0.1 user=clusterix';
	node 3 admin conninfo = 'dbname=clusterix3 hostaddr=127.0.0.1 user=clusterix';

	try {
		failover (id = 1, backup node = 3);
	}
	on error {
		echo 'failover error';
		exit 13;
	}
 
The failover command succeeds. On node 2 the sl_subscribe and sl_set tables change to:
 sub_set | sub_provider | sub_receiver | sub_forward | sub_active
---------+--------------+--------------+-------------+------------
       1 |            3 |            2 | t           | t
and

 set_id | set_origin | set_locked |     set_comment
--------+------------+------------+----------------------
      1 |          3 |            | All clusterix tables

which is exactly what I'd expect. But on node 3, which was supposed to become my new master node, these tables look somewhat strange:

 sub_set | sub_provider | sub_receiver | sub_forward | sub_active
---------+--------------+--------------+-------------+------------
       1 |            2 |            3 | t           | t
       1 |            3 |            2 | t           | t

and

 set_id | set_origin | set_locked |     set_comment
--------+------------+------------+----------------------
      1 |          1 |            | All clusterix tables

As you can see, according to sl_subscribe, node 3 is both the provider and the receiver of node 2, and node 2 is both the provider and the receiver of node 3 - which makes no sense to me. Not to mention that the origin of set 1, according to sl_set, is still node 1, which has failed. In the end I can neither write anything to my database on node 3 (Slony thinks it is still being replicated) nor drop node 1 (Slony tells me it is still the origin of set 1). So the big question is:
WHAT AM I DOING WRONG? While investigating the problem I found that the sl_event table on node 3 does not contain the FAILOVER_SET event (which is present on node 2). I tried to delve deeper into the contents of the Slony tables, but found no clues. Now, to make things even more awkward: when I start up all three nodes (exactly as described at the beginning), switch over to node 3 (works fine) and THEN fail over to node 1 - it works! I reckon this is because node 1 was started first, but I found no differences in the contents of the Slony tables that would clearly explain such behaviour. I will really appreciate any help in solving this problem.
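In case anyone wants to reproduce my check: I looked for the event with a simple query against the cluster schema (assuming the usual schema name, the cluster name prefixed with an underscore - _clusterix in my case):

	select ev_origin, ev_seqno, ev_type
	from _clusterix.sl_event
	where ev_type = 'FAILOVER_SET'
	order by ev_origin, ev_seqno;

Run on each node after the failover, this returns a row on node 2 but nothing on node 3.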

Chris Bandurski
chris at gv.pl

       





More information about the Slony1-general mailing list