Tue Jan 25 18:12:09 PST 2005
- Previous message: [Slony1-general] Backing up a PostgreSQL database that is being replicated using Slony
- Next message: [Slony1-general] strange failover problem
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Dear Slony enthusiasts,
I am trying to set up a replication system consisting of 3 nodes (I am using Slony-I 1.0.5 and PostgreSQL 7.4.5). I start the nodes one by one: first node 1, the master (init cluster, create set and so on; no problems there), then nodes 2 and 3, the slaves (just store node, store paths, store listens and subscribe set; no problems there either). When I have all three nodes up and running, the sl_listen table on each node looks more or less as follows:
 li_origin | li_provider | li_receiver
-----------+-------------+-------------
         2 |           2 |           1
         1 |           1 |           2
         3 |           3 |           1
         1 |           1 |           3
         2 |           2 |           3
         3 |           3 |           2
(sl_path looks similar: separate paths between each two nodes, 6 paths altogether.) As you can see, each node listens for events directly on every other node. (The order of the entries differs on each node, but I guess that does not matter?) The sl_subscribe table:
 sub_set | sub_provider | sub_receiver | sub_forward | sub_active
---------+--------------+--------------+-------------+------------
       1 |            1 |            2 | t           | t
       1 |            1 |            3 | t           | t
- which means nodes 2 and 3 are direct receivers of set 1 originating on node 1. The data is replicated fine, switchovers are performed smoothly and everything seems to be ok.
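The "every node listens directly on every other node" claim can be checked mechanically. Below is a minimal Python sketch, assuming the sl_listen rows above are copied by hand into tuples. The function name `missing_direct_listens` is hypothetical and not part of Slony-I.

```python
from itertools import permutations

# Rows copied from the sl_listen output above:
# (li_origin, li_provider, li_receiver)
SL_LISTEN = [
    (2, 2, 1), (1, 1, 2), (3, 3, 1),
    (1, 1, 3), (2, 2, 3), (3, 3, 2),
]


def missing_direct_listens(rows, nodes):
    """Return (origin, receiver) pairs that have no direct listen entry.

    A listen entry counts as direct when the provider is the origin itself.
    """
    direct = {
        (origin, receiver)
        for origin, provider, receiver in rows
        if origin == provider
    }
    # Every ordered pair of distinct nodes should appear in a full mesh.
    return sorted(set(permutations(nodes, 2)) - direct)


print(missing_direct_listens(SL_LISTEN, [1, 2, 3]))  # prints []
```

An empty result means the listen configuration is a full mesh, which matches the description above.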
The big problem is failover. I tried to do the simplest thing - failover to node 3. My failover script is very simple:
cluster name = clusterix;
node 1 admin conninfo = 'dbname=clusterix1 hostaddr=127.0.0.1 user=clusterix ';
node 2 admin conninfo = 'dbname=clusterix2 hostaddr=127.0.0.1 user=clusterix ';
node 3 admin conninfo = 'dbname=clusterix3 hostaddr=127.0.0.1 user=clusterix ';
try {
    failover (id = 1, backup node = 3);
}
on error {
    echo 'failover error';
    exit 13;
}
The failover command succeeds. On node 2 the sl_subscribe and sl_set tables are changed to:
 sub_set | sub_provider | sub_receiver | sub_forward | sub_active
---------+--------------+--------------+-------------+------------
       1 |            3 |            2 | t           | t
and
 set_id | set_origin | set_locked | set_comment
--------+------------+------------+----------------------
      1 |          3 |            | All clusterix tables
This is exactly what I'd expect. But on node 3, which was supposed to become my new master node, these tables look somewhat strange:
 sub_set | sub_provider | sub_receiver | sub_forward | sub_active
---------+--------------+--------------+-------------+------------
       1 |            2 |            3 | t           | t
       1 |            3 |            2 | t           | t
and
 set_id | set_origin | set_locked | set_comment
--------+------------+------------+----------------------
      1 |          1 |            | All clusterix tables
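The two oddities in node 3's tables can also be flagged mechanically. This is a minimal Python sketch, assuming the rows above are copied by hand into tuples. The helper names `subscription_loops` and `sets_with_origin` are hypothetical and not part of Slony-I.

```python
# Rows copied from node 3 after the failover.
# sl_subscribe: (sub_set, sub_provider, sub_receiver)
NODE3_SUBSCRIBE = [(1, 2, 3), (1, 3, 2)]
# sl_set: (set_id, set_origin)
NODE3_SETS = [(1, 1)]


def subscription_loops(rows):
    """Return (set, node_a, node_b) where each node feeds the other."""
    edges = set(rows)
    return sorted(
        {
            (s, min(p, r), max(p, r))
            for s, p, r in edges
            if (s, r, p) in edges  # the reverse subscription also exists
        }
    )


def sets_with_origin(sets, node):
    """Return the ids of all sets whose recorded origin is the given node."""
    return [set_id for set_id, origin in sets if origin == node]


print(subscription_loops(NODE3_SUBSCRIBE))  # prints [(1, 2, 3)]
print(sets_with_origin(NODE3_SETS, 1))      # prints [1]
```

The first result reads as "for set 1, nodes 2 and 3 subscribe to each other"; the second shows that set 1 is still recorded as originating on the failed node 1.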
As you can see, according to sl_subscribe, node 3 is both the provider AND the receiver of node 2, and node 2 is both the provider AND the receiver of node 3, which makes no sense to me. Not to mention that the origin of set 1, according to sl_set, is still node 1, which had failed. In the end I can neither write anything to my database on node 3 (Slony thinks it is being replicated), nor drop node 1 (Slony tells me it is still the origin of set 1). So the big question is:
WHAT AM I DOING WRONG?

While I was investigating the problem I found out that the sl_event table on node 3 does not contain the FAILOVER_SET event (which is present on node 2). I tried to delve deeper into the contents of the Slony tables, but found no clues.

Now, to make things even more awkward: when I start up all three nodes (exactly as I described at the beginning), switch over to node 3 (works fine) and THEN fail over to node 1, it works! I reckon that this is because node 1 was started first, but I found no differences in the contents of the Slony tables that would clearly explain such behaviour.

I would really appreciate any help in solving this problem,
Chris Bandurski
chris at gv.pl