Steve Singer ssinger_pg at sympatico.ca
Sat May 22 19:48:45 PDT 2010
On Fri, 21 May 2010, Sam Nelson wrote:


It would be useful to know

1) What version of Slony are you using? (I'm guessing 1.2.x, since other 
folks from consistent state have mentioned that.)

2) What paths have you created?

'could not connect to server: Connection refused' makes me think that you 
have a typo in a hostname or port somewhere, possibly from when you did a 
store path.  Before doing the failover it would be worthwhile to look at 
the sl_path table on each of your nodes to make sure it looks as you expect 
(sometimes when slonik scripts are generated through shell scripts, an error 
in the shell substitution can mean that the slonik script that got executed 
isn't exactly what you thought you were executing).
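
As a rough illustration of that check, here is a minimal Python sketch that 
scans conninfo strings for the kind of damage a bad shell substitution tends 
to leave behind (an empty host or port, or a literal "$var" that never got 
expanded).  The function names and the sample rows are made up for this 
example, and "_mycluster" in the comment is a placeholder for your own 
cluster schema.  The parser only handles simple space-separated key=value 
pairs, not quoted values.

```python
# Rows are assumed to look like the result of a query such as
#   select pa_server, pa_client, pa_conninfo from _mycluster.sl_path;
# where "_mycluster" stands in for your actual cluster schema name.

REQUIRED_KEYS = ("dbname", "host", "port")


def parse_conninfo(conninfo):
    """Split a simple 'key=value key=value' conninfo string into a dict."""
    pairs = {}
    for token in conninfo.split():
        key, sep, value = token.partition("=")
        if sep:
            pairs[key] = value
    return pairs


def suspect_paths(rows):
    """Return (server, client, reason) for every conninfo that looks wrong."""
    problems = []
    for server, client, conninfo in rows:
        fields = parse_conninfo(conninfo)
        # An empty value usually means a shell variable expanded to nothing.
        for key in REQUIRED_KEYS:
            if not fields.get(key):
                problems.append((server, client, f"missing or empty {key}"))
        # A literal '$' usually means the variable was never expanded at all.
        if "$" in conninfo:
            problems.append((server, client, "unexpanded shell variable"))
        port = fields.get("port", "")
        if port and not port.isdigit():
            problems.append((server, client, f"non-numeric port {port!r}"))
    return problems


if __name__ == "__main__":
    sample = [
        (1, 2, "dbname=test host=db1 port=5432 user=slony"),
        (2, 1, "dbname=test host= port=5432 user=slony"),
    ]
    for problem in suspect_paths(sample):
        print(problem)
```

Running the same check against the rows from every node is the useful part, 
since each node keeps its own copy of sl_path.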

The error you quote below from RebuildListenEntries() indicates that slony 
is trying to build the sl_listen entries based on the contents of sl_path 
but is having a problem (a null value).  I would double check what sl_path 
has as a next step.
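
One thing that can be checked mechanically is whether sl_path has a row for 
every pair of nodes in both directions.  Whether a missing path is what 
produces the null here is only a guess on my part, but it is cheap to rule 
out.  A small Python sketch, where the function name and the sample data are 
invented for illustration:

```python
from itertools import permutations


def missing_paths(node_ids, path_pairs):
    """Return the (server, client) pairs that have no row in sl_path.

    node_ids   -- iterable of node ids in the cluster
    path_pairs -- iterable of (pa_server, pa_client) tuples read from sl_path
    """
    have = set(path_pairs)
    wanted = permutations(sorted(node_ids), 2)
    return [pair for pair in wanted if pair not in have]


if __name__ == "__main__":
    # Hypothetical three-node cluster with no paths between nodes 2 and 3.
    existing = [(1, 2), (2, 1), (1, 3), (3, 1)]
    print(missing_paths([1, 2, 3], existing))
```

An empty result means every node has a path to every other node; anything 
listed is a direction for which no conninfo is stored.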


> I'm trying to run a failover on a three node cluster (for testing 
> purposes) and it doesn't seem to be working, no matter how I try it.
>
> I've tried running the following in slonik:
>
> node 1 admin conninfo = 'dbname=$dbname host=$host1 port=$port user=$user 
> '; node 2 admin conninfo = 'dbname=$dbname host=$host2 port=$port 
> user=$user '; node 3 admin conninfo = 'dbname=$dbname host=$host3 
> port=$port user=$user ';
>
> echo 'Failing over...';
>
> failover (id = 1, backup node = 2);
> echo 'Dropping node 1...';
> drop node (id = 1, event node = 2);
> echo 'Failover complete';
>
> I have also tried to (as per the "Failover With Complex Node 
> Set" instructions) run subscribe set to update the subscription info 
> for other nodes before failing over to node 2, but the subscribe set 
> command fails with "could not connect to server: Connection 
> refused" (even though none of the nodes used in the subscribe set 
> command are the master node).  So I went back to just running failover and 
> letting the failover function take care of subscribing nodes and junk.
>
> The results have been ... well, they have been sort of random.  It does 
> occasionally seem to report a successful run, but even then, node 3 
> usually has some incorrect information about the new structure of the 
> cluster.  The most common occurrence (and the only one I have logs for), 
> though, is that I receive the following output from the above slonik 
> commands:
>
>> ;stdin;stdin;stdin


