Quinn Jones quinn_jones at pobox.com
Wed Jul 16 09:48:02 PDT 2008
Hello,

I've been lurking on the list for a while, but now have a problem that I
haven't seen discussed:  I tried to fail over a node, but a slave was also
unresponsive, and slonik errored out after timing out (so the failover
didn't happen).

Here's our setup: we have a database replicated to three slave nodes across
a total of three sites, like this:
site 1: db1 (master) and db2
site 2: db3
site 3: db4

Our problem started when site 1 went away completely and abruptly (so db1
and db2 were out of commission).  Our plan called for failing the database
over to db3.  When I tried to fail over, though, slonik timed out with the
message 'could not connect to server: Connection timed out.  Is the server
running on host "x.x.x.x" and accepting TCP/IP connections on port 5432?'.
The IP address was db2's, so, seeing that as the logical problem to solve
first, I tried dropping the downed slave node.  This timed out as well, and
the slave was not dropped.

While I was trying to figure out an intelligent next step, short of dropping
replication entirely and just using db3 stand-alone (and rebuilding the
cluster from scratch later), site 1 mostly came back up.  We lucked out and
in the end saved some time by not being able to fail over the way we wanted,
though we did lose an unknown number of sales because we were effectively
down.

How do we drop a non-responsive slave, or force the failover to ignore it?
This is a situation that shouldn't come up frequently for us, but it could,
and this time it was rather troublesome.  I understand why failover would
want to communicate with every other server, but there must be a way to step
over dead servers to get a functional cluster (I just haven't found it yet).
Also, shouldn't dropping a slave node happen whether or not the node can be
reached?

Quinn