Jan Wieck JanWieck at Yahoo.com
Thu Feb 17 07:34:26 PST 2011
On 2/17/2011 10:03 AM, Tech Madhu wrote:
> all,
> i got the failover to work.. listing the steps which might be useful for
> other beginners like me
>
> node1 = master, node2=slave
>
> node1 goes down (crash or power failure). while node1 is being
> recovered, if your apps have to continue writing to DB, only way i found
> is that we have to do the failover
>
> on node 2, run the following slonik commands
>       1) failover (id = 1, backup node = 2); (this works only if you
> subscribed your set originally with forward=yes)
>       2) ensure DB can be written into (not readonly anymore)
>       3) drop node (id = 1, event node = 2);
> on node 1, run the following commands once it comes back in service
>        1) dropdb <dbname>
>        2) createdb -O <dbuser> <dbname>
>        3) psql -U <dbuser> -d <dbname> < backup.sql
>        //Note: this backup.sql should not contain the Slony tables, so it
> is a backup taken before Slony was set up, using
> pg_dump -c -s -U <user> <db> > backup.sql
>          4) Run slonik commands to store node (1) and store path
>                cluster name = clus_name;
>                node 1 admin conninfo = 'dbname=yourdb host=yournode1host
> user=dbuser password=pass';
>                node 2 admin conninfo = 'dbname=yourdb host=yournode2host
> user=dbuser  password=pass';
>                store node (id = 1, comment = 'Node 1 slave', event node=2);

Please note that we may make node ids non-reusable in a future Slony 
version. Creating a node with the same id as a previously dropped node 
has caused adverse side effects in the past, in cases where traces of 
the old node had not been removed from the Slony configuration on every 
other node.
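
A minimal sketch of what avoiding id reuse might look like, assuming the
rebuilt machine rejoins the cluster as node 3 instead of node 1. The id 3
and the comment string are hypothetical choices; the cluster name, hosts,
users, and passwords are the placeholders from the quoted script, not
real values.

```slonik
# Sketch only: re-add the rebuilt former master under a NEW node id (3),
# so nothing depends on every trace of the dropped node 1 being gone.
cluster name = clus_name;
node 2 admin conninfo = 'dbname=yourdb host=yournode2host user=dbuser password=pass';
node 3 admin conninfo = 'dbname=yourdb host=yournode1host user=dbuser password=pass';

# Register the rebuilt database as node 3, announced through node 2.
store node (id = 3, comment = 'Rebuilt former node 1', event node = 2);

# Paths in both directions between the new origin (2) and node 3.
store path (server = 3, client = 2,
    conninfo = 'dbname=yourdb host=yournode1host user=dbuser password=pass');
store path (server = 2, client = 3,
    conninfo = 'dbname=yourdb host=yournode2host user=dbuser password=pass');

# Run this only after the slon daemon for node 3 has been started.
subscribe set (id = 1, provider = 2, receiver = 3, forward = yes);
```

The rest of the quoted procedure (dropdb, createdb, schema restore,
starting the daemon) would stay the same; only the node id changes.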


Jan

>                store path (server = 1, client = 2,
>                    conninfo = 'dbname=yourdb host=node1host user=repusr
> password=pass');
>                store path (server = 2, client = 1,
>                   conninfo = 'dbname=yourdb host=node2host user=dbuser
> password=pass');
>           5)Start slony daemon on node 1
>           6) subscribe to your sets
>               subscribe set (id = 1, provider = 2, receiver = 1, forward
> = yes);
>               Note above, i am showing the provider is node 2 and
> receiver is node 1 (opposite of my initial subscribe)
>          7) your replication should be working..
>
>
> On Tue, Feb 15, 2011 at 9:52 PM, Tech Madhu <technimadhu at gmail.com>
> wrote:
>
>     Thanks for your reply.
>
>     if the orig master is down say due to some hardware issue (for few
>     hours/days say), we have to get the system on the slave up (we
>     accept the loss of some N txn)
>
>     In this case, is the following pseudocode correct? Assuming that
>     before the crash my master was node:1 and my slave node:2:
>         On the slave (node 2),
>                a) Run the failover command: failover (id = 1, backup node = 2);
>                b) Run the drop node command for node 1
>         When the orig master is ready to be brought back in service, can
>     I re-use node id 1 for it?
>         If so, is it enough to run just the following 2 commands on the
>     original master?
>             store node (id = 1, event node = 2);
>             store path (server = 2, client = 1, conninfo = 'connection info to
>     node2');
>
>
>     On Tue, Feb 15, 2011 at 2:52 PM, Jan Wieck <JanWieck at yahoo.com>
>     wrote:
>
>         On 2/15/2011 2:44 PM, Jan Wieck wrote:
>
>             This is NOT possible given the Slony-I design.
>
>             Slony-I is an asynchronous replication system. That means
>             that changes
>             to the origin are replicated some time AFTER they have been
>             committed.
>             That means that if the origin goes down unexpectedly, you
>             have no chance
>             of knowing what changes did not propagate to the replica
>             before it crashed.
>
>             The only way to solve this situation is to actually do a
>             hard FAILOVER,
>             abandoning the old origin and rebuilding it from scratch.
>
>             To illustrate, think about a simple foreign key constraint,
>             t2.fk references t1.pk. There currently are no rows in t2
>             referencing a certain t1.pk, so node:1 will allow to DELETE
>             it. Node:1 crashes before the DELETE can propagate to node:2.
>             You failover to node:2 and since it still has the t1 row, it
>             will happily allow you to INSERT references to it into t1.
>             Now you bring back node:1 and ... how exactly do you get the
>
>
>         into t2, of course.
>
>
>             two to agree what is right? Will you forcefully remove the
>             rows, node:2
>             inserted into t2 or will you recreate the t1 row in node:1
>             so that the
>             INSERT's can propagate from node:2 to node:1?
>
>
>             Jan
>
>
>
>         --
>         Anyone who trades liberty for security deserves neither
>         liberty nor security. -- Benjamin Franklin
>
>
>
>
>


-- 
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

