Mon Jan 28 14:24:04 PST 2008
- Previous message: [Slony1-general] Issues troubleshooting failing replication - (Master - Slave)
- Next message: [Slony1-general] Slony Nagios plugins - sl_status
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi again, I just added myself to the subscription list... Thanks

On 1/28/08, Ouray Viney <ouray at viney.ca> wrote:
> Hi All:
>
> I have spent most of the day parsing through the documentation, but have
> been unsuccessful in solving the problem.
>
> Environment setup:
>
> 1 master (host=pgmaster, db=ssm)
> 1 slave (host=pgslave1, db=ssm_slave)
>
> Commands run on the master:
>
> Note: I will not include the db/user/plpgsql commands.
>
> 1. Run the schema script:
>
> [root at caotta000381 scripts]# psql -U ssm -d ssm -f ~/sessionmanager-postgresql-create.sql
>
> 2. Initialize the cluster (run on the master node):
>
> [root at caotta000381 ~]# slonik_init_cluster
>
> # INIT CLUSTER
> cluster name = ssm_cluster;
> node 1 admin conninfo='host=pgmaster dbname=ssm user=postgres port=5432';
> node 2 admin conninfo='host=pgslave1 dbname=ssm_slave user=postgres port=5432';
> init cluster (id = 1, comment = 'Node 1 - ssm at pgmaster');
>
> # STORE NODE
> store node (id = 2, event node = 1, comment = 'Node 2 - ssm_slave at pgslave1');
> echo 'Set up replication nodes';
>
> # STORE PATH
> echo 'Next: configure paths for each node/origin';
> store path (server = 1, client = 2, conninfo = 'host=pgmaster dbname=ssm user=postgres port=5432');
> store path (server = 2, client = 1, conninfo = 'host=pgslave1 dbname=ssm_slave user=postgres port=5432');
> echo 'Replication nodes prepared';
> echo 'Please start a slon replication daemon for each node';
>
> [root at caotta000381 ~]# slonik_init_cluster | slonik
> <stdin>:10: Set up replication nodes
> <stdin>:13: Next: configure paths for each node/origin
> <stdin>:16: Replication nodes prepared
> <stdin>:17: Please start a slon replication daemon for each node
>
> 3. Start the slon processes:
>
> [root at caotta000381 ~]# slon_start 1
> Invoke slon for node 1 - /usr/bin/slon -s 1000 -d2 ssm_cluster 'host=pgmaster dbname=ssm user=postgres port=5432' 2>&1 > /var/log/slony/slony1/node1/ssm-2008-01-28_13:47:29.log &
> Slon successfully started for cluster ssm_cluster, node node1
> PID [12806]
> Start the watchdog process as well...
> [root at caotta000381 ~]# slon_start 2
> Invoke slon for node 2 - /usr/bin/slon -s 1000 -d2 ssm_cluster 'host=pgslave1 dbname=ssm_slave user=postgres port=5432' 2>&1 > /var/log/slony/slony1/node2/ssm_slave-2008-01-28_13:47:38.log &
> Slon successfully started for cluster ssm_cluster, node node2
> PID [12849]
> Start the watchdog process as well...
>
> 4. Build the skeleton for slon_tools.conf:
>
> [root at caotta000381 pgsql]# slonik_build_env -node pgmaster:ssm:postgres -node pgslave1:ssm_slave:postgres
> &add_node(host => 'pgmaster', dbname => 'ssm', port => 5432, user => 'postgres', password => '', node => 1);
> &add_node(host => 'pgslave1', dbname => 'ssm_slave', port => 5432, user => 'postgres', password => '', node => 2, parent => 1);
> @KEYEDTABLES=(
>     "public.managednumberrouteset",
>     "public.mediationstrategy",
>     "public.nonmanagednumbestfitrouteset",
>     "public.normalization",
>     "public.normalizationtype",
>     "public.numberforwarding",
>     "public.outboundrouteorder",
>     "public.outboundrouteset",
>     "public.route",
>     "public.routetype",
>     "public.routetype_normalization",
> );
> @SEQUENCES=(
>     "public.nonmanagednumbestfitrouteset_outboundroutelogicid_seq",
>     "public.normalization_normalizationid_seq",
> );
>
> 5. Add that data to slon_tools.conf.
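
[A note on steps 4-5: slonik_build_env has lumped every table into @KEYEDTABLES here. When transcribing that into slon_tools.conf, tables that have a real primary key belong under "pkeyedtables", while "keyedtables" is a hash mapping each table to the name of a unique, NOT NULL index on that same table. A rough sketch of the relevant fragment; the split between the two groups and the index name "nonmanaged_unique_idx" below are assumptions, since only the schema's owner knows which tables actually have primary keys:

```perl
"set1" => {
    "set_id" => 1,

    # Tables Slony-I can replicate via their primary key (assumed split):
    "pkeyedtables" => [
        "public.route",
        "public.outboundrouteset",
    ],

    # Tables without a primary key: map table => unique index ON THAT TABLE.
    # "nonmanaged_unique_idx" is a hypothetical index name.
    "keyedtables" => {
        "public.nonmanagednumbestfitrouteset" => "nonmanaged_unique_idx",
    },

    "sequences" => [
        "public.nonmanagednumbestfitrouteset_outboundroutelogicid_seq",
        "public.normalization_normalizationid_seq",
    ],
},
```

Getting this split wrong is what produces the bogus key= values seen in the generated create-set script below.]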
> 6. Create the replication sets:
>
> [root at caotta000381 etc]# slonik_create_set 1
> cluster name = ssm_cluster;
> node 1 admin conninfo='host=pgmaster dbname=ssm user=postgres port=5432';
> node 2 admin conninfo='host=pgslave1 dbname=ssm_slave user=postgres port=5432';
>
> # TABLE ADD KEY
>
> # CREATE SET
> try {
>     create set (id = 1, origin = 1, comment = 'Set 1 for ssm_cluster');
> } on error {
>     echo 'Could not create subscription set 1 for ssm_cluster!';
>     exit -1;
> }
>
> # SET ADD TABLE
> echo 'Subscription set 1 created';
> echo 'Adding tables to the subscription set';
> set add table (set id = 1, origin = 1, id = 1,
>     full qualified name = 'public.nonmanagednumbestfitrouteset', key = 'public.normalization',
>     comment = 'Table public.nonmanagednumbestfitrouteset with candidate primary key public.normalization');
> echo 'Add candidate primary keyed table public.nonmanagednumbestfitrouteset';
> set add table (set id = 1, origin = 1, id = 2,
>     full qualified name = 'public.outboundrouteorder', key = 'public.outboundrouteset',
>     comment = 'Table public.outboundrouteorder with candidate primary key public.outboundrouteset');
> echo 'Add candidate primary keyed table public.outboundrouteorder';
> set add table (set id = 1, origin = 1, id = 3,
>     full qualified name = 'public.normalizationtype', key = 'public.numberforwarding',
>     comment = 'Table public.normalizationtype with candidate primary key public.numberforwarding');
> echo 'Add candidate primary keyed table public.normalizationtype';
> set add table (set id = 1, origin = 1, id = 4,
>     full qualified name = 'public.route', key = 'public.routetype',
>     comment = 'Table public.route with candidate primary key public.routetype');
> echo 'Add candidate primary keyed table public.route';
> set add table (set id = 1, origin = 1, id = 5,
>     full qualified name = 'public.managednumberrouteset', key = 'public.mediationstrategy',
>     comment = 'Table public.managednumberrouteset with candidate primary key public.mediationstrategy');
> echo 'Add candidate primary keyed table public.managednumberrouteset';
> set add table (set id = 1, origin = 1, id = 6,
>     full qualified name = 'public.routetype_normalization', key = '',
>     comment = 'Table public.routetype_normalization with candidate primary key ');
> echo 'Add candidate primary keyed table public.routetype_normalization';
>
> # SET ADD SEQUENCE
> echo 'Adding sequences to the subscription set';
> set add sequence (set id = 1, origin = 1, id = 1,
>     full qualified name = 'public.nonmanagednumbestfitrouteset_outboundroutelogicid_seq',
>     comment = 'Sequence public.nonmanagednumbestfitrouteset_outboundroutelogicid_seq');
> echo 'Add sequence public.nonmanagednumbestfitrouteset_outboundroutelogicid_seq';
> set add sequence (set id = 1, origin = 1, id = 2,
>     full qualified name = 'public.normalization_normalizationid_seq',
>     comment = 'Sequence public.normalization_normalizationid_seq');
> echo 'Add sequence public.normalization_normalizationid_seq';
> echo 'All tables added';
>
> [root at caotta000381 etc]# slonik_create_set 1 | slonik
> <stdin>:16: Subscription set 1 created
> <stdin>:17: Adding tables to the subscription set
> <stdin>:18: PGRES_FATAL_ERROR select "_ssm_cluster".determineIdxnameUnique('public.nonmanagednumbestfitrouteset', 'public.normalization'); - ERROR:  Slony-I: table "public"."nonmanagednumbestfitrouteset" has no unique index public.normalization
>
> Note: I ignored the error and moved on. Perhaps this was a bad thing?
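
[This error is probably the root cause. In `set add table`, the key= parameter is supposed to name a candidate-key index on the table being added, not some other table, so key='public.normalization' on public.nonmanagednumbestfitrouteset can never work. Also, a `set add table` that fails outside a try block normally aborts the whole slonik script, so the tables after id=1 and the sequences were likely never added at all. A sketch of a psql query against the master to see which unique indexes (if any) exist on the failing table; adjust schema/table names as needed:

```sql
-- List unique indexes on the table that "set add table" rejected;
-- whatever name comes back is a candidate for the key= parameter.
SELECT ic.relname AS index_name
  FROM pg_index i
  JOIN pg_class ic ON ic.oid = i.indexrelid
  JOIN pg_class t  ON t.oid  = i.indrelid
  JOIN pg_namespace n ON n.oid = t.relnamespace
 WHERE n.nspname = 'public'
   AND t.relname = 'nonmanagednumbestfitrouteset'
   AND i.indisunique;
```

If this returns no rows, the table needs a primary key (or a unique NOT NULL index) before Slony-I can replicate it.]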
> 7. Subscribe the nodes to the set:
>
> [root at caotta000381 etc]# slonik_subscribe_set 1 2
> cluster name = ssm_cluster;
> node 1 admin conninfo='host=pgmaster dbname=ssm user=postgres port=5432';
> node 2 admin conninfo='host=pgslave1 dbname=ssm_slave user=postgres port=5432';
> try {
>     subscribe set (id = 1, provider = 1, receiver = 2, forward = yes);
> }
> on error {
>     exit 1;
> }
> echo 'Subscribed nodes to set 1';
>
> [root at caotta000381 etc]# slonik_subscribe_set 1 2 | slonik
> <stdin>:10: Subscribed nodes to set 1
>
> 8. When using the altperl scripts to set up the cluster, it isn't clear
> what tasks need to be performed on the slave node. From what I read, the
> only thing that needs to be done on the slave nodes is what I read here:
> http://linuxfinances.info/info/addthings.html, section 12.4 ("Adding A
> Node To Replication").
>
> 9. Since I was a little confused, I ran the following on the slave node,
> assuming that I was doing the right thing (in accordance with the above
> reference):
>
> cluster name = ssm_cluster;
> node 1 admin conninfo='host=pgmaster dbname=ssm user=postgres port=5432';
> node 2 admin conninfo='host=pgslave1 dbname=ssmslave user=postgres port=5432';
> try {
>     subscribe set (id = 1, provider = 1, receiver = 2, forward = yes);
> }
> on error {
>     exit 1;
> }
> echo 'Subscribed nodes to set 1';
>
> 10. I inserted some data into the master db.
>
> 11. I checked the slon logs and didn't see any inserts getting replicated,
> so I knew something was wrong.
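
[Before digging through logs, it can help to ask Slony-I itself what it thinks is configured. A sketch of catalog queries to run in psql on the master; the _ssm_cluster schema name comes from the cluster name above:

```sql
-- Which tables and sequences actually made it into the set
-- (if the create-set script aborted, this will be empty or short):
SELECT tab_id, tab_relname, tab_set FROM _ssm_cluster.sl_table;
SELECT seq_id, seq_relname, seq_set FROM _ssm_cluster.sl_sequence;

-- Whether node 2 really subscribed, and to which provider:
SELECT sub_set, sub_provider, sub_receiver, sub_active
  FROM _ssm_cluster.sl_subscription;

-- Per-receiver replication lag; a steadily growing st_lag_num_events
-- means events are queuing on the origin but not being applied:
SELECT st_origin, st_received, st_lag_num_events, st_lag_time
  FROM _ssm_cluster.sl_status;
```

An empty sl_table alongside an active sl_subscription row would explain the "no sets need syncing" messages below: the subscription exists, but the set contains nothing to replicate.]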
> 12. I ran check_slony_cluster.sh on both nodes:
>
> Slave:
> -bash-3.00$ ./check_slony_cluster.sh ssm_cluster ssm_slave pgslave1
> ERROR - 1 of 1 nodes not in sync
>
> Master:
> -bash-3.00$ ./check_slony_cluster.sh ssm_cluster ssm pgmaster
> OK - 1 nodes in sync
>
> Running yet another replication test script:
>
> -bash-3.00$ ./psql_replication_check.pl -h pgslave1 -d ssm_slave -p 5432 -U postgres
> ERROR:  relation "replication_status" does not exist
> -bash-3.00$ ./psql_replication_check.pl -h pgmaster -d ssm -p 5432 -U ssm
> ERROR:  relation "replication_status" does not exist
>
> At this point I have no idea where the problem lies. I can guess my way
> through based on the commands that I have run, but at this point I am not
> seeing any replication, nor any sign after manipulating the data that any
> events were triggering replication.
>
> Here is a snippet of the slon log for the slave node:
>
> <snippet>
> 2008-01-28 16:57:05 EST DEBUG2 remoteWorkerThread_2: forward confirm 1,1140 received by 2
> 2008-01-28 16:57:13 EST DEBUG2 syncThread: new sl_action_seq 1 - SYNC 1141
> 2008-01-28 16:57:13 EST DEBUG2 localListenThread: Received event 1,1141 SYNC
> 2008-01-28 16:57:16 EST DEBUG2 remoteListenThread_2: queue event 2,457 SYNC
> 2008-01-28 16:57:16 EST DEBUG2 remoteWorkerThread_2: Received event 2,457 SYNC
> 2008-01-28 16:57:16 EST DEBUG2 remoteWorkerThread_2: SYNC 457 processing
> 2008-01-28 16:57:16 EST DEBUG2 remoteWorkerThread_2: no sets need syncing for this event
> 2008-01-28 16:57:16 EST DEBUG2 remoteWorkerThread_2: forward confirm 1,1141 received by 2
> 2008-01-28 16:57:23 EST DEBUG2 syncThread: new sl_action_seq 1 - SYNC 1142
> 2008-01-28 16:57:23 EST DEBUG2 localListenThread: Received event 1,1142 SYNC
> 2008-01-28 16:57:27 EST DEBUG2 remoteListenThread_2: queue event 2,458 SYNC
> 2008-01-28 16:57:27 EST DEBUG2 remoteWorkerThread_2: Received event 2,458 SYNC
> 2008-01-28 16:57:27 EST DEBUG2 remoteWorkerThread_2: SYNC 458 processing
> 2008-01-28 16:57:27 EST DEBUG2 remoteWorkerThread_2: no sets need syncing for this event
> 2008-01-28 16:57:27 EST DEBUG2 remoteWorkerThread_2: forward confirm 1,1142 received by 2
> 2008-01-28 16:57:31 EST DEBUG2 remoteListenThread_2: queue event 2,459 SYNC
> 2008-01-28 16:57:31 EST DEBUG2 remoteWorkerThread_2: Received event 2,459 SYNC
> 2008-01-28 16:57:31 EST DEBUG2 remoteWorkerThread_2: SYNC 459 processing
> 2008-01-28 16:57:31 EST DEBUG2 remoteWorkerThread_2: no sets need syncing for this event
> </snippet>
>
> I also did a simple query on the table that I had added data to on both nodes:
>
> master:
> =========
> ssm=# select * from outboundrouteset;
>    outboundroutesetid   |     comments
> ------------------------+------------------
>  ATP_CMA_ROUTESET_ID_1  | ATP_TEST_COMMENT
>  ATP_CMB_ROUTESET_ID_1  | ATP_TEST_COMMENT
>  ATP_PSTN_ROUTESET_ID_1 | ATP_TEST_COMMENT
> (3 rows)
>
> slave:
> =========
> ssm_slave=# select * from outboundrouteset;
>  outboundroutesetid | comments
> --------------------+----------
> (0 rows)
>
> From what I can tell, there is no sign of any replication occurring, which I would
> expect to look something like this:
>
> <snippet>
> 2007-10-31 06:57:04 EDT DEBUG2 remoteHelperThread_1_1: inserts=530 updates=530 deletes=0
> 2007-10-31 06:57:04 EDT DEBUG2 remoteHelperThread_1_1: inserts=211 updates=211 deletes=0
> 2007-10-31 06:57:05 EDT DEBUG2 remoteHelperThread_1_1: inserts=197 updates=197 deletes=0
> 2007-10-31 06:57:06 EDT DEBUG2 remoteHelperThread_1_1: inserts=255 updates=255 deletes=0
> 2007-10-31 06:57:07 EDT DEBUG2 remoteHelperThread_1_1: inserts=116 updates=116 deletes=0
> </snippet>
>
> Any help in diagnosing replication failures would be greatly appreciated.
>
> Kind Rgds,
>
> Ouray
>
> --
> Ouray Viney
> https://www.viney.ca

-- 
Ouray Viney
https://www.viney.ca

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.slony.info/pipermail/slony1-general/attachments/20080128/ba5ae59d/attachment-0001.htm
More information about the Slony1-general mailing list