Thu Mar 25 08:41:26 PDT 2010
- Previous message: [Slony1-general] Diagnosing a possible problem with replication
- Next message: [Slony1-general] [slony1-general] initial copy incomplete when using 2.0.3rc2
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi, Steve!

Thanks a lot for your help! I did what you told me, and after that I noticed
something interesting: this was not the first time Slony had been installed on
these machines! Of course, after uninstalling, nobody ran
"DROP SCHEMA _dbprod_cluster CASCADE". I did that, reinstalled the cluster, and
now everything is working fine!

Thanks again for your help, and best regards,
HeCSa.

On Mon, Mar 22, 2010 at 11:28 PM, Steve Singer <ssinger at ca.afilias.info> wrote:

> Hernan Saltiel wrote:
>
>> Hi!
>> I configured a Slony cluster between two nodes: the master, srvdb01, and a
>> slave, srvdb02. The database is "dbprod".
>> Both nodes are CentOS 64 bits, with these postgres packages installed:
>>
>>     create set (id = 1, origin = 1,
>>                 comment = 'Base Productiva');
>>
>> (All the sets are here, more than 120...)
>
> You never mentioned where you added tables to your sets. Could you have
> 120 replication sets with 0 tables in each?
>
> Does
>
>     SELECT * FROM _mycluster.sl_table;
>
> show you anything interesting? (Is it empty, meaning your sets seem to have
> no tables?)
>
> Did you also issue 120 subscribe set requests, or did you only subscribe the
> first one? (If you tried subscribing all 120 at once, you might want to
> tear down the slony cluster and try it again, doing only the first set
> and waiting for it to finish before moving on. It is possible there are
> some race conditions that result from trying to subscribe to multiple sets
> concurrently.)
>
> You should also check to see if there are any locks being held on slony
> tables.
>
>> store node (id = 2, comment = 'Node 2');
>> store path (server = 1, client = 2,
>>             conninfo = 'dbname=$DB1 host=$H1 user=$U password=$P');
>> store path (server = 2, client = 1,
>>             conninfo = 'dbname=$DB2 host=$H2 user=$U password=$P');
>> store listen (origin = 1, provider = 1, receiver = 2);
>> store listen (origin = 2, provider = 2, receiver = 1);
>>
>> Then, executed the script.
>> On the master and slave nodes, I ran:
>>
>>     nohup slon dbprod_cluster "dbname=dbprod user=postgres" &
>>
>> After that, created the subscribe.sh script on the slave node:
>>
>>     #!/bin/sh
>>
>>     CLUSTER=dbprod_cluster
>>     DB1=dbprod
>>     DB2=dbprod
>>     H1=srvdb01
>>     H2=srvdb02
>>     U=postgres
>>     P=Secreta01
>>
>>     slonik <<_EOF_
>>
>>     cluster name = $CLUSTER;
>>
>>     node 1 admin conninfo = 'dbname=$DB1 host=$H1 user=$U password=$P';
>>     node 2 admin conninfo = 'dbname=$DB2 host=$H2 user=$U password=$P';
>>
>>     subscribe set (id = 1, provider = 1, receiver = 2, forward = yes);
>>
>>     _EOF_
>>
>> I ran that script, and saw several SYNC, LISTEN and UNLISTEN messages in
>> the nohup.out log file of the slon process.
>> After two days of seeing those messages without any row being replicated,
>> I am concerned: is this normal, because Slony needs to do something before
>> it starts replicating, or is there some way to tell whether something is
>> going wrong?
>>
>> Here are some rows of the master nohup.out file:
>>
>>     DEBUG2 remoteWorkerThread_2: SYNC 30755 processing
>>     DEBUG2 remoteWorkerThread_2: no sets need syncing for this event
>>     DEBUG2 syncThread: new sl_action_seq 11392 - SYNC 16232
>>     DEBUG2 remoteListenThread_2: queue event 2,30756 SYNC
>>     DEBUG2 remoteListenThread_2: queue event 2,30757 SYNC
>>     DEBUG2 remoteWorkerThread_2: Received event 2,30756 SYNC
>>     DEBUG2 calc sync size - last time: 1 last length: 8611 ideal: 6 proposed size: 3
>>     DEBUG2 remoteWorkerThread_2: SYNC 30757 processing
>>     DEBUG2 remoteWorkerThread_2: no sets need syncing for this event
>>     DEBUG2 localListenThread: Received event 1,16232 SYNC
>>     DEBUG2 syncThread: new sl_action_seq 11392 - SYNC 16233
>>     DEBUG2 remoteListenThread_2: queue event 2,30758 SYNC
>>     DEBUG2 remoteWorkerThread_2: Received event 2,30758 SYNC
>>     DEBUG2 calc sync size - last time: 2 last length: 8525 ideal: 14 proposed size: 5
>>     DEBUG2 remoteWorkerThread_2: SYNC 30758 processing
>>     DEBUG2 remoteWorkerThread_2: no sets need syncing for this event
>>     DEBUG2 remoteListenThread_2: queue event 2,30759 SYNC
>>     DEBUG2 remoteWorkerThread_2: Received event 2,30759 SYNC
>>     DEBUG2 calc sync size - last time: 1 last length: 2389 ideal: 25 proposed size: 3
>>     DEBUG2 remoteWorkerThread_2: SYNC 30759 processing
>>     DEBUG2 remoteWorkerThread_2: no sets need syncing for this event
>>     DEBUG2 localListenThread: Received event 1,16233 SYNC
>>     DEBUG2 syncThread: new sl_action_seq 11392 - SYNC 16234
>>     DEBUG2 localListenThread: Received event 1,16234 SYNC
>>     DEBUG2 remoteListenThread_2: queue event 2,30760 SYNC
>>     DEBUG2 remoteListenThread_2: queue event 2,30761 SYNC
>>     DEBUG2 remoteWorkerThread_2: Received event 2,30760 SYNC
>>     DEBUG2 calc sync size - last time: 1 last length: 8570 ideal: 7 proposed size: 3
>>     DEBUG2 remoteWorkerThread_2: SYNC 30761 processing
>>     DEBUG2 remoteWorkerThread_2: no sets need syncing for this event
>>     DEBUG2 syncThread: new sl_action_seq 11392 - SYNC 16235
>>     DEBUG2 remoteListenThread_2: queue event 2,30762 SYNC
>>     DEBUG2 remoteWorkerThread_2: Received event 2,30762 SYNC
>>     DEBUG2 calc sync size - last time: 2 last length: 8519 ideal: 14 proposed size: 5
>>     DEBUG2 remoteWorkerThread_2: SYNC 30762 processing
>>     DEBUG2 remoteWorkerThread_2: no sets need syncing for this event
>>     DEBUG2 remoteListenThread_2: queue event 2,30763 SYNC
>>     DEBUG2 remoteWorkerThread_2: Received event 2,30763 SYNC
>>     DEBUG2 calc sync size - last time: 1 last length: 2350 ideal: 25 proposed size: 3
>>     DEBUG2 remoteWorkerThread_2: SYNC 30763 processing
>>     DEBUG2 remoteWorkerThread_2: no sets need syncing for this event
>>     DEBUG2 localListenThread: Received event 1,16235 SYNC
>>
>> ...and here some of the slave:
>>
>>     DEBUG2 localListenThread: Received event 2,30773 SYNC
>>     DEBUG2 remoteListenThread_1: LISTEN
>>     DEBUG2 syncThread: new sl_action_seq 1 - SYNC 30774
>>     DEBUG2 localListenThread: Received event 2,30774 SYNC
>>     DEBUG2 remoteListenThread_1: queue event 1,16241 SYNC
>>     DEBUG2 remoteListenThread_1: UNLISTEN
>>     DEBUG2 syncThread: new sl_action_seq 1 - SYNC 30775
>>     DEBUG2 localListenThread: Received event 2,30775 SYNC
>>     DEBUG2 remoteListenThread_1: LISTEN
>>     DEBUG2 syncThread: new sl_action_seq 1 - SYNC 30776
>>     DEBUG2 localListenThread: Received event 2,30776 SYNC
>>     DEBUG2 syncThread: new sl_action_seq 1 - SYNC 30777
>>     DEBUG2 remoteListenThread_1: queue event 1,16242 SYNC
>>     DEBUG2 remoteListenThread_1: UNLISTEN
>>     DEBUG2 localListenThread: Received event 2,30777 SYNC
>>     DEBUG2 remoteListenThread_1: LISTEN
>>     DEBUG2 syncThread: new sl_action_seq 1 - SYNC 30778
>>     DEBUG2 localListenThread: Received event 2,30778 SYNC
>>     DEBUG2 remoteListenThread_1: queue event 1,16243 SYNC
>>     DEBUG2 remoteListenThread_1: UNLISTEN
>>     DEBUG2 syncThread: new sl_action_seq 1 - SYNC 30779
>>     DEBUG2 localListenThread: Received event 2,30779 SYNC
>>     DEBUG2 remoteListenThread_1: LISTEN
>>     DEBUG2 syncThread: new sl_action_seq 1 - SYNC 30780
>>     DEBUG2 localListenThread: Received event 2,30780 SYNC
>>     DEBUG2 syncThread: new sl_action_seq 1 - SYNC 30781
>>     DEBUG2 remoteListenThread_1: queue event 1,16244 SYNC
>>     DEBUG2 remoteListenThread_1: UNLISTEN
>>     DEBUG2 localListenThread: Received event 2,30781 SYNC
>>     DEBUG2 remoteListenThread_1: LISTEN
>>     DEBUG2 syncThread: new sl_action_seq 1 - SYNC 30782
>>     DEBUG2 localListenThread: Received event 2,30782 SYNC
>>     DEBUG2 remoteListenThread_1: queue event 1,16245 SYNC
>>     DEBUG2 remoteListenThread_1: UNLISTEN
>>     DEBUG2 syncThread: new sl_action_seq 1 - SYNC 30783
>>     DEBUG2 localListenThread: Received event 2,30783 SYNC
>>
>> I ran some queries against the _dbprod_cluster schema, following tips I
>> found on blogs, but I don't really know whether this indicates that things
>> are going normally or not.
>> Here are some of them:
>>
>>     select count(*) from _dbprod_cluster.sl_log_1;
>>
>>      count
>>     -------
>>      11392
>>     (1 row)
>>
>>     select count(*) from _dbprod_cluster.sl_log_2;
>>
>>      count
>>     -------
>>          0
>>     (1 row)
>>
>>     select st_lag_num_events from _dbprod_cluster.sl_status;
>>
>>      st_lag_num_events
>>     -------------------
>>                  16130
>>     (1 row)
>>
>> Could anybody help me understand what these numbers are telling me?
>> Thanks a lot in advance for your help!
>> Best regards,
>>
>> --
>> HeCSa
>>
>> _______________________________________________
>> Slony1-general mailing list
>> Slony1-general at lists.slony.info
>> http://lists.slony.info/mailman/listinfo/slony1-general
>
> --
> Steve Singer
> Afilias Canada
> Data Services Developer
> 416-673-1142

--
HeCSa
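[Editor's note] The fix HeCSa describes — removing the schema left behind by an
earlier Slony install before rebuilding the cluster — can be sketched as the
SQL below. This is a hedged example, not part of the original thread: it assumes
the `_<clustername>` schema-naming convention used throughout this thread, and
the catalog query is standard PostgreSQL.

```sql
-- Look for leftover Slony cluster schemas from a previous install
-- (Slony-I creates one schema named after the cluster, e.g. _dbprod_cluster)
SELECT nspname
  FROM pg_namespace
 WHERE nspname LIKE '\_%' ESCAPE '\';

-- If the old cluster schema is still present, drop it before reinstalling
DROP SCHEMA _dbprod_cluster CASCADE;
```

Run this on every node that previously hosted the cluster; a stale schema on
any node can leave the new install in the confused state described above.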
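[Editor's note] Steve's two diagnostics — checking whether the replication sets
actually contain tables, and checking for locks held on Slony objects — can be
sketched as the queries below. These are illustrative assumptions, not quoted
from the thread: `sl_table.tab_set` is the set-membership column in Slony-I's
schema, and `pg_locks`/`pg_class` are standard PostgreSQL catalogs; verify the
column names against the installed Slony version.

```sql
-- How many tables does each replication set contain?
-- A set missing from this result has no tables, so there is nothing to copy.
SELECT tab_set, count(*) AS tables
  FROM _dbprod_cluster.sl_table
 GROUP BY tab_set
 ORDER BY tab_set;

-- Are any sessions holding (or waiting on) locks on Slony objects?
SELECT l.pid, c.relname, l.mode, l.granted
  FROM pg_locks l
  JOIN pg_class c ON c.oid = l.relation
  JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE n.nspname = '_dbprod_cluster';
```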