Alain Peyrat aljeux at free.fr
Fri May 4 08:16:07 PDT 2007
Hi,

I tried to change the default value to 100ms to test, but this has no impact on
the lag, it is still increasing. The st_last_received_event_ts does not change.

Here is the logs on the slave host (-d 3):

2007-05-04 17:12:12 CEST CONFIG main: slon version 1.2.6 starting up
2007-05-04 17:12:12 CEST DEBUG2 slon: watchdog process started
2007-05-04 17:12:12 CEST DEBUG2 slon: watchdog ready - pid = 22165
2007-05-04 17:12:12 CEST DEBUG2 slon: worker process created - pid = 22166
2007-05-04 17:12:12 CEST CONFIG main: local node id = 2
2007-05-04 17:12:12 CEST DEBUG2 main: main process started
2007-05-04 17:12:12 CEST CONFIG main: launching sched_start_mainloop
2007-05-04 17:12:12 CEST CONFIG main: loading current cluster configuration
2007-05-04 17:12:12 CEST CONFIG storeNode: no_id=1 no_comment='Node 1'
2007-05-04 17:12:12 CEST DEBUG2 setNodeLastEvent: no_id=1 event_seq=629312
2007-05-04 17:12:12 CEST CONFIG storePath: pa_server=1 pa_client=2
pa_conninfo="dbname=gforge host=xxx user=slonik" pa_connretry=10
2007-05-04 17:12:12 CEST CONFIG storeListen: li_origin=1 li_receiver=2
li_provider=1
2007-05-04 17:12:12 CEST CONFIG storeSet: set_id=1 set_origin=1 set_comment='All
gforge tables'
2007-05-04 17:12:12 CEST WARN   remoteWorker_wakeup: node 1 - no worker thread
2007-05-04 17:12:12 CEST DEBUG2 sched_wakeup_node(): no_id=1 (0 threads + worker
signaled)
2007-05-04 17:12:12 CEST CONFIG storeSubscribe: sub_set=1 sub_provider=1
sub_forward='f'
2007-05-04 17:12:12 CEST WARN   remoteWorker_wakeup: node 1 - no worker thread
2007-05-04 17:12:12 CEST DEBUG2 sched_wakeup_node(): no_id=1 (0 threads + worker
signaled)
2007-05-04 17:12:12 CEST CONFIG enableSubscription: sub_set=1
2007-05-04 17:12:12 CEST WARN   remoteWorker_wakeup: node 1 - no worker thread
2007-05-04 17:12:12 CEST DEBUG2 sched_wakeup_node(): no_id=1 (0 threads + worker
signaled)
2007-05-04 17:12:13 CEST DEBUG2 main: last local event sequence = 713075
2007-05-04 17:12:13 CEST CONFIG main: configuration complete - starting threads
2007-05-04 17:12:13 CEST DEBUG1 localListenThread: thread starts
NOTICE:  Slony-I: cleanup stale sl_nodelock entry for pid=22155
2007-05-04 17:12:13 CEST CONFIG enableNode: no_id=1
2007-05-04 17:12:13 CEST DEBUG1 main: running scheduler mainloop
2007-05-04 17:12:13 CEST DEBUG1 remoteWorkerThread_1: thread starts
2007-05-04 17:12:13 CEST DEBUG1 remoteListenThread_1: thread starts
2007-05-04 17:12:13 CEST DEBUG2 remoteListenThread_1: start listening for event
origin 1
2007-05-04 17:12:13 CEST DEBUG1 cleanupThread: thread starts
2007-05-04 17:12:13 CEST DEBUG1 syncThread: thread starts
2007-05-04 17:12:13 CEST DEBUG1 remoteWorkerThread_1: helper thread for provider
1 created
2007-05-04 17:12:13 CEST DEBUG2 syncThread: new sl_action_seq 1 - SYNC 713076
2007-05-04 17:12:13 CEST DEBUG1 remoteListenThread_1: connected to
'dbname=gforge host=xxx user=slonik'
2007-05-04 17:12:13 CEST DEBUG2 localListenThread: Received event 2,713076 SYNC
2007-05-04 17:12:23 CEST DEBUG2 syncThread: new sl_action_seq 1 - SYNC 713077
2007-05-04 17:12:24 CEST DEBUG2 localListenThread: Received event 2,713077 SYNC
2007-05-04 17:12:33 CEST DEBUG2 syncThread: new sl_action_seq 1 - SYNC 713078
2007-05-04 17:12:33 CEST DEBUG2 localListenThread: Received event 2,713078 SYNC
2007-05-04 17:12:41 CEST DEBUG2 syncThread: new sl_action_seq 1 - SYNC 713079
2007-05-04 17:12:42 CEST DEBUG2 localListenThread: Received event 2,713079 SYNC
2007-05-04 17:12:49 CEST DEBUG2 syncThread: new sl_action_seq 1 - SYNC 713080
2007-05-04 17:12:50 CEST DEBUG2 localListenThread: Received event 2,713080 SYNC
2007-05-04 17:12:58 CEST DEBUG2 syncThread: new sl_action_seq 1 - SYNC 713081
2007-05-04 17:12:58 CEST DEBUG2 localListenThread: Received event 2,713081 SYNC
2007-05-04 17:13:06 CEST DEBUG2 syncThread: new sl_action_seq 1 - SYNC 713082
2007-05-04 17:13:07 CEST DEBUG2 localListenThread: Received event 2,713082 SYNC
2007-05-04 17:13:14 CEST DEBUG2 syncThread: new sl_action_seq 1 - SYNC 713083
[...]

Any idea ?

Thanks,
Alain.

Selon Christopher Browne <cbbrowne at ca.afilias.info>:

> Alain Peyrat <aljeux at free.fr> writes:
> > Hi,
> >
> > I'm running a simple replication between a master and a slave. The
> > slave is planned as a switchover host in case of big troubles on
> > the server (ie: slave can be stopped but not the master).
> >
> > My environnment:
> >
> > Postgres: 7.4
> > Slony: 1.2.6
> >
> > It seems that my replication is no more active and the tables
> > sl_log_1/2 are now very big (sl_log_1 is around 6GB and 3.3e+06
> > tuples). Value st_lag_time is very important.
> >
> > I have tried to restart the slon (master and slave) but the
> > st_last_received_event_ts is not changing (or only once for 3 sec).
> >
> > On the master slon log, I have (1 is master, 2 is slave).
> >
> > 2007-05-04 15:54:31 CEST DEBUG2 remoteListenThread_2: queue event 2,712628
> SYNC
> > 2007-05-04 15:54:31 CEST DEBUG2 remoteListenThread_2: UNLISTEN
> > 2007-05-04 15:54:31 CEST DEBUG2 remoteWorkerThread_2: Received event
> 2,712628
> > SYNC
> > 2007-05-04 15:54:31 CEST DEBUG3 calc sync size - last time: 1 last length:
> 10166
> > ideal: 5 proposed size: 3
> > 2007-05-04 15:54:31 CEST DEBUG2 remoteWorkerThread_2: SYNC 712628
> processing
> > 2007-05-04 15:54:31 CEST DEBUG2 remoteWorkerThread_2: no sets need syncing
> for
> > this event
> > 2007-05-04 15:54:32 CEST DEBUG2 localListenThread: Received event 1,1553949
> SYNC
> > 2007-05-04 15:54:32 CEST DEBUG2 syncThread: new sl_action_seq 6356493 -
> SYNC
> > 1553950
> >
> > So, I'm seeing line with 'no sets need syncing for this event'.
> >
> > Can someone help me understanding what's wrong ?
>
> You may find it useful to look at the following page for explanations for log
> entries:
>
> <http://slony.info/adminguide/slony1-1.2.6/doc/adminguide/loganalysis.html>
>
> The interesting log is not the one for the origin node; you can expect
> it to be mighty sparse, and when that node receives events from other
> nodes, it is very much to be expected that it will indicate "no sets
> need syncing for this event."
>
> More interesting will be the contents of the log file for the
> subscriber node; THAT will show a lot more action, and show log
> entries describing the work being done to pull and apply updates from
> the origin.
>
> But none of this is showing off anything that would be considered a
> "problem," and you won't see anything in the logs that will make it
> evident why you can't get the lag time down.
>
> The relevant slon option is "sync_interval_timeout", which is either
> controlled via that name in a "slon.conf" file, or via the command
> line "-s" option, which determines how long the slon sleeps between
> checking to see if there have been updates made that would warrant
> generating a SYNC event.
>
> I believe that the default for that is 10000, which represents 10s.
> If you reduced the value, that would cause SYNC events to be generated
> more frequently, which I would expect to reduce st_lag_time for you.
> --
> "cbbrowne","@","ca.afilias.info"
> <http://dba2.int.libertyrms.com/>
> Christopher Browne
> (416) 673-4124 (land)
>




More information about the Slony1-general mailing list