Christopher Browne cbbrowne
Mon Oct 16 11:11:40 PDT 2006
Junaili Lie wrote:
> This is the result from test_slony_state-dbi.pl
>
> DSN: dbi:Pg:dbname=MONSOON;host=localhost;port=5432
>
Pulling out some bits...
>
> --------------------------------------------------------------------------------
> Summary of event info
>  Origin  Min SYNC  Max SYNC Min SYNC Age Max SYNC Age
> ================================================================================
>
>       2      6872      6879     00:00:00     00:14:00    0
>       1     13934     13947     00:00:00     00:13:00    0
>
>
> ---------------------------------------------------------------------------------
> Summary of sl_confirm aging
>    Origin   Receiver   Min SYNC   Max SYNC  Age of latest SYNC  Age of eldest SYNC
> =================================================================================
>
>         1          2      13934      13946      00:01:00     00:13:00    0
>         2          1       6872       6879      00:00:00     00:14:00    0
>
>

Well, neither sl_confirm nor sl_event holds any elderly entries, which
indicates that events are flowing properly between the nodes.  That
rules out one major class of problems.

It doesn't look as though anything is broken with your cluster that
would prevent cleaning out sl_log_1.

The next suggestion:  Grep for cleanup thread entries in the logs for
both slons.

Here's a sample excerpt from a not-too-busy set of nodes.  Note the
longer times at 16:30/16:31; evidently there was a "blip" of extra
data to clean out.

2006-10-16 16:09:39 UTC DEBUG1 cleanupThread:    0.046 seconds for delete logs
2006-10-16 16:20:23 UTC DEBUG1 cleanupThread:    0.115 seconds for cleanupEvent()
2006-10-16 16:20:23 UTC DEBUG1 cleanupThread:    0.221 seconds for delete logs
2006-10-16 16:30:43 UTC DEBUG1 cleanupThread:    0.068 seconds for cleanupEvent()
2006-10-16 16:30:46 UTC DEBUG1 cleanupThread:    3.914 seconds for delete logs
2006-10-16 16:31:08 UTC DEBUG2 cleanupThread:   21.869 seconds for vacuuming
2006-10-16 16:41:34 UTC DEBUG1 cleanupThread:    0.081 seconds for cleanupEvent()
2006-10-16 16:41:34 UTC DEBUG1 cleanupThread:    0.066 seconds for delete logs
2006-10-16 16:52:03 UTC DEBUG1 cleanupThread:    0.069 seconds for cleanupEvent()
2006-10-16 16:52:03 UTC DEBUG1 cleanupThread:    0.059 seconds for delete logs
2006-10-16 17:02:36 UTC DEBUG1 cleanupThread:    0.070 seconds for cleanupEvent()
2006-10-16 17:02:36 UTC DEBUG1 cleanupThread:    0.059 seconds for delete logs
2006-10-16 17:02:37 UTC DEBUG2 cleanupThread:    0.841 seconds for vacuuming
2006-10-16 17:13:02 UTC DEBUG1 cleanupThread:    0.070 seconds for cleanupEvent()
2006-10-16 17:13:02 UTC DEBUG1 cleanupThread:    0.123 seconds for delete logs
2006-10-16 17:23:19 UTC DEBUG1 cleanupThread:    0.079 seconds for cleanupEvent()
2006-10-16 17:23:19 UTC DEBUG1 cleanupThread:    0.081 seconds for delete logs
2006-10-16 17:33:37 UTC DEBUG1 cleanupThread:    0.073 seconds for cleanupEvent()
2006-10-16 17:33:37 UTC DEBUG1 cleanupThread:    0.411 seconds for delete logs
2006-10-16 17:33:45 UTC DEBUG2 cleanupThread:    7.732 seconds for vacuuming
2006-10-16 17:44:29 UTC DEBUG1 cleanupThread:    0.116 seconds for cleanupEvent()
2006-10-16 17:44:29 UTC DEBUG1 cleanupThread:    0.058 seconds for delete logs
2006-10-16 17:55:03 UTC DEBUG1 cleanupThread:    0.071 seconds for cleanupEvent()
2006-10-16 17:55:03 UTC DEBUG1 cleanupThread:    0.059 seconds for delete logs
2006-10-16 18:05:40 UTC DEBUG1 cleanupThread:    0.071 seconds for cleanupEvent()
2006-10-16 18:05:40 UTC DEBUG1 cleanupThread:    0.063 seconds for delete logs
2006-10-16 18:05:51 UTC DEBUG2 cleanupThread:   11.087 seconds for vacuuming
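
If eyeballing the grep output gets tedious, a small script can pick out
the slow steps for you.  What follows is only a rough sketch in Python,
not anything shipped with Slony-I: the function name slow_cleanup_steps
and the 1-second threshold are made up for illustration, and the
pattern assumes each entry sits on a single line in the real log file,
shaped like the sample excerpt above.  Adjust it if your slon logs
carry a different prefix.

```python
import re

# Pattern for lines shaped like the sample excerpt above.  This is an
# assumption about the log format; adjust it to match your own logs.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \S+ DEBUG\d "
    r"cleanupThread:\s+([\d.]+) seconds for (.+?)\s*$"
)

def slow_cleanup_steps(lines, threshold=1.0):
    """Return (timestamp, step, seconds) for steps slower than threshold.

    The name and the default threshold are hypothetical choices.
    """
    slow = []
    for line in lines:
        m = LINE_RE.match(line)
        if m and float(m.group(2)) > threshold:
            slow.append((m.group(1), m.group(3), float(m.group(2))))
    return slow

# Three entries copied from the excerpt above, each on one line.
sample = [
    "2006-10-16 16:30:43 UTC DEBUG1 cleanupThread:    0.068 seconds for cleanupEvent()",
    "2006-10-16 16:30:46 UTC DEBUG1 cleanupThread:    3.914 seconds for delete logs",
    "2006-10-16 16:31:08 UTC DEBUG2 cleanupThread:   21.869 seconds for vacuuming",
]

for ts, step, secs in slow_cleanup_steps(sample):
    print(ts, step, secs)
```

Run against the full excerpt, this would single out the 16:30/16:31
entries along with the other vacuuming steps that took over a second.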

A problem could arise if your slon processes aren't "living" long enough
to complete a cleanup cycle; in that case you won't see these sorts of
steps complete in the logs.  (They should show up about every 10 minutes.)
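
To check the "about every 10 minutes" expectation against your own
logs, you could compare the timestamps of successive cleanup entries.
Here is a minimal sketch in Python under some assumptions: the function
name cleanup_gaps and the slack factor of 2 are hypothetical, the
timestamps are taken to be in the "YYYY-MM-DD HH:MM:SS" form seen in
the log lines, and the last timestamp in the example is invented to
show what a missed cleanup would look like.

```python
from datetime import datetime, timedelta

def cleanup_gaps(timestamps, expected=timedelta(minutes=10), slack=2.0):
    """Return (previous, current, gap) where the gap exceeds slack * expected.

    The function name and the slack factor are hypothetical choices.
    """
    times = sorted(
        datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in timestamps
    )
    limit = expected * slack
    return [(a, b, b - a) for a, b in zip(times, times[1:]) if b - a > limit]

# The first three timestamps come from "delete logs" entries in the
# sample excerpt; the last one is a made-up late run for illustration.
runs = [
    "2006-10-16 16:09:39",
    "2006-10-16 16:20:23",
    "2006-10-16 16:30:46",
    "2006-10-16 17:15:00",
]

for prev, cur, gap in cleanup_gaps(runs):
    print(f"no cleanup seen between {prev} and {cur} (gap {gap})")
```

A long gap reported here would suggest that the slon was restarted or
wasn't running during that stretch.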

It could be that you did a whole lot of updates a little while ago, and
there's simply some waiting before the log tables get cleaned out.


