[Slony1-general] Figuring out replication is finished / replicas are same

Tue Feb 8 16:47:07 PST 2005

On Feb 8, 2005, at 10:42 AM, Christopher Browne wrote:

> 1.  You should leave the slon running against the origin node all the 
> time, including when it is undergoing the heavy processing.
>
> If you shut it off, you'll discover that all of the changes during 
> that 30 minute period are treated as one really big SYNC, and things 
> will behave badly when you turn on the "subscriber" slon as it tries 
> to grab all the data at once.
>

I will second this motion.  I did something to my DB last week that 
caused the server to run out of file handles globally, causing an 
unclean termination of all connections followed by a recovery on the 
origin.  I neglected to restart the slon talking to the origin for 
about 13 hours (can someone say "need better monitoring scripts"?).  It 
took about 5 days to catch up, and the DB was dog slow the whole time,
so much so that we were having trouble doing work.
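
For what it's worth, here is a minimal sketch of the kind of monitoring 
check I should have had.  It is Python and only an illustration: it 
assumes slon was started the usual way, as "slon [options] clustername 
conninfo", and the cluster name and conninfo fragment you pass in are 
placeholders for your own.  The helper names are made up for this 
example.

```python
import subprocess


def slon_running(ps_lines, cluster, conninfo_fragment):
    """Return True if some process line looks like a slon daemon
    started for the given cluster and node.

    ps_lines          -- one command line per entry (e.g. from `ps -eo args`)
    cluster           -- Slony cluster name (placeholder, use your own)
    conninfo_fragment -- text that identifies the node, e.g. "host=origin"
    """
    for line in ps_lines:
        fields = line.split()
        if not fields:
            continue
        # The first field is the executable; accept "slon" or ".../slon".
        is_slon = fields[0] == "slon" or fields[0].endswith("/slon")
        if is_slon and cluster in fields and conninfo_fragment in line:
            return True
    return False


def current_process_lines():
    """Fetch the command line of every process via POSIX ps."""
    result = subprocess.run(
        ["ps", "-eo", "args"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()
```

Run from cron, something like 
slon_running(current_process_lines(), "mycluster", "host=origin") 
returning False would be the cue to send an alert.  It only proves the 
process exists, not that it is keeping up, so treat it as a first line 
of defense.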

[ ... ]
> 2.  You should try to keep the slons running most of the time so that 
> the systems are largely kept in sync and so that there is not a large 
> buildup of rows in sl_log_1 and sl_seqlog.
>
> If you shut off replication for days at a time, those tables will 
> often build up, and performance will be questionable.
>

In the end, after I was all caught up, I had to run VACUUM FULL on the
sl_log_1 table (and reindex it for good measure) on the subscriber.
There were millions of dead tuples.  Curiously, I had to kill the
subscriber slon before the vacuum could reclaim them, even though it
wasn't in a transaction.  Either that or it was a coincidence that the
second vacuum got to reclaim the rows :-)
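
For reference, a small sketch (Python again, purely illustrative) that 
builds the statements I ran.  It relies on Slony keeping its tables in 
a schema named after the cluster with a leading underscore; the cluster 
name "mycluster" and the function name are placeholders.

```python
def slony_log_maintenance_sql(cluster, tables=("sl_log_1",)):
    """Build VACUUM FULL / REINDEX statements for Slony log tables.

    cluster -- Slony cluster name; its schema is "_<cluster>"
    tables  -- table names inside that schema (default: sl_log_1 only)
    """
    # Quote the schema name, doubling any embedded double quotes.
    schema = '"_' + cluster.replace('"', '""') + '"'
    statements = []
    for table in tables:
        statements.append("VACUUM FULL {}.{};".format(schema, table))
        statements.append("REINDEX TABLE {}.{};".format(schema, table))
    return statements


# Print the statements so they can be reviewed and fed to psql by hand.
for stmt in slony_log_maintenance_sql("mycluster"):
    print(stmt)
```

Note that VACUUM cannot run inside a transaction block, so these need 
to be issued one at a time rather than wrapped in BEGIN/COMMIT, and 
VACUUM FULL takes an exclusive lock on the table while it runs.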

Performance is again excellent, if not better than before.  Perhaps a 
vacuum full of the slony tables after the initial copy is a good idea?