[Slony1-general] Slony replication falling behind

Mon Dec 20 09:28:10 PST 2010

On 12/20/2010 11:50 AM, John Cheng wrote:
> Hi,
>
> We have a production system where our Slony replication process (one
> master to one slave) appears to be falling behind. The sl_log_1 table
> on the master keeps increasing in size while the slave is consistently
> 100% utilized. The slave appears to be executing update/insert/delete
> statements that have been queued up by Slony.
>
> I'm not sure how to go about troubleshooting whether slony is just
> falling behind or if it is not replicating at all.
>
> I suspect a new nightly process that batch deletes and loads hundreds
> of thousands of records could've caused this behavior,

This is a known problem and will be addressed in version 2.1. At this 
point I unfortunately cannot tell when exactly 2.1 will be released.

See bug 167 here: http://www.slony.info/bugzilla/show_bug.cgi?id=167

We are also considering backpatching this into 2.0, but that is a rather 
drastic change to a STABLE branch.

> ... and the tables
> involved in this process should not be replicated by Slony. If I were
> to remove this table from the replication set (using the SET DROP
> TABLE slonik script), would that automatically help Slony replication
> catch up? How does Slony handle pending updates when you remove an
> affected table from the replication set?

Slony would not propagate that SET DROP TABLE until it has caught up 
with the backlog.
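
To tell whether the subscriber is merely behind or not moving at all, one
option is to watch the sl_status view on the origin over a few minutes. The
following is only a sketch: the cluster name "mycluster" is a placeholder,
and the helper just builds the SQL text to run in psql or through any
PostgreSQL driver.

```python
def lag_query(cluster: str) -> str:
    """Return SQL that reports per-subscriber lag from the sl_status view.

    Assumption: the Slony schema is named "_<cluster>", which is the usual
    convention; "cluster" is whatever name the cluster was created with.
    """
    # Quote the schema name so mixed-case cluster names still resolve.
    schema = '"_' + cluster.replace('"', '""') + '"'
    return (
        "SELECT st_origin, st_received, st_lag_num_events, st_lag_time "
        f"FROM {schema}.sl_status;"
    )


if __name__ == "__main__":
    # "mycluster" is a hypothetical cluster name used for illustration.
    print(lag_query("mycluster"))
```

If st_lag_num_events shrinks between runs, the subscriber is catching up,
only slowly; if it keeps growing and st_lag_time keeps increasing, it is
falling further behind or stuck.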

You could note the table ID from sl_table, remove it from the set via 
SET DROP TABLE and then manually delete all rows from sl_log_1 and 
sl_log_2 that belong to that log_tableid. This should not cause any ill 
side effects.
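
A rough sketch of those steps follows. The cluster name, schema name, table
name and IDs are all placeholders, and the helper only generates the
statements so they can be reviewed before anything is run. It assumes the
Slony schema is named "_<cluster>" and that the deletes are run on the node
whose log tables hold the backlog.

```python
from typing import List


def slony_schema(cluster: str) -> str:
    """Quoted Slony schema name, assuming the usual "_<cluster>" convention."""
    return '"_' + cluster.replace('"', '""') + '"'


def table_id_lookup(cluster: str) -> str:
    """Step 1: SQL to find the Slony table ID.

    The two %s placeholders (schema name, table name) are meant to be bound
    by the database driver rather than interpolated into the string.
    """
    return (
        f"SELECT tab_id FROM {slony_schema(cluster)}.sl_table "
        "WHERE tab_nspname = %s AND tab_relname = %s;"
    )


def drop_table_slonik(origin_node: int, table_id: int) -> str:
    """Step 2: slonik command removing the table from its set."""
    return f"SET DROP TABLE (ORIGIN = {int(origin_node)}, ID = {int(table_id)});"


def log_cleanup(cluster: str, table_id: int) -> List[str]:
    """Step 3: SQL deleting the queued rows for that table from both logs."""
    schema = slony_schema(cluster)
    tab_id = int(table_id)  # reject anything that is not an integer ID
    return [
        f"DELETE FROM {schema}.{log} WHERE log_tableid = {tab_id};"
        for log in ("sl_log_1", "sl_log_2")
    ]


if __name__ == "__main__":
    # Hypothetical values: cluster "mycluster", origin node 1, table ID 42.
    print(table_id_lookup("mycluster"))
    print(drop_table_slonik(1, 42))
    for stmt in log_cleanup("mycluster", 42):
        print(stmt)
```

The slonik line still needs the usual preamble (cluster name and admin
conninfo) in front of it.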

Jan

-- 
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin