Vick Khera vivek at khera.org
Tue Apr 16 06:10:58 PDT 2013
For years, I have run slony (currently version 2.1.3) with one origin and
one replica. Every night at midnight, I run vacuum analyze on the whole DB.
I still run autovacuum with its default settings. The midnight vacuum
takes approximately 4.5 hours to run.

All was well until I upgraded the DB from 8.4 to 9.2.x and, at the same
time, upgraded slony from 2.0. Now, every night, replication stalls
midway through the vacuum.

Here is what I observe:

midnight - vacuum starts on origin and replica
3:12am - replication delay reaches > 7 minutes
3:15am - replication delay = 624 seconds
3:30am - replication delay = 1524 seconds
3:45am - replication delay = 2423 seconds
... basically replication has stopped
4:30am - replication delay = 5124 seconds
4:40am - vacuum ends on replica
4:41am - vacuum ends on origin
4:45am - replication delay = 1018 seconds
4:49am - replication lag drops to under 5 minutes (I consider this
recovered)
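
As a rough check on the "replication has stopped" reading, here is a small
Python sketch that computes how fast the lag grows between the samples
listed above. It is only arithmetic on the numbers in this message; the
function name and sample layout are made up for illustration and are not
part of slony or of my monitoring. If the lag grows by about one second per
second of wall-clock time, nothing is being applied on the replica.

```python
from datetime import datetime

# Lag samples copied from the timeline above: (wall-clock time, lag in seconds).
samples = [
    ("03:15", 624),
    ("03:30", 1524),
    ("03:45", 2423),
    ("04:30", 5124),
]


def lag_growth_rates(samples):
    """Return lag growth per elapsed second between consecutive samples."""
    rates = []
    for (t0, lag0), (t1, lag1) in zip(samples, samples[1:]):
        elapsed = (
            datetime.strptime(t1, "%H:%M") - datetime.strptime(t0, "%H:%M")
        ).total_seconds()
        rates.append((lag1 - lag0) / elapsed)
    return rates


rates = lag_growth_rates(samples)
for (t0, _), (t1, _), rate in zip(samples, samples[1:], rates):
    print(f"{t0} -> {t1}: lag grew {rate:.3f} s per second")
```

Every interval comes out at roughly 1 second of lag per second elapsed,
which is what a complete stall looks like, as opposed to the replica merely
falling behind a heavy write load.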


At no other time during the day, even when the DB is very busy doing lots
of writes and a fair number of reads, does the replication lag exceed 5 to
10 seconds.

I have another reasonably large DB on another pair of machines that runs
a similar nightly vacuum. It also runs slony 2.1, but it replicates from
an 8.3 origin to a 9.1 replica. I never see any massive replication delay
there.

So my instinct is that some change in 9.2 is tripping slony up and causing
it to lock something for far too long. I would appreciate any guidance on
figuring out what that is, so I can avoid long replication delays while
vacuum is running.