[Slony1-general] Slony lag times

Thu Aug 2 12:17:46 PDT 2007

On 8/2/07, Dmitry Koterov <dmitry at koterov.ru> wrote:
>
> Aagrh, everybody says these words - "synchronous replication", but these
> words are useless in practice. I suppose there is NO good solution for
> Postgres synchronous replication for REALLY heavy-loaded systems, and saying
> "you possibly need a synchronous replication" is equal to saying "you must
> be rich and almighty".

No, it's more like saying "You're trying to get this tool to do something
it's not designed to do. Here is the formal term used to describe the tool
you seem to want." If you insist on using PostgreSQL then... no, such a tool
doesn't currently exist (at least not in a state that people are
recommending it for production). That is _not_ useless in practice. When
designing a system, it is valuable to know what is _not_ possible with a
given tool.

> Lag of 1-10 seconds is really usual. It's an unavoidable evil for slony,
> unfortunately.

I agree that, given the existing design, a lag is unavoidable. Hence the
term asynchronous. Exactly how is this evil? It's _exactly_ what the system
is designed to be.

> Try to read time-critical data from the master.
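
That suggestion can be sketched in application code. The following is a minimal illustration, assuming the application already has a lag figure per subscriber (for instance taken from st_lag_time in sl_status). The function name, the node names and the staleness budget are all hypothetical, not part of slony.

```python
def pick_node(max_staleness_s, replica_lag_s, master="master"):
    """Pick a node to read from.

    max_staleness_s: how stale (in seconds) the read is allowed to be.
    replica_lag_s:   dict of replica name -> current lag in seconds
                     (assumed to be collected elsewhere by the application).
    Returns the least-lagged replica that fits the budget, else the master.
    """
    fresh = {name: lag for name, lag in replica_lag_s.items()
             if lag <= max_staleness_s}
    if not fresh:
        # No replica is fresh enough: time-critical reads go to the master.
        return master
    return min(fresh, key=fresh.get)


if __name__ == "__main__":
    lags = {"slave1": 1.5, "slave2": 8.0}
    print(pick_node(0, lags))  # a read that tolerates no staleness
    print(pick_node(5, lags))  # a read that tolerates 5 seconds
```

Reads that cannot tolerate any staleness always end up on the master in this sketch, which is the point of the suggestion: only the origin is guaranteed to be current.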
>
> On 8/2/07, Andrew Hammond <andrew.george.hammond at gmail.com> wrote:
>
> > On 7/30/07, Laurent Raufaste <laurent at over-blog.com> wrote:
> > >
> > > Hi,
> > >
> > > We are using Slony on a production environment and are very pleased by
> > > it.
> > >
> > > Our cluster is made of 1 master, 4 slaves that need to be replicated
> > > fast, and 2 slaves for which the replication speed isn't a problem.
> > >
> > > Here's our issue: In the sl_status view I notice that the st_lag_time is
> > > always between 1 and many seconds: it goes up to 10 seconds regularly,
> > > and approximately once a day there is always a slave reaching 1 min,
> > > for example while vacuuming.
> >
> >
> > 1 to 10 seconds is pretty good. If there aren't any changes to tables on
> > the origin then SYNC events are only generated every 10 seconds (by
> > default). Do you have enough DML (inserts, updates and deletes) traffic to
> > keep the syncs flowing? Even if there are plenty of changes to record, by
> > default, syncs are only generated every 2 seconds, so any lag time of less
> > than 2 seconds is effectively as up to date as possible.
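
To make that arithmetic concrete, here is a toy model of SYNC generation. It is a sketch under simplifying assumptions, not slony's actual logic: at each check tick a SYNC is emitted if a write happened since the previous tick, or if the timeout has elapsed since the last SYNC. The function names are hypothetical; the 2 second and 10 second figures are the defaults quoted above.

```python
def sync_times(write_times_ms, check_interval_ms, timeout_ms, horizon_ms):
    """Return the tick times (ms) at which the toy model emits a SYNC."""
    writes = sorted(write_times_ms)
    syncs = []
    last_sync = 0
    idx = 0
    for tick in range(check_interval_ms, horizon_ms + 1, check_interval_ms):
        # Consume every write that happened up to and including this tick.
        dirty = False
        while idx < len(writes) and writes[idx] <= tick:
            dirty = True
            idx += 1
        # Emit on new changes, or on timeout when the origin is idle.
        if dirty or tick - last_sync >= timeout_ms:
            syncs.append(tick)
            last_sync = tick
    return syncs


def max_gap_ms(syncs):
    """Largest interval between consecutive SYNCs, counting from time 0."""
    points = [0] + list(syncs)
    return max((b - a for a, b in zip(points, points[1:])), default=0)


if __name__ == "__main__":
    idle = sync_times([], 2000, 10000, 30000)
    busy = sync_times(range(500, 30001, 500), 2000, 10000, 30000)
    print(max_gap_ms(idle))  # 10000: an idle origin only syncs on timeout
    print(max_gap_ms(busy))  # 2000: a busy origin syncs at every check
```

In this model the reported lag can never look better than the gap between SYNCs, which is why a reading below the check interval is as current as the subscriber can be.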
> >
> > > I tried playing with the following options:
> > >      -s <milliseconds>     SYNC check interval (default 10000)
> > >      -t <milliseconds>     SYNC interval timeout (default 60000)
> > >      -o <milliseconds>     desired subscriber SYNC processing time
> > >      -g <num>              maximum SYNC group size (default 6)
> > >
> > > Now on the master I have:
> > > -s 1000 -g 50
> > > On the fast slaves I have:
> > > -s 1000
> > > And on the slow slaves:
> > > -s 10000 -g 10
> >
> >
> > Are you running slony 1.2? Those defaults look wrong. If you aren't
> > running 1.2.latest then you should consider upgrading.
> >
> >
> > > I tried lowering the SYNC check interval to 500ms with no real effect,
> > > and the master is already loaded enough anyway ;)
> > >
> > > Is there an effective way to shorten the replication lag time ?
> >
> >
> > If it's because of inactivity, then you can decrease your
> > sync_interval_timeout, however the real effect of this is to just make the
> > lag time look better. (There's an edge case here involving sequences)
> >
> > If you have plenty of activity then you _might_ consider lowering your
> > sync_interval (the rate at which slony checks to see if there's anything to
> > put into a new sync) on your origin. This will increase the load on your
> > origin, so you will want to do it incrementally and measure the effect.
> >
> > As your sync interval decreases and load increases, you might need to
> > start throwing hardware at it. The criteria for deciding that are the same
> > as for any other database.
> >
> > Remember that slony is an asynchronous replication system. A system
> > designed with a strong dependency on very low replication lag time suggests
> > that you may be trying to fit asynchronous replication into the role of
> > synchronous replication.
> >
> > Andrew
> >
> > _______________________________________________
> > Slony1-general mailing list
> > Slony1-general at lists.slony.info
> > http://lists.slony.info/mailman/listinfo/slony1-general
> >
> >
>