[Slony1-general] I'm using Slony-I 1.2.20. Should I upgrade to 1.2.21 or 2.0.5?

Thu Oct 7 11:48:53 PDT 2010

Aleksey Tsalolikhin <atsaloli.tech at gmail.com> writes:
> Hi.  We're using 1.2.20 right now.  It's not the latest version either
> way you look at it.  Should we upgrade to 1.2.21 or 2.0.5?
>
> We have 2 replication sets now: the whole shebang, and selected tables
> and sequences.
>
> It's working really well.
>
> Our only pain points are listed below.  Would upgrading to 2.0.5
> help with these?  Or is there some other compelling reason to
> move from the 1.x branch to 2.x?  So far the only difference I was
> able to fathom is that replicating sequences is much more efficient
> in 2.x.  Is there anything else?
>
> Our pain points are:
>
> -  the database runs a bit slower when replication is enabled.
>
>    Slony seems to have quite a bit of overhead:
>
>    I see it doing about 890 inserts per second, and the same again for
>    deletes.  That's 1780 SQL ops per second.
>
>    By comparison, updates (and slony does not use updates AFAIK so this is
>    my application's queries) run about 20 updates per second.
>
>    I don't yet understand why we have a 1:89 ratio between updates by the
>    application and Slony inserts/deletes.  I don't know Slony internals
>    well enough yet.  But would going to 2.x change that ratio?
>    We do replicate sequences.

BTW, Slony *does* do UPDATEs; pulling the little bit from
src/slon/remote_worker.c:

	/*
	 * Add the actual replicating command to the line buffer
	 */
	line->line_largemem += largemem;
	switch (*log_cmdtype)
	{
		case 'I':
			slon_appendquery(&(line->data),
							 "insert into %s %s;\n",
							 wd->tab_fqname[log_tableid],
							 log_cmddata);
			num_inserts ++;
			break;
		case 'U':
			slon_appendquery(&(line->data),
							 "update only %s set %s;\n",
							 wd->tab_fqname[log_tableid],
							 log_cmddata);
			num_updates ++;
			break;
		case 'D':
			slon_appendquery(&(line->data),
						   "delete from only %s where %s;\n",
							 wd->tab_fqname[log_tableid],
							 log_cmddata);
			num_deletes ++;
			break;
	}

The 'U' bit in the middle does updates :-).

Do you have a lot of sequences?  

In 1.2, every sequence gets its value propagated every time a SYNC is
processed, which means that if you have a bunch that are seldom touched,
they still lead to quite a few updates travelling around even though
nothing has changed.

In 2.0, the sequence handling is rather lazier; if the value hasn't
changed since the last SYNC, the slon doesn't bother propagating
anything for that sequence.

If you've got 80 sequences, that's 80-ish updates per SYNC that 1.2
sends and 2.0 skips whenever the values are unchanged, which could be a
big chunk of what you're seeing.
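
To make the difference concrete, here is a minimal sketch of the two
propagation policies.  This is NOT Slony-I's actual code: the seq_state
struct and both function names are hypothetical, made up purely for
illustration, and all they do is count how many sequence values would
be sent for one SYNC.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-sequence bookkeeping; not a Slony-I structure. */
typedef struct
{
	int		seq_id;
	int64_t	last_sent;	/* value propagated at the previous SYNC */
	int64_t	current;	/* value read for this SYNC */
} seq_state;

/* 1.2-style policy: every sequence is propagated on every SYNC. */
size_t
propagate_eager(seq_state *seqs, size_t n)
{
	size_t	i;

	for (i = 0; i < n; i++)
		seqs[i].last_sent = seqs[i].current;
	return n;
}

/* 2.0-style policy: skip sequences whose value has not changed. */
size_t
propagate_lazy(seq_state *seqs, size_t n)
{
	size_t	i;
	size_t	sent = 0;

	for (i = 0; i < n; i++)
	{
		if (seqs[i].current != seqs[i].last_sent)
		{
			seqs[i].last_sent = seqs[i].current;
			sent++;
		}
	}
	return sent;
}
```

Called on 80 sequences that nobody has touched since the last SYNC,
propagate_eager() reports 80 values sent and propagate_lazy() reports 0.
How that translates into operations per second depends on how often
SYNCs are generated, which isn't stated in this thread.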

There are also significant changes in 2.0 in how sl_log_1/2 are trimmed,
which should be helpful.  But I don't expect that's what you're
observing.

> - sometimes after the slave goes offline, or there is a WAN problem,
>   the check_postgres.pl Nagios plug-in (check_slony) reports the slave
>   as lagged, even after the slave and master are both online and can
>   talk to each other -- restarting the slon master and slave resolves this.
>   Otherwise the replication lag time does not decrement / catch up, but
>   falls further behind with the clock.
>
> So do we have a good reason to go to 2.x?  Or should we let it shake out
> a bit longer?

Well, we've had to keep releasing 2.0.x versions, which hasn't been a
good thing.  But 2.0.5 comes out of having shaken things pretty hard,
which led to well over 20 bugs being fixed.  And Steve & Jan have both
been running a lot of tests *after* all the relevant patches were
committed; it has been two weeks since the last bug fix went in.

This *should* imply pretty good things for 2.0.5.
-- 
output = reverse("ofni.sailifa.ac" "@" "enworbbc")
Christopher Browne
"Bother,"  said Pooh,  "Eeyore, ready  two photon  torpedoes  and lock
phasers on the Heffalump, Piglet, meet me in transporter room three"