[Slony1-general] Slony lag times

Thu Aug 2 13:12:57 PDT 2007

Andrew Hammond wrote:
> As your sync interval decreases and load increases, you might need to 
> start throwing hardware at it. The criteria for deciding that are the 
> same as for any other database.
>
> Remember that slony is an asynchronous replication system. A system 
> designed with a strong dependency on very low replication lag time 
> suggests that you may be trying to fit asynchronous replication into 
> the role of synchronous replication.
[Aside: I agreed with the other comments you made, so they need no 
further comment from me.]

It seems like a misdirection, to me, to point at 
asynchronous-versus-synchronous as being any kind of "approach."

It is certainly fair to say that Slony-I was designed to be 
asynchronous, and that this has the *necessary implication* that, under 
heavy load, other nodes will fall behind.

The alternative, however, isn't much of an alternative, from a number of 
perspectives:

1.  There isn't any synchronous replication system for PostgreSQL to 
correspond with Slony-I.  So part of what you're saying is, in effect, 
"You should get a synchronous replication system.  Oops!  There isn't 
one!  I keed, I keed!"  (The latter in the voice of Triumph the Insult 
Comic Dog...)

2.  If there were a synchronous replication system, it wouldn't change 
the answers in any particularly useful fashion.

What you'd discover is that, under load, the need to replicate 
synchronously would lead to the *master* node, itself, trying to fall 
behind, which would express itself as the master node getting more and 
more sluggish so as to prevent replication from falling behind.

On 8/2/07, *Dmitry Koterov* <dmitry at koterov.ru> wrote:

    Aagrh, everybody says these words - "synchronous replication", but
    these words are useless in practice. I suppose there is NO good
    solution of Postgres synchronous replication for REALLY heavy-loaded
    systems, and saying "you possibly need a synchronous replication" is
    equal to saying "you must be rich and almighty".

Actually, the point runs deeper, and "synchronous replication" is 
distracting you from the real problem.

The real problem is that you have GOT to accept trade-offs.  Trade-offs 
are a given, a necessity.

Synchronous replication wouldn't eliminate the problem - it would merely 
change the way it manifests.

If you haven't got powerful enough hardware to cope with the combined 
load of:
a) Update load
b) Query load
c) Load induced by replication work
then your overall system is certain to fail, in some fashion.

If using an asynchronous replication system, we well know that the 
failure that manifests is that replication falls behind on subscriber nodes.

If there were a synchronous replication system, well, synchronicity would 
prevent the "slaves" from falling behind; instead, if the system were 
overloaded by the workload, you'd find that the *master* node would 
start choking on its load.  Users would start seeing phenomena such as 
queries running unacceptably long and locks building up.
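
To make the trade-off concrete, here is a toy sketch in Python.  It is 
not Slony-I code and does not model any real replication system; the 
function name simulate(), the tick-based model, and the numbers are all 
assumptions made up for illustration.  It only shows where the 
unprocessed work piles up when updates arrive faster than a subscriber 
can apply them.

```python
def simulate(update_rate, apply_capacity, ticks, synchronous):
    """Toy model of an overloaded replicated system (illustration only).

    update_rate    -- updates submitted to the origin per tick (assumed)
    apply_capacity -- updates the subscriber can apply per tick (assumed)
    Returns (origin_waiting, subscriber_backlog) after `ticks` ticks.
    """
    origin_waiting = 0      # submitted by clients, not yet committed
    subscriber_backlog = 0  # committed on the origin, not yet applied

    for _ in range(ticks):
        origin_waiting += update_rate
        if synchronous:
            # A commit must wait for the subscriber, so the origin can
            # only commit as fast as the subscriber applies.
            committed = min(origin_waiting, apply_capacity)
            origin_waiting -= committed
        else:
            # The origin commits everything at once; the subscriber
            # catches up as fast as it can.
            subscriber_backlog += origin_waiting
            origin_waiting = 0
            applied = min(subscriber_backlog, apply_capacity)
            subscriber_backlog -= applied

    return origin_waiting, subscriber_backlog


if __name__ == "__main__":
    # 120 updates/tick arriving, 100 updates/tick of apply capacity.
    print("async:", simulate(120, 100, 10, synchronous=False))
    print("sync: ", simulate(120, 100, 10, synchronous=True))
```

In the asynchronous case the surplus accumulates as subscriber backlog 
(replication lag); in the synchronous case the same surplus accumulates 
as work waiting on the origin.  Either way the surplus is the same size; 
only the place it shows up differs.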

There is no "tuning fairy" to magically fix any of this; unfortunately, 
there isn't much choice, at that point, other than to hope that you've 
got a "hardware fairy," some 'magical' place where you can get more 
powerful hardware.

If more powerful hardware is not an option, I'm sorry, but we have 
nothing more to offer.  There is no magical "make it faster" switch to 
turn on.  If you can't afford the hardware that your application 
requires, then evidently you have promised more than it is possible to 
accomplish. 

Heroic efforts can occasionally cover over the need for more hardware, 
but in the long run, heroes either:
a) Burn out, or
b) Get better offers from organizations that *can* afford the hardware.