Christopher Browne cbbrowne at ca.afilias.info
Mon Dec 17 12:48:35 PST 2007
Simon Riggs <simon at 2ndquadrant.com> writes:
> On Fri, 2007-12-14 at 11:47 -0500, Christopher Browne wrote:
>> Simon Riggs <simon at 2ndquadrant.com> writes:
>
>> > I can't remember where we left the earlier discussion on the vacuuming.
>> > Did you want me to make a patch, or can I leave it with you guys? No
>> > reason to rush at all my end, just trying to remember where we were.
>> 
Actually, if you have code in mind, I'd be pleased to see a
>> patch.  It would seem a good thing to enlarge the set of people that
>> have some familiarity with the internals :-).
>
> OK. Will do early next week.

FYI, the bulk of this code has traditionally lived in
src/slon/cleanup_thread.c, and was implemented in C.  I think this
was because the XXID code was pretty "hackish," and not terribly
nicely accessible or manipulable as "native SQL."

I think that with the txid type being native in 8.3, this code might
now be able to move into a stored procedure in
src/backend/slony1_funcs.sql, where most of it should be cleaner and
*way* easier to maintain as pl/pgsql.  If we're eliminating VACUUM,
then in principle substantially all of the code ought to be able to
reside in the pl/pgsql function cleanupEvent().
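
Just as a sketch of the sort of thing that becomes expressible once
the native txid machinery is available (this is NOT the real
cleanupEvent(); "_mycluster" stands in for the cluster schema, and
ev_snapshot/log_txid are my guesses at what the tables would carry):

  create or replace function _mycluster.cleanupevent_sketch ()
  returns void as $$
  declare
      v_keep_txid  bigint;
  begin
      -- oldest transaction that any stored event snapshot still
      -- references; everything older than it is no longer needed
      select min(txid_snapshot_xmin(ev_snapshot)) into v_keep_txid
        from _mycluster.sl_event;

      -- trim log rows that predate that transaction
      delete from _mycluster.sl_log_1
       where log_txid < v_keep_txid;

      return;
  end;
  $$ language plpgsql;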

I think this warrants some more discussion of what the policy here
ought to look like.

I had a discussion this afternoon with the Data Services manager, who
brought up the point that there may be some value in consciously
keeping sl_log_* data for rather longer periods of time than is
presently the case.

Jan is doing some work on "CLONE NODE" functionality; as things stand
right now, the understanding is that it would work as a series of
activities something like this:

 - Submit a "CLONE NODE(source id=3, dest id=6)" request, indicating
   that we intend to duplicate node #3 onto node #6.

   This would start putting "placeholder data" into place for node #6,
   indicating, notably, that it will be a copy of node #3, and
   enforcing that sl_log_* data needs to be kept around.  (See the
   sketch after this list for what I imagine that might amount to.)

 - Take a dump of node #3, and start loading it into node #6.

   In Pg 8.3 + Slony-I HEAD, in the absence of the "catalog smashing"
   of earlier versions, this might be done via "pg_dump | psql", via
   PITR, or via some "disk array snapshotting" capability.  All should
   provide equally legitimate databases.

 - Once node #6 has been fully loaded, run a "MUTATE NODE" against it.

   This would cause it to rename vital internal bits to indicate that
   it's now node #6, not node #3.

   Add in a STORE PATH or two and some SUBSCRIBE SET function calls,
   and node #6 will be able to start reporting how far behind it
   *really* is, and it can join the cluster as a real node.
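
To make the "placeholder data" notion a little more concrete, I
imagine (purely illustratively; the real mechanism is Jan's to
define, and "_mycluster" is again just a stand-in for the cluster
schema) that it needn't be much more than an inactive node entry that
cleanupEvent() knows to treat as "keep the logs around":

  -- purely illustrative, not Jan's actual design
  insert into _mycluster.sl_node (no_id, no_active, no_comment)
  values (6, false, 'placeholder: clone of node #3 in progress');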

At present, the "placeholder bit" is a necessary precursor: you need
something holding the system back from purging elderly
sl_event/sl_confirm/sl_log_* data; otherwise, the default behaviour
will tend to trim it out as quickly as 10 minutes after things have
propagated.  (There's a '10 minute' interval in the function
cleanupEvent() in src/backend/slony1_funcs.sql.)
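
For reference, what that hard-wired behaviour amounts to is roughly
this shape (grossly simplified; the real code does rather more
bookkeeping than this, and "_mycluster" stands in for the cluster
schema):

  -- simplified: throw away confirmations older than the fixed window
  delete from _mycluster.sl_confirm
   where con_timestamp < (current_timestamp - interval '10 minutes');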

It could be that we should plan for a much longer lifecycle for that
data, keeping something like two days' worth of old data, so that we
could take a nightly pg_dump and choose to use it as much as a day
later to populate a new node.

The right answer may be for the interval (currently 10 minutes) to be
a parameter passed by slon to cleanupEvent.
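
If we go that way, the change looks pretty mechanical; a sketch,
assuming we simply thread an interval parameter through (the real
function would, of course, keep the rest of its logic):

  create or replace function _mycluster.cleanupevent (p_interval interval)
  returns void as $$
  begin
      -- same trimming as before, but the retention window now comes
      -- from whatever slon passes in
      delete from _mycluster.sl_confirm
       where con_timestamp < (current_timestamp - p_interval);

      -- ... the rest of the existing cleanup logic ...
      return;
  end;
  $$ language plpgsql;

slon would then call something like

  select _mycluster.cleanupevent('10 minutes'::interval);

by default, or '2 days' for the sort of retention discussed above.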
-- 
output = ("cbbrowne" "@" "linuxdatabases.info")
http://linuxdatabases.info/info/linuxxian.html
"Armenians   and   Azerbaijanis  in   Stepanakert,   capital  of   the
Nagorno-Karabakh autonomous  region, rioted over  much needed spelling
reform in the Soviet Union."  -- Rolling Stone's '88 Year in Review

