Christopher Browne cbbrowne
Thu Oct 14 23:18:10 PDT 2004

David Parker wrote:
| Given 2 nodes A and B in a slony replication cluster, with A as the
| provider for a single set, if at some point I need to failover to B,
| what are my options for eventually returning A to provider status? Once
| we've failed over to B, we eventually want to get A back to being the
| main guy.

Are you thinking of "FAILOVER", or just moving the set's "master" location?

The Slonik FAILOVER command essentially discards the failed node, at
which point the only way to add it back involves rebuilding it from scratch.
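
For what it's worth, the failover path looks roughly like this in slonik;
the cluster name, conninfo strings, and node numbers (1 = the failed node
A, 2 = the surviving node B) are made up for illustration:

cluster name = mycluster;
node 1 admin conninfo = 'dbname=app host=hostA user=slony';
node 2 admin conninfo = 'dbname=app host=hostB user=slony';

# Promote node 2; node 1 is presumed dead, and whatever it had not yet
# propagated is abandoned.
failover (id = 1, backup node = 2);

# The failed node is then dropped from the configuration; to rejoin, it
# has to be rebuilt and re-subscribed from scratch.
drop node (id = 1, event node = 2);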

If you can instead use MOVE SET to move the "master" provider to the new
node, that is a much more benign event.  The "former master" can
continue to participate in replication, and that is very much preferable.

| It looks like the first step is to set up a fresh schema in A, and
| subscribe it to B. I thought at first I would want to populate it first
| with a pg_dump from B, but I see that the ENABLE_SUBSCRIPTION logic
| truncates the target table before copying to it.

Slony-I does this itself so that it can be certain that nothing gets in
edgewise to corrupt the data.

| Will simply subscribing the new A to B result in all existing B data
| getting copied over? If so, how does this compare performance-wise with
| using pg_dump? (I realize this comparison is apples vs. oranges, and
| really a comparison between table copy and pg_dump, hence not
| slony-specific, but if anybody has any experience with this I'd
| appreciate hearing about it).

The "delete from table; copy table from stdin;" approach is definitely
not nearly as fast as using pg_dump.  It is conspicuously slower because
this approach involves populating the indices all in parallel, as
entries are added to the table, which is _way_ slower than creating them
~ at the end, in one fell swoop.
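
In plain SQL terms, the difference is roughly the following, using a
made-up table "accounts" and index "accounts_name_idx":

-- What the subscription copy amounts to: the index already exists, so
-- every row loaded has to update it as it arrives.
delete from accounts;
copy accounts from '/tmp/accounts.dat';

-- What a pg_dump restore amounts to: load the data with no index in
-- place, then build the index once, at the end.
copy accounts from '/tmp/accounts.dat';
create index accounts_name_idx on accounts (name);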

| Assuming I'm able to fully load A with data from B, is it possible to
| switch them such that A becomes the provider and B goes back to receive
| mode? This is complicated if updates are continuing to B, of course, but
| I wonder if it is possible at all.

Part of the point of the "complications" in Slony-I is to allow you to
do exactly what you are talking about.

The Slonik command MOVE SET is intended to do exactly what you are
asking about.  The nodes need to be locked, for a little while, in order
to allow A to catch up with B; once A has caught up, it can take over as
"master", and B will become the "slave."

We have done this on one of our production systems: we found a problem
with the "master" server, and used MOVE SET to switch over to a
secondary server so that the former "main" node could be taken out of
service for maintenance.  We haven't switched back, but we could do so
with another MOVE SET operation.

| Finally, what query would I execute against 2 nodes to know if their
| data is in sync (at that moment)?

Some of our systems are sufficiently busy that the result of such a
query wouldn't be accurate long enough for someone to be able to read it
:-).

I have set up some replication tests to see how close nodes are; they
are essentially based on examining an application table and checking the
latest activity from the _application's_ perspective.  Since my
application differs from yours, my queries would be useless to you :-(.

There's a new view in Slony-I 1.0.3 called sl_status that may be of some
value; its definition follows:

-- ----------------------------------------------------------------------
-- VIEW sl_status
--
--	This view shows the local node's last event sequence number
--	and how far all remote nodes have processed events.
-- ----------------------------------------------------------------------
create or replace view @NAMESPACE@.sl_status as select
	E.ev_origin as st_origin,
	C.con_received as st_received,
	E.ev_seqno as st_last_event,
	E.ev_timestamp as st_last_event_ts,
	C.con_seqno as st_last_received,
	C.con_timestamp as st_last_received_ts,
	CE.ev_timestamp as st_last_received_event_ts,
	E.ev_seqno - C.con_seqno as st_lag_num_events,
	current_timestamp - CE.ev_timestamp as st_lag_time
	from @NAMESPACE@.sl_event E, @NAMESPACE@.sl_confirm C,
		@NAMESPACE@.sl_event CE
	where E.ev_origin = C.con_origin
	and CE.ev_origin = E.ev_origin
	and CE.ev_seqno = C.con_seqno
	and (E.ev_origin, E.ev_seqno) in
		(select ev_origin, max(ev_seqno)
			from @NAMESPACE@.sl_event
			where ev_origin = @NAMESPACE@.getLocalNodeId('_@CLUSTERNAME@')
			group by 1
		)
	and (C.con_origin, C.con_received, C.con_seqno) in
		(select con_origin, con_received, max(con_seqno)
			from @NAMESPACE@.sl_confirm
			where con_origin = @NAMESPACE@.getLocalNodeId('_@CLUSTERNAME@')
			group by 1, 2
		);
comment on view @NAMESPACE@.sl_status is 'View showing how far behind
remote nodes are.
';
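
Based on that view, a query along these lines, run on the origin node,
shows how far behind each subscriber is, in events and in elapsed time
(assuming a cluster named "mycluster", so the schema is "_mycluster"):

select st_received, st_lag_num_events, st_lag_time
	from "_mycluster".sl_status;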