[Slony1-general] Heuristic sl_listen generation

Thu Nov 4 14:57:19 PST 2004

I would think that the different listen paths distribute the data better for this node configuration.  As an example, suppose node 1 suffers a critical failure.  Node 2 (or node 3) can be promoted, and node 5 will keep cascading because its listen path is set to receive data from node 2 through node 3.  Is that correct?

The only one that seems off is the following:

store listen (origin = 5, receiver = 4, provider = 3);  # 2 is 'optimal'

I think this path should have kept node 2 as the provider.  (My script currently generates these listen paths as well.)

I'm in the process of implementing a utility based on the altperl scripts to centralize their functionality for our operations group.  The tool initializes the cluster from a configuration file (for the initial Slony rollout), adds and removes nodes, sets, sequences and subscriptions, and performs failover / switchover.  I've only implemented about 15% of these features, but hearing about any additional "gotchas" others have run into (in terms of node configurations) would go a long way towards preventing replication problems.

How would you recommend working out the listen paths when adding a node to an existing configuration?  Query the sl_listen table for the current paths?

Thanks,
Sean Kirkpatrick

-----Original Message-----
From: slony1-general-bounces at gborg.postgresql.org
[mailto:slony1-general-bounces at gborg.postgresql.org]On Behalf Of
Christopher Browne
Sent: Wednesday, November 03, 2004 1:32 PM
To: Slony Mailing List
Subject: [Slony1-general] Heuristic sl_listen generation

Below is a CVS HEAD patch "candidate" (I expect it to change before
being applied) that uses a heuristic to generate sl_listen entries
automatically rather than requiring them to be statically generated
(as in the "altperl" script, or by hand).

The approach is thus...

 - Each time we add a node subscription, we calculate two sets of
   'listens':

   1.  There's a direct connection between the new node and its parent
       that should override any other paths between those nodes;

   2.  For every other pair of nodes (a,b), where b<>a and
       it's not the connection from #1, we point (a,b) to go through
       the "parent."

 - When we do a MOVE SET, we run the
   GenerateListensOnSubscribe(provider,receiver) function against each
   of the subscribers in the subscription set to revise _all_ of the
   paths.
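
A minimal sketch of the approach described above, written as a small Python model rather than the PL/pgSQL in the patch.  It is one reading of the two rules, not a transcription of the function: the dictionary `listens` and the helper `generate_listens_on_subscribe` are hypothetical names, and the subscription order is assumed to follow the five-node example later in this message.

```python
# Model of the heuristic as described in the list above.
# listens maps (origin, receiver) -> provider.

def generate_listens_on_subscribe(listens, nodes, provider, receiver):
    """Revise listen entries when `receiver` subscribes to `provider` (its parent)."""
    for other in nodes:
        if other == receiver:
            continue
        # The new node hears every other origin through its parent;
        # for other == provider this is the direct parent link (rule 1).
        listens[(other, receiver)] = provider
        # Every other node hears the new node's events through the parent
        # (rule 2), except the parent itself, which listens to it directly.
        listens[(receiver, other)] = receiver if other == provider else provider
    return listens


nodes = [1, 2, 3, 4, 5]
# (provider, receiver) pairs, assumed to be subscribed in this order.
subscriptions = [(1, 2), (1, 3), (2, 4), (3, 5)]

listens = {}
for provider, receiver in subscriptions:
    generate_listens_on_subscribe(listens, nodes, provider, receiver)

for (origin, receiver), provider in sorted(listens.items()):
    print(f"store listen (origin = {origin}, receiver = {receiver}, "
          f"provider = {provider});")
```

Because later subscriptions overwrite earlier entries for the same (origin, receiver) pair, the result depends on the order in which nodes are subscribed.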

I think I've placed it correctly in subscribeSet(); it may be that
it needs to go somewhere else in moveSet().

There's a bit of a difference between what this generates and an
"optimal" set...  Consider the following set of nodes (in "altperl"
form):

  add_node(host => 'h1', dbname=>'oxrslive', port=>5432,
           user=>'postgres', node=>1);

  add_node(host => 'h2', dbname=>'oxrslive', port=>5432,
           user=>'postgres', node=>2, parent => 1);

  add_node(host => 'h3', dbname=>'oxrslive', port=>5432,
           user=>'postgres', node=>3, parent => 1);

  add_node(host => 'h4', dbname=>'oxrslive', port=>5432,
           user=>'postgres', node=>4, parent => 2);

  add_node(host => 'h5', dbname=>'oxrslive', port=>5432,
           user=>'postgres', node=>5, parent => 3);

The "optimal" network that gets generated by init_cluster.pl for this
is the following set of listens:

      store listen (origin = 1, receiver = 2, provider = 1);
      store listen (origin = 1, receiver = 3, provider = 1);
      store listen (origin = 1, receiver = 4, provider = 2);
      store listen (origin = 1, receiver = 5, provider = 3);
      store listen (origin = 2, receiver = 1, provider = 2);
      store listen (origin = 2, receiver = 3, provider = 1);
      store listen (origin = 2, receiver = 4, provider = 2);
      store listen (origin = 2, receiver = 5, provider = 3);
      store listen (origin = 3, receiver = 1, provider = 3);
      store listen (origin = 3, receiver = 2, provider = 1);
      store listen (origin = 3, receiver = 4, provider = 2);
      store listen (origin = 3, receiver = 5, provider = 3);
      store listen (origin = 4, receiver = 1, provider = 2);
      store listen (origin = 4, receiver = 2, provider = 4);
      store listen (origin = 4, receiver = 3, provider = 1);
      store listen (origin = 4, receiver = 5, provider = 3);
      store listen (origin = 5, receiver = 1, provider = 3);
      store listen (origin = 5, receiver = 2, provider = 1);
      store listen (origin = 5, receiver = 3, provider = 5);
      store listen (origin = 5, receiver = 4, provider = 2);
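
The list above is consistent with a simple rule: each receiver listens for an origin's events on its neighbour in the parent tree that lies on the path toward that origin.  The sketch below applies that rule with a breadth-first walk from each origin.  It is an assumption that init_cluster.pl works this way; the code was not derived from that script, and `tree_listens` is a hypothetical name.

```python
from collections import deque

# parent[node] as given by the add_node(...) calls; node 1 is the root.
parent = {2: 1, 3: 1, 4: 2, 5: 3}
nodes = [1, 2, 3, 4, 5]

# Treat each parent/child link as a two-way connection.
neighbours = {n: set() for n in nodes}
for child, par in parent.items():
    neighbours[child].add(par)
    neighbours[par].add(child)


def tree_listens(nodes, neighbours):
    """Return {(origin, receiver): provider} using next-hop-toward-origin."""
    listens = {}
    for origin in nodes:
        # Walk outward from the origin; the node from which a receiver is
        # first reached becomes that receiver's provider for this origin.
        queue = deque([origin])
        seen = {origin}
        while queue:
            current = queue.popleft()
            for nxt in sorted(neighbours[current]):
                if nxt not in seen:
                    seen.add(nxt)
                    listens[(origin, nxt)] = current
                    queue.append(nxt)
    return listens


optimal = tree_listens(nodes, neighbours)
for (origin, receiver), provider in sorted(optimal.items()):
    print(f"store listen (origin = {origin}, receiver = {receiver}, "
          f"provider = {provider});")
```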

The heuristic provides _nearly_ the same result; only three of the
listen paths differ:

      store listen (origin = 5, receiver = 4, provider = 3);  # 2 is 'optimal'
      store listen (origin = 5, receiver = 2, provider = 3);  # 1 is 'optimal'
      store listen (origin = 4, receiver = 3, provider = 2);  # 1 is 'optimal'

I don't _think_ that these slightly different paths present a problem,
as events are propagating to those providers already as a result of
the other paths.  Or am I off base?
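
One way to test that claim is to follow the providers for each (origin, receiver) pair and confirm that the chain ends at the origin without looping.  The sketch below does this for the heuristic's table, meaning the "optimal" list with the three substitutions above.  It checks only the listen chain; it assumes, without verifying, that each receiver has a usable connection path to its provider.  `reaches_origin` is a hypothetical helper.

```python
# Heuristic result: {(origin, receiver): provider}.
heuristic = {
    (1, 2): 1, (1, 3): 1, (1, 4): 2, (1, 5): 3,
    (2, 1): 2, (2, 3): 1, (2, 4): 2, (2, 5): 3,
    (3, 1): 3, (3, 2): 1, (3, 4): 2, (3, 5): 3,
    (4, 1): 2, (4, 2): 4, (4, 3): 2, (4, 5): 3,
    (5, 1): 3, (5, 2): 3, (5, 3): 5, (5, 4): 3,
}


def reaches_origin(listens, origin, receiver):
    """Follow providers from receiver toward origin; False on a loop or gap."""
    current = receiver
    visited = set()
    while current != origin:
        if current in visited or (origin, current) not in listens:
            return False
        visited.add(current)
        current = listens[(origin, current)]
    return True


all_ok = all(reaches_origin(heuristic, origin, receiver)
             for (origin, receiver) in heuristic)
print("every receiver traces back to its origin:", all_ok)
```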

Comments welcome...

Index: slony1_funcs.sql
===================================================================
RCS file: /usr/local/cvsroot/slony1/slony1-engine/src/backend/slony1_funcs.sql,v
retrieving revision 1.35
diff -c -r1.35 slony1_funcs.sql
*** slony1_funcs.sql	19 Oct 2004 01:16:06 -0000	1.35
--- slony1_funcs.sql	3 Nov 2004 16:48:43 -0000
***************
*** 1755,1760 ****
--- 1755,1767 ----
  	perform @NAMESPACE@.moveSet_int(p_set_id, v_local_node_id,
  			p_new_origin);

+ 	for v_sub_row in select sub_provider, sub_receiver
+ 			from @NAMESPACE@.sl_subscribe
+ 			where sub_set = p_set_id
+ 	loop
+ 		perform @NAMESPACE@.GenerateListensOnSubscribe(v_sub_row.sub_provider, v_sub_row.sub_receiver);
+ 	end loop;
+ 
  	-- ----
  	-- At this time we hold access exclusive locks for every table
  	-- in the set. But we did move the set to the new origin, so the
***************
*** 3615,3620 ****
--- 3622,3632 ----
  			p_sub_receiver, p_sub_forward);

  	-- ----
+ 	-- Submit listen management events
+ 	-- ----
+ 	perform @NAMESPACE@.GenerateListensOnSubscribe(p_sub_provider, p_sub_receiver);
+ 
+ 	-- ----
  	-- Create the SUBSCRIBE_SET event
  	-- ----
  	return  @NAMESPACE@.createEvent(''_@CLUSTERNAME@'', ''SUBSCRIBE_SET'',
***************
*** 4436,4441 ****
--- 4448,4519 ----
  the creation of the serial column. The return an attkind according to
  that.';

+ 
+ -- ----------------------------------------------------------------------
+ -- FUNCTION GenerateListensOnSubscribe (provider, receiver)
+ --
+ --	Revises sl_listen rules when a new subscription is introduced
+ -- ----------------------------------------------------------------------
+ create or replace function @NAMESPACE@.GenerateListensOnSubscribe(int4,int4)
+ returns int
+ as '
+ declare
+ 	p_provider	alias for $1;
+ 	p_receiver	alias for $2;
+ 	v_row			record;
+ 	v_row2			record;
+ 	v_row3			record;
+ 	v_origin		int4;
+ 	v_receiver		int4;
+ 
+ begin
+ 	-- 1.  Drop out listens between this node and its parent other
+ 	--     than via it...
+ 	for v_row in select li_provider from @NAMESPACE@.sl_listen
+ 			where li_origin = p_provider and
+ 			      li_receiver = p_provider and
+ 			      li_provider <> p_receiver
+ 	loop
+ 		perform @NAMESPACE@.droplisten(p_receiver, v_row.li_provider, p_receiver);
+ 	end loop;
+ 			
+ 	-- 2.  Add in the listener pointing this node to its parent
+ 	perform @NAMESPACE@.storelisten(p_receiver, p_receiver, p_provider);
+ 
+ 	-- 3.  Replace other listens...
+ 
+ 	for v_row in select no_id from @NAMESPACE@.sl_node
+ 	loop
+ 		v_origin := v_row.no_id;
+ 		for v_row2 in select no_id from @NAMESPACE@.sl_node
+ 			where no_id <> v_origin 
+ 		loop
+ 			v_receiver := v_row2.no_id;
+ 			if (v_origin <> p_receiver) and (v_receiver <> p_receiver) then
+ 				-- Do nothing; no need to add entries
+ 			else
+ 				for v_row3 in select li_provider from @NAMESPACE@.sl_listen
+ 					where li_origin = v_origin and
+ 					      li_receiver = v_receiver and
+ 					      li_provider <> p_provider
+ 				loop
+ 					perform @NAMESPACE@.droplisten(v_origin,
+ 						v_row3.li_provider, v_receiver); 
+ 				end loop;
+ 				perform @NAMESPACE@.storelisten(v_origin, p_provider, v_receiver);
+ 			end if;
+ 		end loop;
+ 	end loop;
+ end;
+ ' language plpgsql;
+ 
+ comment on function @NAMESPACE@.GenerateListensOnSubscribe(int4,int4) is
+ 'GenerateListensOnSubscribe(p_provider, p_receiver)
+ 
+ Invoked by subscribeSet() and moveSet(), this revises the sl_listen
+ entries, adding in those entries required to allow communications
+ between other nodes and the receiver node that is being subscribed.';
+ 
  -- ----------------------------------------------------------------------
  -- FUNCTION tableHasSerialKey (tab_fqname)
  --

-- 
"cbbrowne","@","ca.afilias.info"
<http://dev6.int.libertyrms.com/>
Christopher Browne
(416) 673-4124 (land)