[Slony1-general] Listen path generation and bug 1485

Wed Mar 1 05:20:18 PST 2006

Christopher Browne wrote:
> I have a "followup question" concerning #1485...
> 
> When you ran into problems with your five node example shaped like...
> 
>        1  -->  10 --> 11
>            -->  20 --> 21
> 
> (e.g. - 1 feeds 10 and 20, then 10 feeds 11, 20 feeds 21, and we have
> paths exactly corresponding to 2-way communications there...)
The paths are "correct" (i.e. the same as in my configuration). The
"feeds" are not - in my case, 10 is origin of set 1, which is subscribed
by 11. Ditto for 20, set 2 and 21.

> It's not entirely clear what your path/listener configuration looked like.
> 
> From what I can see, it should look like the following:
> 
> /* cbbrowne@[local]/dba2 slonyregress1=*/ select * from
> _slony_regress1.sl_path  order by pa_server, pa_client;
>  pa_server | pa_client | pa_conninfo | pa_connretry
> -----------+-----------+-------------+--------------
>          1 |        10 |             |            
>          1 |        20 |             |            
>         10 |         1 |             |            
>         10 |        11 |             |            
>         11 |        10 |             |            
>         20 |         1 |             |            
>         20 |        21 |             |            
>         21 |        20 |             |            
> (8 rows)
yep, exactly

> 
> (Obviously, there would be more useful DSN values in a real system ;-).)
> 
> Based on the above, and no subscription information, the current,
> production listen path builder builds the following:
<snipped table>

This sl_listen seems to be ok, but I guess slony could benefit from
"circle-elimination".
  li_origin | li_receiver | li_provider
-----------+-------------+------------
         10 |          20 |          21
         11 |           1 |          20
         11 |          20 |          21
         20 |          10 |          11
         21 |           1 |          10
         21 |          10 |          11

Those sl_listen would implied event-flows like a -> b -> c -> b.

> If I define the set and subscriptions as described:
> 
> /* cbbrowne@[local]/dba2 slonyregress1=*/ select * from
> _slony_regress1.sl_set;
>  set_id | set_origin | set_locked | set_comment
> --------+------------+------------+-------------
>       1 |          1 |            |
For my case, that should be:
  set_id | set_origin | set_locked | set_comment
--------+------------+------------+-------------
       1 |         10 |            |
       2 |         20 |            |

> /* cbbrowne@[local]/dba2 slonyregress1=*/ select * from
> _slony_regress1.sl_subscribe ;
>  sub_set | sub_provider | sub_receiver | sub_forward | sub_active
> ---------+--------------+--------------+-------------+------------
>        1 |            1 |           10 |             |
>        1 |            1 |           20 |             |
>        1 |           10 |           11 |             |
Again, this is not my case, mine would be:
  sub_set | sub_provider | sub_receiver | sub_forward | sub_active
---------+--------------+--------------+-------------+------------
        1 |           10 |           11 |             |
        2 |           20 |           21 |             |

> (3 rows)
> 
>  li_origin | li_receiver | li_provider
> -----------+-------------+-------------
>          1 |          10 |           1
>          1 |          11 |          10
>          1 |          20 |           1
>          1 |          21 |          20
>         10 |           1 |          10
>         10 |          11 |          10
>         10 |          20 |           1
>         10 |          21 |          20
>         11 |           1 |          10
>         11 |           1 |          20
>         11 |          10 |          11
>         11 |          20 |           1
>         11 |          21 |          20
>         20 |           1 |          20
>         20 |          10 |           1
>         20 |          10 |          11
>         20 |          11 |          10
>         20 |          21 |          20
>         21 |           1 |          10
>         21 |           1 |          20
>         21 |          10 |           1
>         21
For your setup, those are correct, as far as I can see (Well, it
could still benefit from "circle-elimination", similar to the
no-sets case above, but thats just an optimization).

The code works for your case, because of the dataflow of your example
(1 -> 10 -> 11, and 1 -> 20 -> 21). When deciding, for example, how
node 10 should listen for events from node 21, the subscritions (1 -> 10)
and (10 -> 11) couse RebuildListenEntriesOne to use 11 and 1 as providers.
Now, using 11 is useless, because node 11 won't get events from node 21
unless 10 already has them, but using 1 works, because 1 gets them from 20,
which in turn gets them from 21.

In my case, however, there is not (1 -> 10) and (1 -> 20) subscription,
so for the "how does node 10 get events originating from 21" case,
RebuildListenEntriesOne now _only_ chooses 11 as a provider, which, due
to the same reason as above, is a useless provider. The
difference from your case is, that it is now the _only_ provider taken
into accound by RebuildListenEntriesOne (because of the "if found then return statement").

> The difference is these two listener entries, which, analytically, it
> makes sense can be dropped without adversely affecting the cluster.  (If
> you think about it, node 21 is a *crummy* node for node 20 to use as a
> provider for events from node 10, and ditto for 21 as a provider to 20
> for events from node 11.)
> 
>  li_origin | li_provider | li_receiver
> -----------+-------------+-------------
>         10 |          21 |          20
>         11 |          21 |          20
> (2 rows)
> 
> Would it be possible for you to provide a more exact list of what the
> (apparently wrong) listen network looked like?  I don't see the existing
> calculation breaking down, in this case.
Since I patches RebuildListenEntriesOne on all nodes of my cluster, I can't
easily regenerate what vanilla slony was doing. But I hope that I explained
the problem (or, rather, what I perceive to be the problem) clearly enough
in the text above.

> By the way, the algorithm that you propose seems only to differ in one
> material way from the existing one:  The existing one will prefer more
> direct listen paths based on sl_subscribe to those of "lesser" status. 
> That is what led to the 2 row difference above.  That difference
> shouldn't cause the problem you observed, in that all it does is to
> eliminate some paths that wouldn't have been useful anyways.
No, I don't think so. My suggested algorith was the following:

"Telling node X via sl_listen to ask neighbour-nodes
(Those for which a sl_path entry exists) for events from all other nodes,
apart from those for which the events must have travelled via node X to
reach the neighbour of X in question. aths that wouldn't have been useful anyways."

A more formal description, assume that the slony cluster forms a directed graph
where each node represents a vertex, and each sl_path entries represents a (directred)
edge from pa_server to pa_client.
1) Generate all possible traversals of the graph, in the following form:
    [origin, node1, node2, ... provider, receiver]
    (There must be edges origin->node1, node2->node2, ... nodeN->provider, provider->receiver
    in the graph).
2) Remove all traversals that include circles, or in other words where a
    node is traversed more than once (e.g a->b->c->b->d).
3) Group those traversals by (origin, provider, receiver)

According to someone here on this list, this will have to be refined a bit,
because the events from a set-origin to the subscribers must travel by the same
way as the data does. This amounts to adding the following step to the algorithm
stated above

4) Remove all traversals (origin,provider,receiver) where receiver subscribes a set
    originating from origin, but where provider is not the set-provider for receiver.

The code I wrote, and that is included in the bugreport, was an attempt to implement
that algorithm. The major problem is that the only "easy" way to do this, given the current
structure of slonys system-tables, is an iterative approch, where the _already_grouped_
traversals are build by extending them by one node in every step. This already-grouped
representation, however, lacks the information needed to correctly satisfy the "circle-freeness"
requirement. (Because you don't know exactly which nodes an event traverses when travelling
from origin via provider to receiver).

Maybe I working compromise would be to remove this "circle-freeness" requirement - I don't know
how much of an performance impact a few unnecessary sl_listen entries generate.

Another solution would be to add a table sl_listen_raw, with the fields (origin, way, receiver),
and _no_ primary key. Using this table, coding the algorithm above would be trivial, and sl_listen
could afterwards be created from sl_listen_raw in just one step (Using a group operation)

I'd prefer the second solution, and would be willing to code it - but I guess adding a table
to a stable release is quite a showstopper for you...

greetings, Florian Pflug

-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/x-pkcs7-signature
Size: 4116 bytes
Desc: S/MIME Cryptographic Signature
Url : http://gborg.postgresql.org/pipermail/slony1-general/attachments/20060301/6f7f1b3a/smime.bin