Christopher Browne cbbrowne
Wed May 31 09:26:29 PDT 2006
Rod Taylor wrote:
> Node 4 subscribes to Node 1, but Node 1 will sometimes get hundreds of
> connections to Node 4 all listening for NOTIFY events. Obviously it
> cannot recover from this (pg_listener activity is too significant).
The next time you run into this, can you check what they are
listening for?

select * from pg_listener;  will answer that.

Typical contents are like...

oxrsorg=# select * from pg_listener order by listenerpid, relname;
     relname      | listenerpid | notification
------------------+-------------+--------------
 _oxrsorg_Confirm |       33628 |            0
 _oxrsorg_Event   |       33628 |            0
 _oxrsorg_Confirm |      213248 |            0
 _oxrsorg_Event   |      213248 |            0
 _oxrsorg_Confirm |      217112 |            0
 _oxrsorg_Event   |      217112 |            0
 _oxrsorg_Confirm |      254016 |            0
 _oxrsorg_Event   |      254016 |            0
 _oxrsorg_Confirm |      295822 |            0
 _oxrsorg_Event   |      295822 |            0
 _oxrsorg_Event   |      319642 |            0
 _oxrsorg_Restart |      319642 |            0
 _oxrsorg_Confirm |      402010 |            0
 _oxrsorg_Event   |      402010 |            0
(14 rows)

You see three relname values there:

_oxrsorg_Restart

This one is held by the connection used by the local worker
thread.  This entry was used as the "interlock" to prevent two
slons from managing the same node at the same time.  It doesn't get
"abused" particularly much...
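As a rough sketch of how a LISTEN/NOTIFY interlock of this kind works
(the channel name is taken from the cluster above; the exact handshake
slon performs may differ from this):

  -- Hedged sketch; not the literal slon code.
  LISTEN "_oxrsorg_Restart";   -- the worker thread registers interest;
                               -- this is the pg_listener row shown above
  NOTIFY "_oxrsorg_Restart";   -- a newly started slon pokes the channel so any
                               -- previously running worker can notice it and step aside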

_oxrsorg_Confirm

There should be one of these for each remote node that connects.  It
gets notified each time there's a confirmation.

In 1.2, this goes away entirely; there is ZERO need to handle
confirmations via notifications; we just process confirmations each time
we go through the event loop.
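For illustration only (the table and column names below are from memory
and may not match the actual Slony-I schema exactly), the pre-1.2 flow
amounts to roughly this:

  -- Record a confirmation, then wake every connection listening on the
  -- Confirm channel.  (Schema details illustrative.)
  INSERT INTO "_oxrsorg".sl_confirm (con_origin, con_received, con_seqno, con_timestamp)
       VALUES (1, 4, 12345, now());   -- node 4 confirms an event from node 1
  NOTIFY "_oxrsorg_Confirm";          -- each pg_listener row for _oxrsorg_Confirm wakes up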

_oxrsorg_Event

Again, there's one for each remote node that connects.  Each time there
is an event, this gets notified.

In 1.2, any time an event gets processed, the remote listener goes into
a polling mode for a while, which eliminates its use of pg_listener.
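Roughly, and with illustrative names for the event table, that is the
difference between:

  -- Notification mode: one pg_listener row per remote node, as in the listing above.
  LISTEN "_oxrsorg_Event";
  -- ...block until notified, then fetch the new events...

  -- Polling mode (the 1.2 behaviour described above): no pg_listener row at all;
  -- the remote listener simply re-queries on a timer.  (Names illustrative.)
  SELECT * FROM "_oxrsorg".sl_event
   WHERE ev_seqno > 12345        -- last event sequence number already processed
   ORDER BY ev_seqno;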

I *think* that 1.2 should alleviate your scenario; pg_listener shouldn't
grow large, so there should be no reason for a slon to decide that a
remote listener has died and start another one.

I *think* that's the case...  That looks like the way the slon for your
node 1 could "go mad" and open too many connections to node 4.
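If you want to confirm that from the node 4 side, a grouped count over
pg_listener (just a diagnostic query, nothing Slony-specific) will show
whether one channel has piled up hundreds of listeners:

  SELECT relname, count(*) AS listeners
    FROM pg_listener
   GROUP BY relname
   ORDER BY listeners DESC;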

Obvious point:  1.2 needs to get released :-).


