Fri Dec 23 16:05:39 PST 2005
- Previous message: [Slony1-general] Bug in RebuildListenEntries (ID: 1485)
- Next message: [Slony1-general] Bug in RebuildListenEntries (ID: 1485)
On 12/23/2005 4:57 AM, Florian G. Pflug wrote:
> Jan Wieck wrote:
>> On 12/21/2005 8:18 PM, Florian G. Pflug wrote:
>>
>>> Florian G. Pflug wrote:
>>> <snipped my own mail>
>>>
>>> Can anyone confirm that this is actually a bug? I'm pretty sure
>>> it is (I did multiple setups of my cluster, and the problem
>>> persisted - I used the altperl scripts for setting up the cluster,
>>> so I see no way I could have caused this).
>>>
>>> If it's really a bug, it would at least be worth a note in
>>> the docs or in the 1.1.5 release notes - it took me hours to
>>> nail down the problem, and it wasn't fun, so preventing
>>> others from having to do the same would be a good thing.
>>
>> Rebuild listen entries is indeed broken. This is a show stopper for
>> 1.1.5 ... I am working on it.
>
> Is there a reason for not generating all "sensible" sl_listen entries?
> I didn't find any documentation on the performance overhead an
> sl_listen entry causes.

The "sensible" part is exactly what matters. Problems arise when a node
receives an event from a set origin that its data provider for that set
has not yet processed. For example:

    1 -> 2    3

Node 1 is the origin, 2 is a subscriber, and 3 is a new node that is not
subscribed yet. Node 3 has paths to 1 and 2, so naturally it would listen
on each of them for their events. If we now subscribe 3 as a cascaded node
with 2 as its data provider, the ENABLE_SUBSCRIPTION event that will follow
from node 1, and on which node 3 will start copy_set, must be received by 3
from 2. That is the only way to be sure that 2 actually has the data itself
at the moment 3 starts to copy. Otherwise 2 could still be busy with its
own copy_set, meaning that not only is the data in the tables missing, the
tables themselves aren't in sl_table yet either.

And to spice this up a little more, reading the events is done
asynchronously in the remote_listen thread. They are queued, and the
remote_worker thread processes them from the queue. By the time node 3 gets
the SUBSCRIBE_SET event, it will already have a lot of events queued, so it
had better restart ASAP to throw those away and listen again, this time for
all of node 1's events on node 2.

Jan

>
> With "sensible" I mean: telling node X via sl_listen to ask neighbour nodes
> (those for which an sl_path entry exists) for events from all other nodes,
> apart from those for which the events must have travelled via node X to
> reach the neighbour of X in question.
>
> I tried writing an algorithm to do that, but it turned out that it isn't
> quite as easy as I initially believed, because all "iterative" algorithms
> I could think of (which were all based on the idea that if X receives
> events from Y, and Y from Z, then X can receive events from Z via Y)
> failed because there is not enough information in sl_listen to figure
> out whether Y already needs X to receive events from Z.
>
> greetings, Florian Pflug

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck at Yahoo.com #
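
To make the rule in Jan's reply concrete, here is a minimal Python sketch. It is not Slony-I's RebuildListenEntries code. The function name `listen_entries` and the data shapes (paths as `(client, server)` pairs, subscriptions as a dict keyed by `(set_id, receiver)`) are assumptions made up for this illustration. It only covers the two cases from the example: a node listens directly on every node it has a path to, except that events from the origin of a set it subscribes to are taken only from its data provider for that set.

```python
def listen_entries(receiver, nodes, paths, set_origin, subscriptions):
    """Return hypothetical (origin, provider, receiver) listen triples.

    paths:         set of (client, server) pairs; client can connect to server
    set_origin:    dict mapping set_id -> origin node of that set
    subscriptions: dict mapping (set_id, receiver) -> data provider node
    """
    # Origins whose events this receiver must take from a data provider.
    forced = {}
    for (set_id, subscriber), provider in subscriptions.items():
        if subscriber == receiver:
            forced.setdefault(set_origin[set_id], set()).add(provider)

    entries = set()
    for origin in nodes:
        if origin == receiver:
            continue
        if origin in forced:
            # Subscribed: origin events may only arrive via the data provider,
            # so the provider is guaranteed to have processed them first.
            for provider in forced[origin]:
                entries.add((origin, provider, receiver))
        elif (receiver, origin) in paths:
            # Not subscribed: simply listen on the origin itself.
            # (Indirect routing is deliberately left out of this sketch.)
            entries.add((origin, origin, receiver))
    return entries


# The 1 -> 2, 3 example from the message above.
nodes = [1, 2, 3]
paths = {(3, 1), (3, 2), (2, 1), (1, 2)}
set_origin = {1: 1}  # set 1 originates on node 1

before = listen_entries(3, nodes, paths, set_origin, {(1, 2): 1})
after = listen_entries(3, nodes, paths, set_origin, {(1, 2): 1, (1, 3): 2})

print(sorted(before))  # [(1, 1, 3), (2, 2, 3)]
print(sorted(after))   # [(1, 2, 3), (2, 2, 3)]
```

Before the subscription, node 3 listens for node 1's events on node 1. Afterwards it listens for them on node 2 only, which matches "listen again, this time for all 1-events on 2". The sketch does not model the restart that discards events already queued by remote_listen.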