Jan Wieck JanWieck
Fri Mar 11 16:03:26 PST 2005
On 3/10/2005 11:41 PM, cbbrowne at ca.afilias.info wrote:

>> I have checked in a fair "boatload" of patches supporting this and that;
>> from my perspective, I think it starts making sense to look towards a
>> 1.1 release in the next couple of weeks.

You certainly want to add the "ACCEPT_SET" feature to the open items for 
1.1.

We have recently discovered that MOVE_SET has a race condition. Take, for 
example, nodes 1, 2 and 3, with 1 being the origin and 2 and 3 being 
subscribers. If one now does a MOVE_SET to transfer the origin to node 2, 
there is a possibility that node 2 processes that MOVE_SET, opens up for 
business and generates SYNCs (so far this is what we want).

If node 3 is now behind in replicating from 1, but keeps well up with 
events from node 2, it will confirm SYNC events coming from node 2 
(assuming "I am not subscribed to anything from there, so nothing to 
do") until it actually has caught up with 1 up to the MOVE_SET event. 
The changes covered by those confirmed SYNCs are never applied on node 3.
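
To make the race concrete, here is a rough Python sketch of it. This is a toy model only, not Slony-I code: the Subscriber class, its fields and the event numbers are all made up for illustration, and it assumes a subscriber simply confirms any SYNC from a node it does not regard as the origin.

```python
from dataclasses import dataclass, field


@dataclass
class Subscriber:
    """Hypothetical toy model of node 3; not the real Slony-I data structures."""
    set_origin: int = 1                             # node currently believed to be origin
    applied: list = field(default_factory=list)     # SYNCs whose data was replicated
    confirmed: list = field(default_factory=list)   # SYNCs acknowledged to the sender

    def on_move_set(self, new_origin):
        # MOVE_SET (arriving via the old origin) switches the origin.
        self.set_origin = new_origin

    def on_sync(self, from_node, seqno):
        # Data is only replicated if the sender is known to be the origin.
        if from_node == self.set_origin:
            self.applied.append((from_node, seqno))
        # The SYNC is confirmed either way ("nothing to do").
        self.confirmed.append((from_node, seqno))


node3 = Subscriber()
node3.on_sync(2, 101)   # node 2 is already origin, but node 3 doesn't know yet
node3.on_sync(2, 102)
node3.on_move_set(2)    # node 3 finally catches up with node 1's MOVE_SET
node3.on_sync(2, 103)

lost = [e for e in node3.confirmed if e not in node3.applied]
print(lost)   # [(2, 101), (2, 102)]
```

In this model SYNCs 101 and 102 are confirmed but never applied, which is the lost-data window described above.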

The cure for this is a new event type ACCEPT_SET that is generated by 
the new origin when it processes the MOVE_SET event. The payload 
information of ACCEPT_SET is the node id and the event id of the 
MOVE_SET event. When processing an ACCEPT_SET event, the worker thread 
will check if the local node has processed that MOVE_SET from the other 
node. If not, the worker thread will error out and retry in 10 seconds.

In the example above, node 3's worker thread for node 2 will receive the 
ACCEPT_SET, notice that the worker thread for node 1 hasn't processed the 
MOVE_SET yet, suspend event processing for 10 seconds and retry. Since all 
relevant SYNC events on node 2 come after the ACCEPT_SET event, nothing 
can get lost any more.
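
The ACCEPT_SET check can be sketched roughly as follows. Again this is only a Python toy model under stated assumptions, not the actual Slony-I implementation: Subscriber, RetryLater, drain() and the event tuples are hypothetical names, and the 10 second sleep is only indicated by a constant rather than performed.

```python
from collections import deque

RETRY_SECONDS = 10  # retry interval mentioned above; not actually slept on here


class RetryLater(Exception):
    """Hypothetical signal: the event cannot be processed yet."""


class Subscriber:
    """Hypothetical toy model of node 3 with one event queue per remote node."""

    def __init__(self, origin):
        self.set_origin = origin
        self.processed = set()   # (origin node id, event id) of handled events
        self.applied = []        # SYNCs whose data was replicated

    def handle(self, from_node, event):
        kind, event_id, payload = event
        if kind == "MOVE_SET":
            self.set_origin = payload              # payload: new origin node id
        elif kind == "ACCEPT_SET":
            # payload: (node id, event id) of the MOVE_SET that must come first
            if payload not in self.processed:
                raise RetryLater(payload)
        elif kind == "SYNC":
            if from_node == self.set_origin:
                self.applied.append((from_node, event_id))
        self.processed.add((from_node, event_id))


def drain(node, from_node, queue):
    """Process events in order; stop at the first one that must be retried."""
    while queue:
        try:
            node.handle(from_node, queue[0])
        except RetryLater:
            return False   # a real worker would wait RETRY_SECONDS and retry
        queue.popleft()
    return True


node3 = Subscriber(origin=1)
from_node1 = deque([("MOVE_SET", 50, 2)])
from_node2 = deque([("ACCEPT_SET", 100, (1, 50)),
                    ("SYNC", 101, None),
                    ("SYNC", 102, None)])

first_try = drain(node3, 2, from_node2)    # blocked on ACCEPT_SET
drain(node3, 1, from_node1)                # MOVE_SET from node 1 gets processed
second_try = drain(node3, 2, from_node2)   # now ACCEPT_SET and the SYNCs go through
print(first_try, second_try, node3.applied)
# False True [(2, 101), (2, 102)]
```

Because the ACCEPT_SET sits in front of the SYNCs in node 2's event stream, blocking on it is enough to hold back every SYNC that depends on the MOVE_SET.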


Jan

> 
> I just took a peek on the web site for bugs/feature requests, and realize
> that I have been remiss in not reviewing this in a while.
> 
> Darcy, I have bounced one item in your direction vis-a-vis ./configure
> support (someone had some Digital/OSF changes to request); hope that's
> suitable.
> 
> There are a number of bugs that do need to be peeked at further; some that
> have already been addressed by documentation; a few that I see I have
> implemented and never responded to on the bug/feature list :-).
> 
> I'll see what further I can address.
> 
> Several problems on the bug list have fallen out of problems with the
> "slony_setup.pl" script which appears to have suffered from some bit rot. 
> I'm not sure what to pursue on that; David Fetter did some support of it,
> but it's not clear anyone is still interested.  If it isn't
> supported/supportable, we should probably deprecate/remove it...
> 
> _______________________________________________
> Slony1-general mailing list
> Slony1-general at gborg.postgresql.org
> http://gborg.postgresql.org/mailman/listinfo/slony1-general


-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck at Yahoo.com #

