Jan Wieck JanWieck at Yahoo.com
Thu Jun 14 19:00:37 PDT 2007
On 6/14/2007 11:03 AM, Jan Wieck wrote:
> On 6/13/2007 7:55 PM, Steven Singer wrote:
>> On Wed, 13 Jun 2007, Christopher Browne wrote:
>> 
>> 
>> I've downloaded the tarball and compiled against an 8.3 snapshot on Linux, and 
>> I'm still getting periodic failures of the testddl unit test.
>> 
>> Usually it's a <stdin>:9: timeout exceeded while waiting for event confirmation
> 
> Not sure where this one is coming from. There seems to be only one case 
> of using WAIT FOR EVENT in that test and that does not have a timeout (I 
> must obviously be missing something).

Yep, I was missing something. WAIT FOR EVENT has a default timeout of 
600 seconds. That of course means that if node 3 never starts to 
subscribe because it received the ENABLE_SUBSCRIPTION from node 2 
instead of node 1 (which depends on a race condition), then this timeout 
is exactly what is going to happen.

I have fixed the bug in the copy_set() logic but still need to fix the 
rebuildListenEntries() madness.


Jan

> 
>> 
>> I keep seeing messages along these lines in slon.2.log:
>> 2007-06-13 19:25:20 EDT WARN   remoteWorkerThread_1: copy set: data provider 
>> 1 only on sync -1 - sleep 5 seconds
>> 
>> Is this the problem Jan is working on or is it something else?
>> Has anyone else tried to run the testddl unit test a handful of times with 
>> better luck?
> 
> I wasn't up to now ;-)
> 
> The problem here seems to be that sl_listen is entirely messed up, 
> which as a side effect triggers what appears to be a bug.
> 
> Why rebuildListenEntries() generates 10 entries for a 3 node setup 
> certainly needs some investigation. The real underlying problem is that 
> slon, at copy_set() time, looks at where it got the event from, and if 
> that node is not the one providing the data, it checks whether it has 
> seen a confirmation of the ENABLE_SUBSCRIPTION from the data provider. 
> The thing here is that the data provider is actually the event origin, 
> and nodes don't confirm their own events. So the check needs to exclude 
> the case where the event origin is the data provider.
> 
> 
> Jan
> 
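
The check described in the quoted message can be sketched roughly as
follows. This is only an illustration of the decision, not the actual
slon source: the function name copy_must_wait(), its parameters, and the
node numbers in the examples are all assumptions made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not the real slon code: decide whether copy_set()
 * should hold off until a confirmation of ENABLE_SUBSCRIPTION from the
 * data provider has been seen.
 */
bool
copy_must_wait(int event_from, int event_origin, int data_provider,
               bool provider_confirmed)
{
    /* Event arrived from the data provider: no confirmation check needed. */
    if (event_from == data_provider)
        return false;

    /*
     * The provider originated the event. Nodes do not confirm their own
     * events, so a confirmation from it would never show up.
     */
    if (event_origin == data_provider)
        return false;

    /* Otherwise wait until the provider's confirmation has been seen. */
    return !provider_confirmed;
}

/* Walks through the cases; node numbers are illustrative only. */
void
copy_check_examples(void)
{
    /* Node 3 gets the event (origin 1) via node 2; provider is node 1. */
    assert(!copy_must_wait(2, 1, 1, false));
    /* Event comes straight from the provider. */
    assert(!copy_must_wait(1, 1, 1, false));
    /* Provider 2 is neither sender nor origin and has not confirmed. */
    assert(copy_must_wait(3, 1, 2, false));
    /* Same situation, but the confirmation has arrived. */
    assert(!copy_must_wait(3, 1, 2, true));
}
```

The second test in copy_must_wait() is the exclusion the message asks
for. Without it, the first example (event relayed by node 2, provider and
origin both node 1) would wait for a confirmation that never arrives.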


-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck at Yahoo.com #

