Steve Singer ssinger at ca.afilias.info
Mon Dec 7 18:48:05 PST 2015
On 12/07/2015 09:25 PM, Josh Berkus wrote:
> On 12/07/2015 11:32 AM, Josh Berkus wrote:
>> On 12/07/2015 10:56 AM, Josh Berkus wrote:
>>> So, the prepare clone method above worked perfectly twice.  But then we
>>> tried to bring up a new node as a prepared clone from node 11 and things
>>> went to hell.
>>
>> One thing I just realized was different between the first two,
>> successful, runs and the failed runs:  the first two times, we didn't
>> have pg_hba.conf configured, so when we brought up slony on the new node
>> it couldn't connect until we fixed that.
>>
>> So I'm wondering if there's a timing issue here somewhere.
>
> So, this problem was less interesting than I thought.  As it turns out,
> the sysadmin was handling "make sure slony doesn't start on the server"
> by letting it autostart, then shutting it down.  In the couple minutes
> it was running, though, it did enough to prevent finish clone from working.
>


I wonder if there is more going on here


In remoteWorker_event

We have

	if (node->last_event >= ev_seqno)
	{
		rtcfg_unlock();
		slon_log(SLON_DEBUG2,
				 "remoteWorker_event: event %d," INT64_FORMAT
				 " ignored - duplicate\n",
				 ev_origin, ev_seqno);
		return;
	}

	/*
	 * We lock the worker threads message queue before bumping the nodes last
	 * known event sequence to avoid that another listener queues a later
	 * message before we can insert this one.
	 */
	pthread_mutex_lock(&(node->message_lock));
	node->last_event = ev_seqno;
	rtcfg_unlock();


It seems strange to me that we obtain the mutex lock only after 
checking node->last_event.  Does the rtcfg_lock prevent the race 
condition, making the message_lock redundant here?  If not, do we need 
to obtain node->message_lock before we do the comparison?
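
For illustration, here is roughly what I mean, written against the 
names in the snippet above (just a sketch, not a tested patch):

	/*
	 * Sketch: take the message_lock before the last_event comparison so
	 * that the duplicate check and the later queue insert are serialized
	 * against the other listener threads.  Assumes the rtcfg lock is
	 * already held at this point, as in the current code.
	 */
	pthread_mutex_lock(&(node->message_lock));
	if (node->last_event >= ev_seqno)
	{
		pthread_mutex_unlock(&(node->message_lock));
		rtcfg_unlock();
		slon_log(SLON_DEBUG2,
				 "remoteWorker_event: event %d," INT64_FORMAT
				 " ignored - duplicate\n",
				 ev_origin, ev_seqno);
		return;
	}
	node->last_event = ev_seqno;
	rtcfg_unlock();
	/* ... build the event message, append it to the queue, then ... */
	pthread_mutex_unlock(&(node->message_lock));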


The CLONE_NODE handler in remote_worker sets last_event by calling 
rtcfg_getNodeLastEvent, which obtains the rtcfg_lock but not the 
message lock.

The clone node handler in remote_worker seems to do this:
1. call rtcfg_storeNode() (which obtains then releases the config lock)
2. call cloneNodePrepare_int()
3. query the last event id
4. call rtcfg_getNodeLastEvent(), which re-obtains then releases the 
config lock

I wonder if, sometime after step 1 but before step 4, a remote listener 
queries events from the new node and adds them to the queue because 
last_event hasn't yet been set.
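
Roughly the interleaving I have in mind (the thread labels are just for 
illustration):

	CLONE_NODE handler                    remote listener for new node
	------------------                    -----------------------------
	step 1: rtcfg_storeNode()
	        node visible, last_event
	        not yet set
	                                      sees the new node, fetches its
	                                      events, remoteWorker_event():
	                                      last_event check passes, the
	                                      message is queued
	step 4: rtcfg_getNodeLastEvent()
	        sets last_event, but the
	        early message is already
	        in the worker's queue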

Maybe cloneNodePrepare needs to obtain the message queue lock at step 1 
and hold it until step 4, and then remoteWorker_event needs to obtain 
that lock a bit earlier.
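
Very roughly (the argument lists are placeholders, not the real 
remote_worker.c signatures):

	pthread_mutex_lock(&(node->message_lock));   /* take the queue lock at step 1 */
	rtcfg_storeNode(...);                        /* step 1: node becomes visible */
	cloneNodePrepare_int(...);                   /* step 2 */
	/* step 3: query the last event id for the new node */
	rtcfg_getNodeLastEvent(...);                 /* step 4: last_event now set */
	pthread_mutex_unlock(&(node->message_lock)); /* listeners may queue events again */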




