111 – node -1 not found in runtime configuration

Status:	NEW

Alias:	None

Product:	Slony-I
Classification:	Unclassified
Component:	slon
Version:	devel
Hardware:	PC Linux

Importance:	medium normal
Deadline:	2010-02-04
Assignee:	Slony Bugs List

URL:

Depends on:
Blocks:	10

Reported:	2010-02-01 00:14 UTC by rezuser
Modified:	2010-08-16 07:51 UTC
CC List:	1 user

See Also:

Description rezuser 2010-02-01 00:14:07 UTC

We get this error very often. One common scenario is as follows.

We have configured master and slave database clusters on two separate servers. In the master database we created two replication sets, one for tables and another for sequences, then merged the sequence set into the table set. After that we started the slon daemon and got this error. There is no way of recovering short of restoring the whole cluster. Although we subscribe, no replication occurs.

remoteWorkerThread_1: node -1 not found in runtime configuration.
2010-01-25 15:45:18 IST WARN   remoteWorkerThread_1: data copy for set 1 failed - sleep 60 seconds

Please explain the cause of this problem, or at least how to overcome it without restoring the whole cluster.

Comment 1 Steve Singer 2010-04-23 10:02:28 UTC

See:
http://lists.slony.info/pipermail/slony1-general/2010-April/010574.html

What we think is happening is that the subscription information for the set on the subscriber is being deleted (i.e. by an unsubscribe set, but a merge set might be similar?) before the ENABLE SUBSCRIPTION is processed by the slon. By the time the event is finally processed, the row in sl_subscription has already been deleted.
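
The suspected race can be illustrated with a toy model. This is not Slony-I source code: the dictionary standing in for sl_subscription, the set of known nodes, and both function names are hypothetical, and the use of -1 as the "no provider found" value is an assumption inferred from the error message.

```python
# Toy model of the suspected race; not Slony-I source code.
NO_PROVIDER = -1  # assumed sentinel, inferred from the "node -1" message


def lookup_provider(sl_subscription, set_id, receiver):
    """Return the provider node for (set_id, receiver), or NO_PROVIDER."""
    return sl_subscription.get((set_id, receiver), NO_PROVIDER)


def process_enable_subscription(sl_subscription, known_nodes, set_id, receiver):
    """Mimic the worker looking up its provider before copying the set."""
    provider = lookup_provider(sl_subscription, set_id, receiver)
    if provider not in known_nodes:
        return f"node {provider} not found in runtime configuration"
    return f"copying set {set_id} from node {provider}"


known_nodes = {1, 2}
sl_subscription = {(1, 2): 1}  # set 1, receiver node 2, provider node 1

# Normal order: the row still exists when the event is processed.
ok = process_enable_subscription(sl_subscription, known_nodes, 1, 2)

# Race: the row is deleted before the queued event is processed.
del sl_subscription[(1, 2)]
failed = process_enable_subscription(sl_subscription, known_nodes, 1, 2)
```

In this model the second call reports the same message as the log excerpt in the description, because the lookup falls back to the sentinel once the row is gone.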

Comment 2 Jan Wieck 2010-06-09 13:39:39 UTC

Changed version to devel because actually fixing this requires new features.

UNSUBSCRIBE SET should continue to be issued against the subscriber. If the event originated from the set origin, the subscriber would have to crawl through the entire backlog before finally unsubscribing. That is a waste.

The processing of ENABLE_EVENT should, on a node -1 error, simply confirm the event, assuming that the subscription was canceled via UNSUBSCRIBE.
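
A minimal sketch of that proposed behaviour, under stated assumptions: the `Action` enum, the `handle_enable` function, and the dictionary used in place of sl_subscription are all hypothetical names for illustration, not part of slon.

```python
# Hypothetical sketch of the proposed handling; names are illustrative only.
from enum import Enum


class Action(Enum):
    COPY_SET = "copy_set"          # subscription row exists: do the data copy
    CONFIRM_ONLY = "confirm_only"  # row missing: just confirm the event


def handle_enable(sl_subscription, set_id, receiver):
    """Decide what to do with an enable event for (set_id, receiver)."""
    provider = sl_subscription.get((set_id, receiver), -1)
    if provider == -1:
        # Assume the subscription was canceled via UNSUBSCRIBE; confirm the
        # event instead of failing and retrying every 60 seconds.
        return Action.CONFIRM_ONLY
    return Action.COPY_SET
```

The point of the design is that a missing row is treated as a normal outcome rather than an error, so the worker does not loop on a subscription that no longer exists.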

Upon receiving an UNSUBSCRIBE_SET event, the origin of that set will issue yet another UNSUBSCRIBE_SET in order to guard against a possible race condition where a third node, that is a forwarder for the set, receives the initial UNSUBSCRIBE_SET before processing the initial SUBSCRIBE_SET. This would cause it to wrongfully think that the node is actually subscribed to the set.


Jan
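
The guard against the forwarder race described in comment 2 can be sketched with a small event-replay model. Everything here is an assumption for illustration: the event tuples, the `apply_events` function, and in particular the premise that the origin's second UNSUBSCRIBE_SET reaches the forwarder after the SUBSCRIBE_SET.

```python
# Toy replay of events as seen by a forwarder node; not Slony-I source code.
def apply_events(events):
    """Return the set of (set_id, node) pairs the forwarder believes subscribed."""
    subscribed = set()
    for kind, set_id, node in events:
        if kind == "SUBSCRIBE_SET":
            subscribed.add((set_id, node))
        elif kind == "UNSUBSCRIBE_SET":
            subscribed.discard((set_id, node))
    return subscribed


# The forwarder sees the unsubscribe before the subscribe it cancels.
out_of_order = [("UNSUBSCRIBE_SET", 1, 2), ("SUBSCRIBE_SET", 1, 2)]
stale = apply_events(out_of_order)

# The origin re-issues UNSUBSCRIBE_SET; assumed to arrive after the subscribe.
with_guard = out_of_order + [("UNSUBSCRIBE_SET", 1, 2)]
clean = apply_events(with_guard)
```

Without the repeated event the forwarder is left believing node 2 is subscribed to set 1; with it, the stale entry is removed.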