Matthew Horoschun matthew
Wed Oct 26 07:22:55 PDT 2005
Hi,

We've been testing Slony-I for a while now, and after a period of 
successful replication we start seeing the following errors:

Oct 25 17:11:55 radius2 slon_radius2[33490]: [28-1] 2005-10-25 17:11:55 EST [33490] ERROR  remoteListenThread_2: timeout for event selection
Oct 25 17:17:49 radius2 slon_radius2[33490]: [29-1] 2005-10-25 17:17:49 EST [33490] ERROR  remoteListenThread_2: timeout for event selection
Oct 25 17:23:34 radius2 slon_radius2[33490]: [30-1] 2005-10-25 17:23:34 EST [33490] ERROR  remoteListenThread_2: timeout for event selection

Once these errors appear, replication stops.

So far, our investigation has found that:

* It appears to fail only on clusters with more than one slave.
* It isn't periodic as far as we can tell (it can take a week or two to 
fail).
* We haven't had it fail under load (it's currently failing when 
completely idle -- no changes being submitted to the master).
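
One way to look at the timing more closely is to compute the gaps between 
consecutive timeout errors in the slon log. The following is only a rough 
sketch in Python, assuming the log entries look like the excerpt above; the 
function name and the regular expression are illustrative and not part of 
Slony-I.

```python
import re
from datetime import datetime

# Assumption: every error carries an embedded "YYYY-MM-DD HH:MM:SS" timestamp
# followed by a timezone, a bracketed PID and the ERROR text, as in the
# excerpt above. \s+ also matches a line break, so entries that were wrapped
# after the timestamp are still recognised.
TIMEOUT_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+\w+\s+\[\d+\]\s+ERROR\s+"
    r"remoteListenThread_\d+: timeout for event selection"
)

def timeout_gaps(log_text):
    """Return the seconds elapsed between consecutive timeout errors."""
    stamps = [
        datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        for m in TIMEOUT_RE.finditer(log_text)
    ]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]
```

For the three entries quoted above this gives gaps of 354 and 345 seconds. 
It only describes the spacing of the errors once they have started, not how 
long the cluster ran before the first one.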

All nodes are running slony1-1.1.0 on FreeBSD 4.11.

Any suggestions on what might be causing this, or where I should look 
for more useful debugging information?

Matthew.






