hannu at skype.net hannu
Tue Oct 19 23:12:59 PDT 2004
>> When shutting the database engines down, the associated "slon"
>> processes tend to exit too.
>>
>> My preference is that when the database is stopped/started (or
>> during a network outage) that the daemon remains running, and
>> attempts periodic reconnects.
>>
>> What is the suggested approach for managing the daemon processes
>> so that replication starts up again when the database becomes
>> available?
>
> There's a watchdog process written in Perl in the "altperl" directory
> (recent CVS, whether for STABLE or for HEAD) that has some capability to
> do this.

I have a case with 3 computers:

db1 runs the postgres slave
db2 runs the postgres master and the master's slon
db3 runs the slave's slon

After a network outage of a few minutes, which disconnects db1 (slave pg)
from the rest, the following happens:

slon on db3 dies (with a Timeout message in the log)
slon on db2 keeps running, but one of its threads is in R state
        and keeps eating all the CPU it can.

I have to manually start slon on db3, and kill the main slon on db2 and
restart it, to get back to normal ops.

Unfortunately these are production machines, so I can't attach gdb to the
CPU-eating slon to see what's up.

Can your Perl script handle this situation too?
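
To cover both failures described here, a watchdog would need two checks: is
the slon process still there, and is it burning CPU without pause. What
follows is only a rough sketch of that idea in Python, not the altperl
script. The function names, the 10-second sampling interval and the 90%
threshold are all assumptions made for illustration, and actually starting
or restarting slon is left to the caller.

```python
import subprocess
import time


def parse_cputime(text):
    """Convert a ps TIME value ([[dd-]hh:]mm:ss) to seconds."""
    text = text.strip()
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [float(p) for p in text.split(":")]
    while len(parts) < 3:
        parts.insert(0, 0.0)  # pad missing hours (and minutes) with zero
    hours, minutes, seconds = parts
    return days * 86400 + hours * 3600 + minutes * 60 + seconds


def cpu_seconds(pid):
    """Accumulated CPU seconds of pid, or None if no such process exists."""
    result = subprocess.run(
        ["ps", "-o", "time=", "-p", str(pid)],
        capture_output=True,
        text=True,
    )
    out = result.stdout.strip()
    if result.returncode != 0 or not out:
        return None
    return parse_cputime(out)


def decide(before, after, interval, busy_fraction=0.9):
    """Map two CPU-time samples taken `interval` seconds apart to an action.

    busy_fraction is a hypothetical threshold: the share of one CPU that
    counts as "spinning".
    """
    if before is None or after is None:
        return "start"    # process is gone (the db3 case)
    if (after - before) / interval >= busy_fraction:
        return "restart"  # process is pegging a CPU (the db2 case)
    return "ok"


def check(pid, interval=10.0):
    """Sample pid twice and say what a watchdog should do with it."""
    before = cpu_seconds(pid)
    if before is None:
        return "start"
    time.sleep(interval)
    return decide(before, cpu_seconds(pid), interval)
```

Sampling the accumulated CPU time twice is deliberate: the %CPU column that
ps prints is, on Linux at least, an average over the process lifetime, so a
slon that has run quietly for days and only then starts spinning would
barely register there.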

----------------
Hannu






