[Slony1-general] slony not replicating after re-initializing the slave cluster

Thu Sep 16 16:12:47 PDT 2010

On Thu, 16 Sep 2010, Brian Fehrle wrote:

What are your slon processes logging/printing? Are the remoteWorker
threads actually processing events, or is the remoteListener the only thread
logging? Are the events in sl_event being marked as confirmed in
sl_confirm on the slave? Is that data making it back to sl_confirm on
the master? (I suspect not; otherwise sl_log_1 wouldn't keep growing.)
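
One rough way to look at the confirmation state is a small shell wrapper
around psql, sketched here. The "_slony" schema name is taken from the
queries quoted below; the function names and connection strings are
placeholders, and the column names (con_origin, con_received, con_seqno,
con_timestamp, ev_origin, ev_seqno) are the usual ones but should be checked
against your Slony version.

```shell
#!/bin/sh
# Sketch only: prints the newest confirmation per (origin, receiver) pair
# and the newest event per origin, so the two nodes can be compared.
# Assumed, not from the thread: function names and conninfo strings.

# Emit the diagnostic SQL for a given cluster schema (e.g. _slony).
confirm_sql() {
    schema=$1
    cat <<EOF
SELECT con_origin, con_received,
       max(con_seqno)     AS last_confirmed,
       max(con_timestamp) AS confirmed_at
  FROM ${schema}.sl_confirm
 GROUP BY con_origin, con_received
 ORDER BY con_origin, con_received;

SELECT ev_origin, max(ev_seqno) AS last_event
  FROM ${schema}.sl_event
 GROUP BY ev_origin
 ORDER BY ev_origin;
EOF
}

# Run the diagnostic SQL against one node.
# $1 = libpq connection string, $2 = cluster schema (defaults to _slony).
check_node() {
    confirm_sql "${2:-_slony}" | psql -X -d "$1"
}
```

Run it once per node, for example check_node "host=master dbname=mydb" and
check_node "host=slave dbname=mydb" (hypothetical connection strings). If
last_confirmed for the master's events, as seen on the master, stays behind
the master's own last_event and never advances, the confirmations are not
making it back.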

Also remember to run REPAIR CONFIG 
(http://www.slony.info/documentation/stmtrepairconfig.html) after restoring 
from a pg_dump.
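
The reason this matters: a restore gives every table a new OID, while
sl_table still records the old ones. A sketch of how to check for that and
how the slonik script might look follows. The cluster name "slony" (guessed
from the "_slony" schema), node 1 as master, node 2 as slave, the set id and
the connection strings are all assumptions; adjust them to your setup and
check the REPAIR CONFIG page above for which node to name.

```shell
#!/bin/sh
# Sketch only. Assumed, not from the thread: node 1 = master,
# node 2 = slave, function names, and all arguments passed in.

# SQL listing replicated tables whose recorded OID (sl_table.tab_reloid)
# no longer matches the table's actual OID in pg_class.
stale_oid_sql() {
    schema=$1
    cat <<EOF
SELECT t.tab_id, t.tab_nspname, t.tab_relname,
       t.tab_reloid AS recorded_oid, c.oid AS actual_oid
  FROM ${schema}.sl_table t
  JOIN pg_catalog.pg_namespace n ON n.nspname = t.tab_nspname
  JOIN pg_catalog.pg_class c
    ON c.relnamespace = n.oid AND c.relname = t.tab_relname
 WHERE t.tab_reloid <> c.oid;
EOF
}

# slonik script that runs REPAIR CONFIG for one set on the slave only.
# $1 = cluster name, $2 = master conninfo, $3 = slave conninfo, $4 = set id
repair_config_script() {
    cat <<EOF
cluster name = $1;
node 1 admin conninfo = '$2';
node 2 admin conninfo = '$3';
repair config (set id = $4, event node = 1, execute only on = 2);
EOF
}
```

For example, stale_oid_sql _slony | psql -X -d "host=slave dbname=mydb"
should return no rows once the mapping is correct, and
repair_config_script slony "host=master dbname=mydb" "host=slave dbname=mydb" 1 | slonik
would submit the repair (again, hypothetical connection strings).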

> Hi all,
>    Due to realizing that our 1 master -> 1 slave slony cluster had
> different encodings on each box, we attempted to fix that. Our master
> had encoding of LATIN1 and our slave had the encoding of SQL_ASCII (they
> were initialized so long ago, we don't know who did it or why it was
> done that way).
>    Slony worked with this setup, but we wanted to fix it, due to some
> other problems, by moving the slave from SQL_ASCII to LATIN1.
>
>    So we brought down the slon daemons, brought down the slave database
> and rebooted the physical machine the slave is on (we had commented out
> dozens of cron jobs and wanted to verify they were all dead).
>
>    When we rebooted the machine, we brought the slave postgres cluster
> online and performed a pg_dump on the entire database (including the
> _slony schema). Then we brought down the postgres cluster, ran initdb to
> create a new one with LATIN1 encoding, brought the new cluster online
> and ran a pg_restore on it with the dump file we created before.
>
>    After that we restarted our cron jobs, which also started up the two
> slon daemons. We started monitoring the slave and noticed that no
> updates are being applied. We're running the slon daemons with -s 60000
> (force a sync every 60 seconds) and a -x flag to get some slony logs for
> log shipping. These slony logs that are generated with -x are empty
> (they have the slony header and footer, but no insert data).
>
>    On the master, if I do a # select * from _slony.sl_status; I get
> back that there are anywhere between 0 - 2 events, and a lag time no
> greater than 3 minutes. Monitoring the slave slony log output also
> verifies that events are being received and processed without error every
> minute.
>
>    Again, on the master, # select count(*) from _slony.sl_log_1;
> returns with 12,000+ rows, and it continually grows. So from what I can
> tell, the master is getting changes queued up but not pushing them to
> the slave in the events; each event is completely void of data, and it
> looks like sl_log_1 just keeps building up.
>
>    One theory is that even though we have an exact data dump of the old
> slave cluster restored to the new slave cluster, since the encoding
> has changed perhaps the master doesn't recognize the slave as the same
> slave it had before. If that's the case, is there any way we can get it
> to recognize it without having to rebuild the slony cluster? (Rebuilding
> the cluster would mean a few days of work, if not a week or more.)
>
>    Other than that, I'm unsure what to make of this. I've restarted the
> daemons, and neither the master nor the slave daemon report any errors
> in the logs. I verified that the triggers exist on the master as they
> should (we never touched the master anyways, but still checking
> everything), the path to the slave remained the same as the previous
> slave (same dbname, host, port, user).
>
>    Any thoughts or things I can check would be appreciated. Or if my
> theory about the master not recognizing the new slave cluster as the old
> one is correct, then if we can fix that, that would be great.
>
> thanks in advance,
>                      Brian F