[Slony1-general] Replication interruption
Jan Wieck
jan at wi3ck.info
Fri Oct 22 21:28:44 UTC 2021
On 10/22/21 3:33 PM, Sung Hsin Lei via Slony1-general wrote:
> This question may have been asked already. I have noticed that if the
> initial data transfer takes 10 days to complete and is interrupted
> after 9 days, it starts over from day 1. In other words, I need 10 days
> of uninterrupted connection between the master and slave machines for
> replication to be established properly. Is this the case or did I mess
> up somewhere?
This is correct. One "set" (of tables) must be copied over in a single
transaction so that the subscriber starts from a consistent snapshot
before replication begins catching up. That catching up is what other
systems often call "CDC" or "Change Data Capture". Without the initial
copy being done in one transaction, the system has no way of knowing
which incremental changes have to be replicated.
You may be able to break up the whole process into multiple steps by
creating multiple sets. Put all the small tables (usually the ones
referenced by foreign keys) first in one set. Then create a set for one
or a few tables at a time and replicate it, after which you would do a
MERGE SET. Rinse, repeat.
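A rough sketch of one such round in slonik, assuming set 1 already
exists on origin node 1 and is fully subscribed by node 2. The cluster
name, conninfo strings, set and table IDs, and the table name are all
made up for illustration; adjust them to your cluster.

```
# Hypothetical cluster name and connection strings.
cluster name = mycluster;
node 1 admin conninfo = 'dbname=appdb host=master.example user=slony';
node 2 admin conninfo = 'dbname=appdb host=slave.example user=slony';

# Temporary set holding just one big table (IDs are placeholders).
create set (id = 2, origin = 1, comment = 'temp set for one big table');
set add table (set id = 2, origin = 1, id = 100,
    fully qualified name = 'public.big_table',
    comment = 'one large table');

# Copy only this table. An interruption here restarts the copy of
# set 2, not of everything already replicated in set 1.
subscribe set (id = 2, provider = 1, receiver = 2, forward = no);
wait for event (origin = 1, confirmed = 2, wait on = 1);

# Once the subscription of set 2 has completed, fold it into set 1.
merge set (id = 1, add id = 2, origin = 1);
```

Both sets need the same origin and the same subscribers for the MERGE
SET to be accepted, so make sure the copy of the temporary set has
finished on every subscriber before merging, then repeat with the next
table or small group of tables.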
>
> Another question. On a new replication (empty tables on the slave), I
> see the initial transfer stuck on TRUNCATE for a long time for big
> tables. Sometimes, it truncates the same table several times. I
> remember getting an explanation for why it is so. Is there a way to
> make the initial transfer faster (avoiding the truncate)?
A TRUNCATE taking a long time on an empty table sounds wrong. Are you
sure this table is really empty? Or do you only assume it is logically
empty because a previous COPY failed, while the system is actually busy
making sure that none of its foreign key references are violated by the
TRUNCATE operation?
--
Jan Wieck