Raghav ragavendra.dba at gmail.com
Mon Apr 28 02:19:49 PDT 2014
Thanks Jan for your reply.


> Let me make sure that I understand the issue correctly. You say that your
> master database has a corrupted index, which allowed duplicate keys to be
> inserted


Correct


> . This means that without removing those duplicate rows, you could not
> REINDEX the table on the master either. Replication fails because you do
> not have the same index corruption on the replica(s). Correct me if I
> misunderstood.
>
>
Not exactly. Sorry, I didn't described it well. My point is not about
fixing the corrupted index on the master node which anyways required to
continue syncing. But I want slony not to repeat the same thing in its
catalogs. Due to corruption one of the relation has accepted a duplicate
row, however being healthy the sl_log_* has accepted the duplicate row
thats where am worried and it has crashed the replication. Later removing
duplicate entry, slony smartly started without any hiccups thats a
wonderful area am impressed. Let me put into stages from crash to fix.

1. Assume, at this stage Slony replication going fine between master/slave.
2. Now, on master node index got corrupted on one of the replicating table.
(Due to some reason).
3. Master node allowed duplicate record due to corruption.
4. Same record copied in sl_log_* ( which mean two records of same value)
5. On Slave, event fails to apply and SYNC got aborted. (No corrupted pk
index on slave, so it won't allow duplicates here).

This is what happened in my case. When I check master/slave db logs for the
time of sync abort there are only DML entries. But the slony_log was just
showing duplicate key violation error and then aborted. By digging more, I
found the details in sl_log_* for two entries.

How I fixed is ?

6. Stop slon on master/slave
7. Remove duplicate row from master node (Remember slave node will not have
duplicate since the PK index is fine and won't allow)
8. Remove duplicate row from sl_log_*
9. REINDEX table on master.
10. Start slon.


> Slony log tables are capturing INSERT, UPDATE, DELETE and TRUNCATE
> operations. Except for the TRUNCATE, all those operations are captured on
> the row level. These insert operations that led to duplicate keys did
> happen, even though they should not have. But Slony has A) no way of
> detecting that and B) it isn't Slony's duty to second guess the integrity
> of the master database.
>
>
I completely agree. If am not expecting too much here, how about adding
this check in slony to handle such scenarios too ?.

On same corrupted table, I implemented a stupid trick, just to avoid
duplicates in sl_log_* and it worked. Check this out,

postgres=# create unique index isl_log_1 on _rf.sl_log_1(log_cmdargs);
CREATE INDEX
postgres=# create unique index isl_log_2 on _rf.sl_log_2(log_cmdargs);
CREATE INDEX
postgres=# insert into dtest values (1,'D');
INSERT 0 1
postgres=# insert into dtest values (1,'D');
ERROR:  duplicate key value violates unique constraint "isl_log_1"
DETAIL:  Key (log_cmdargs)=({id,1,name,"D         "}) already exists.
CONTEXT:  SQL statement "INSERT INTO _rf.sl_log_1 (log_origin, log_txid,
log_tableid, log_actionseq, log_tablenspname, log_tablerelname,
 log_cmdtype, log_cmdupdncols, log_cmdargs) VALUES (1,
"pg_catalog".txid_current(), $1, nextval('_rf.sl_action_seq'), $2, $3, $4,
$5, $6); "
postgres=# select * from dtest ;
 id |    name
----+------------
  1 | D
(1 row)

postgres=# select * from _rf.sl_log_1 ;
 log_origin | log_txid | log_tableid | log_actionseq | log_tablenspname |
log_tablerelname | log_cm
------------+----------+-------------+---------------+------------------+------------------+-------
          1 |    36882 |           2 |         15007 | public           |
dtest            | I
(1 row)

Not similar, but something control like this will help slony continue even
in case of corrupted index and it warns the user at database level itself.

--Raghav
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.slony.info/pipermail/slony1-general/attachments/20140428/63927e7e/attachment.html 


More information about the Slony1-general mailing list