Bug 76 - repair config can fail
Summary: repair config can fail
Status: RESOLVED FIXED
Alias: None
Product: Slony-I
Classification: Unclassified
Component: stored procedures (show other bugs)
Version: 1.2
Hardware: PC Linux
: medium normal
Assignee: Slony Bugs List
URL:
Depends on:
Blocks:
 
Reported: 2009-03-10 23:52 UTC by Stuart Bishop
Modified: 2009-04-06 15:33 UTC (History)
1 user (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Stuart Bishop 2009-03-10 23:52:52 UTC
I managed to trigger a failure in repair config, running repair config against a database freshly restored from a pg_dump of a master node.

$ slonik << EOM
> cluster name = sl;
> node 1 admin conninfo = 'dbname=lpmain_staging_new user=slony';
> repair config (set id=1, event node=1, execute only on=1);
> EOM
<stdin>:3: PGRES_FATAL_ERROR select "_sl".updateReloid(1, 1);  -
ERROR:  duplicate key value violates unique constraint
"sl_table_tab_reloid_key"
CONTEXT:  SQL statement "update "_sl".sl_table set tab_reloid =
PGC.oid from pg_catalog.pg_class PGC, pg_catalog.pg_namespace PGN
where "_sl".slon_quote_brute("_sl".sl_table.tab_relname) =
"_sl".slon_quote_brute(PGC.relname) and PGC.relnamespace = PGN.oid and
"_sl".slon_quote_brute(PGN.nspname) =
"_sl".slon_quote_brute("_sl".sl_table.tab_nspname)"
PL/pgSQL function "updatereloid" line 39 at SQL statement


This is easy to replicate by swapping to tab_reloids:

dev=# select tab_id,tab_reloid from sl_table where tab_id in (200,201);
 tab_id | tab_reloid 
--------+------------
    200 |    6672479
    201 |    6672488
(2 rows)

dev=# update sl_table set tab_reloid=42 where tab_id=200;
UPDATE 1
dev=# update sl_table set tab_reloid=6672479 where tab_id=201;
UPDATE 1
dev=# update sl_table set tab_reloid=6672488 where tab_id=200;
UPDATE 1
dev=# \q

$ slonik << EOM
> cluster name = sl;
> node 1 admin conninfo = 'dbname=dev user=slony';
> repair config (set id=1, event node=1, execute only on=1);
> EOM
<stdin>:3: PGRES_FATAL_ERROR select "_sl".updateReloid(1, 1);  - ERROR:  duplicate key value violates unique constraint "sl_table_tab_reloid_key"
CONTEXT:  SQL statement "update "_sl".sl_table set tab_reloid = PGC.oid from pg_catalog.pg_class PGC, pg_catalog.pg_namespace PGN where "_sl".slon_quote_brute("_sl".sl_table.tab_relname) = "_sl".slon_quote_brute(PGC.relname) and PGC.relnamespace = PGN.oid and "_sl".slon_quote_brute(PGN.nspname) = "_sl".slon_quote_brute("_sl".sl_table.tab_nspname)"
PL/pgSQL function "updatereloid" line 39 at SQL statement


Populating the tab_reloid with (guaranteed) gibberish at the start of the repair config would solve the problem.

A similar issue probably occurs with sequences.
Comment 1 Christopher Browne 2009-04-06 15:33:44 UTC
Quite correct...

Change committed to CVS including a regression test that makes sure that REPAIR CONFIG gets run.