Bug 217 - Changing the primary key of a replicated table leads to trouble
Summary: Changing the primary key of a replicated table leads to trouble
Status: RESOLVED FIXED
Alias: None
Product: Slony-I
Classification: Unclassified
Component: trigger SPI (show other bugs)
Version: 2.0
Hardware: PC Linux
: high critical
Assignee: Steve Singer
URL:
Depends on:
Blocks:
 
Reported: 2011-05-27 12:33 UTC by Steve Singer
Modified: 2011-06-16 10:19 UTC (History)
1 user (show)

See Also:


Attachments
A function to recreate the trigger with the arguments for the new primary key (1.95 KB, patch)
2011-05-30 08:08 UTC, Steve Singer
Details
patch to have execute script automatically fix tables (3.91 KB, patch)
2011-05-31 08:18 UTC, Steve Singer
Details
updates to the patch for 9.0 support (5.83 KB, patch)
2011-06-03 11:58 UTC, Steve Singer
Details
Add PK test to the set of regression tests typically run (1.52 KB, patch)
2011-06-08 15:05 UTC, Christopher Browne
Details
replace C function to decode trigger arguments with string matching (3.57 KB, patch)
2011-06-10 06:39 UTC, Steve Singer
Details
replace C function to decode trigger arguments (updated) (3.56 KB, patch)
2011-06-10 07:52 UTC, Steve Singer
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Steve Singer 2011-05-27 12:33:04 UTC
This bug has been confirmed in slony 2.1.0 but I would expect it to show up in 2.0.x but not 1.2 or earlier.

Say you have a table test1
create table test1 (
       id1 int4 not null,
       id2 int4 not null,
       id3 int4,
       data text
);
create unique index test1_pkey on test1(id1);

insert into test1(id1,id2,id3,data) values (1,1,1,'test');
insert into test1(id1,id2,id3,data) values (2,2,1,'test2');

and you replicate it specifying the unique index pkey1 on the 'set add table..' line.

Then you execute the following script through EXECUTE SCRIPT

drop index test1_pkey;
create unique index test1_pkey on test1(id1,id2);
update test1 set id1=1;

Then on the master you do 
update test1 set data='bar' where id2=2

The update matches 1 row on the master.  However when it runs on the slave slony thinks the primary key is (id1) so it only includes id1 in the where clause.   This means that two rows will be changed on the slave.

The trigger definition arguments contains 'k' initially, and we have no way of updating it to 'kk',  nor does the trigger (today) have any way of knowing (at runtime) that the index has changed.
Comment 1 Christopher Browne 2011-05-27 13:31:22 UTC
I see that a test case for this has been added to HEAD...

http://git.postgresql.org/gitweb?p=slony1-engine.git;a=commit;h=1ff9059f1a25ddc58e7ac729eb5c2ef822ede8cf
Comment 2 Christopher Browne 2011-05-27 13:34:48 UTC
A solution strategy for this may be to create a function that rummages through all replicated tables, checking to see if the triggers have 'kvvkvk' values that correspond to the primary key for each respective table, fixing that, if it should fail to match.

Such a function would be run by EXECUTE SCRIPT, automatically.

It could presumably also be run manually, against each node, if someone has decided to mess with the tables by hand, via psql.
Comment 3 Steve Singer 2011-05-30 07:29:01 UTC
The following query will detect if the primary key/unique index has changed since the log trigger was created.

select  tab_relname from _test.sl_table, pg_trigger where tab_reloid=tgrelid and _test.determineAttKindUnique(tab_nspname||'.'||tab_relname,tab_idxname)!=(string_to_array(tgargs::text,E'\\000'::text))[3];

(where _test is the slony schema name, replace with your slony schema name)
Comment 4 Steve Singer 2011-05-30 08:08:05 UTC
Created attachment 100 [details]
A function to recreate the trigger with the arguments for the new primary key
Comment 5 Steve Singer 2011-05-30 08:16:25 UTC
I have now attached

a) A query to detect tables with modified indexes/pkeys
b) A function to drop the trigger and re-add it with the proper/new arguments (this takes an exclusive lock on the table)

We can modify EXECUTE SCRIPT to check all replicated tables and fix them if they require fixing.  Another approach would be to have EXECUTE SCRIPT only fix tables that the current transaction already has an exclusive lock on.  If the SQL script submitted to execute script actually changed the primary key/constraint on the table then the transaction would already hold that lock.   This condition would mean that EXECUTE SCRIPT still won't obtain any exclusive locks in addition to what the SQL script already has.  The downside is that if a table has the wrong trigger arguments prior to EXECUTE SCRIPT being called (ie the primary key was changed outside of EXECUTE SCRIPT and recreate_log_trigger wasn't called) then the error will remain.
Comment 6 Christopher Browne 2011-05-30 08:45:00 UTC
There exist arguments both ways:

1.  Only running this against tables that already have locks taken out means we're not escalating locks, and we can indeed be confident that any tables that are being modified in *THIS* DDL script will already be locked.

2.  If we don't fix all tables that need fixing, then we're leaving a broken state in place.  The DDL request may demand some locks that it didn't already have, but only against tables that probably can't replicate properly.  Doesn't that seem like a high priority fix???

I'd be inclined to fix all tables that need fixing, otherwise, replication remains broken for those tables.
Comment 7 Steve Singer 2011-05-30 12:27:35 UTC
(In reply to comment #3)
> The following query will detect if the primary key/unique index has changed
> since the log trigger was created.
> 
> select  tab_relname from _test.sl_table, pg_trigger where tab_reloid=tgrelid
> and
> _test.determineAttKindUnique(tab_nspname||'.'||tab_relname,tab_idxname)!=(string_to_array(tgargs::text,E'\\000'::text))[3];
> 
> (where _test is the slony schema name, replace with your slony schema name)

A better version of the query is

select  tab_relname from _test.sl_table, pg_trigger where tab_reloid=tgrelid
and
_test.determineAttKindUnique(tab_nspname||'.'||tab_relname,tab_idxname)!=(string_to_array(tgargs::text,E'\\000'::text))[3] and tgname='_test_logtrigger';

(Note the and tgname='....' to only include the log triggers.)
Comment 8 Steve Singer 2011-05-31 08:18:34 UTC
Created attachment 101 [details]
patch to have execute script automatically fix tables

This patch to be applied ontop of the previous one will have EXECUTE SCRIPT automatically repair logtriggers (that need it) that are already locked by the script.  (this is option (b) in the discussion).  It is a one line change to the patch to make it implement option (a).   The repair_log_triggers function(true/false) could also be called manually by users.
Comment 9 Steve Singer 2011-06-02 13:52:01 UTC
These patches do not work with 9.0

tgargs in pg_trigger is now stored in a different format.

Does anyone know how to decode this? 
In a way that works in all versions of PG?
Comment 10 Steve Singer 2011-06-03 11:58:31 UTC
Created attachment 102 [details]
updates to the patch for 9.0 support

This update to the patch adds a C function that will decode tgargs in 9.0.

It will also print a warning on tables that need repair but are skipped because
no lock was previously obtained on the table.
Comment 11 Christopher Browne 2011-06-08 15:05:10 UTC
Created attachment 105 [details]
Add PK test to the set of regression tests typically run

Here's an additional patch to add in the additional regression test.

FYI, it ran fine for me against 2.0, based on your branch, bug217_20.
Comment 12 Christopher Browne 2011-06-09 09:38:42 UTC
It scares me that decoding trigger arguments required an SPI function.

Actually, poking at pg_trigger suggests that the problem isn't at the Slony level...  I wonder who came up with the notion that the following was reasonable...

slonyregress1@localhost->  select tgnargs, tgargs from pg_trigger where tgnargs > 0;
 tgnargs |                           tgargs
---------+------------------------------------------------------------
       3 | \x5f736c6f6e795f72656772657373310031006b00
       1 | \x5f736c6f6e795f726567726573733100
       3 | \x5f736c6f6e795f72656772657373310032006b00
       1 | \x5f736c6f6e795f726567726573733100
       3 | \x5f736c6f6e795f72656772657373310034006b00
       1 | \x5f736c6f6e795f726567726573733100
       3 | \x5f736c6f6e795f72656772657373310035006b76766b767676766b00
       1 | \x5f736c6f6e795f726567726573733100
(8 rows)

I'd be inclined to feed some griping about this representation up to pgsql-hackers, in the hopes of getting something nicer, maybe?

Not that that's too likely for 9.1 :-)
Comment 13 Jan Wieck 2011-06-09 12:04:01 UTC
Maybe we aren't supposed to query pg_trigger, but query the information_schema.triggers view instead?


Jan
Comment 14 Steve Singer 2011-06-09 12:37:08 UTC
(In reply to comment #13)
> Maybe we aren't supposed to query pg_trigger, but query the
> information_schema.triggers view instead?
> 
> 
> Jan

The information schema gives rows like


 EXECUTE PROCEDURE _slonyregress.logtrigger('_slonyregress', '1', 'k')


we could

a) build up a similar line in the query in the plpgsql function and do a text comparision
b) Do regex matching to extract the 'k' from the arguments
c) use the C function like the patch proposes.

I dislike (b).  I'm not opposed to (a), but comparing just the trigger arguments seems more direct to me, even if it involves a C function to get them
Comment 15 Jan Wieck 2011-06-09 13:52:25 UTC
I would favor a). Although it will lead to a rather cryptic and non-intuitive behavior should the information_schema.triggers view ever change the representation (it didn't since 8.3 and I didn't bother checking earlier versions). What it would tell is that all the unlocked tables have changed but it won't do anything about it. Then again any regexp implementation could behave just the same, so there is not much to gain from that.


Jan
Comment 16 Christopher Browne 2011-06-09 14:29:56 UTC
(In reply to comment #15)
> I would favor a). Although it will lead to a rather cryptic and non-intuitive
> behavior should the information_schema.triggers view ever change the
> representation (it didn't since 8.3 and I didn't bother checking earlier
> versions). What it would tell is that all the unlocked tables have changed but
> it won't do anything about it. Then again any regexp implementation could
> behave just the same, so there is not much to gain from that.

I suggest that if we guard ourselves by having a decent regression test, so that if the view changes, we're quite likely to notice, then yes, a) seems better to me.
Comment 17 Steve Singer 2011-06-10 06:39:39 UTC
Created attachment 108 [details]
replace C function to decode trigger arguments with string matching

This patch to be appplied ontop of the others will remove the C function that decodes the trigger arguments and replaces it with string matching code against information_schema.triggers.

I feel that the C function is cleaner and more future proof.  To get this patch to work I had to be careful about matching spaces and quotes in the function with the format the information schema returns.  I have not yet done a lot of testing and suspect that this version of the patch might break with mixed case slony clusternames (if so that should be fixable).

The C function works and should continue to work with future postgres versions because it uses the same infrastructure to decode tgargs as postgersql uses when it invokes the trigger.  With the string matching approach we are depending on a unit test to tell us that something suttle has changed in postgresql after the fact.
Comment 18 Steve Singer 2011-06-10 07:52:22 UTC
Created attachment 109 [details]
replace C function to decode trigger arguments (updated)

Updating patch to not print extra warning messages + deal better with mixed case slony schemas
Comment 19 Steve Singer 2011-06-14 06:29:03 UTC
(In reply to comment #18)
> Created an attachment (id=109) [details]
> replace C function to decode trigger arguments (updated)
> 
> Updating patch to not print extra warning messages + deal better with mixed
> case slony schemas

It turns out that the fixes to make strings match with mixed case breaks all lower case schema names.

This is another example of why I think the string matching approach is a bad idea.
If you disagree with me then take a look at this then spend some time trying to fix the patch so it works in both with the various versions of Pg.

Otherwise I think we should go back to the C function for decoding these trigger arguments.
Comment 20 Steve Singer 2011-06-16 10:19:14 UTC
The approach with the c function has been committed to both REL_2_0_STABLE for inclusion in 2.0.7.rc2 and master.

http://git.postgresql.org/gitweb?p=slony1-engine.git;a=commit;h=663dc7dc36ca6329d50b5f8bc91faac6932e5cb8
http://git.postgresql.org/gitweb?p=slony1-engine.git;a=commit;h=db8af76a66bc79867ec4845ecf40192cf05c85d6