Brian Hirt bhirt at mobygames.com
Thu Nov 29 14:15:35 PST 2007
Hello,

Out of the blue, I started getting an error from a replicated
master/slave database: a sync is failing because of a primary key
violation.

2007-11-21 00:27:51 CST DEBUG1 cleanupThread:    0.227 seconds for  
delete logs
2007-11-21 00:30:11 CST ERROR  remoteWorkerThread_1: "insert into
"public"."moby_user_ranking"
(moby_user_id,rank,"position",contribution_rating,is_tie) values
('30979','1','1','69091','f');
..
..
a bunch of repeating similar commands for the same table.
..
..
" ERROR:  duplicate key violates unique constraint  
"moby_user_ranking_pkey"
  - qualification was: where log_origin = 1 and (  (
  log_tableid in  
(46,52,1,2,3,212,217,218,228,229,238,239,255,259,260,281,296,5,6,7,8,10, 
11,12,14,15,16,17,19,20,22,23,24,25,18,26,27,28,29,30,31,32,33,34,35,36, 
37,9,38,39,40,41,42,43,44,45,47,48,49,50,51,53,54,55,56,57,58,59,60,62,
63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,80,81,83,85,86,87,88,89,90,
91,92,93,94,95,98,99,100,101,102,103,104,97,105,106,107,108,109,110,111,
112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,
130,131,132,133,135,136,137,138,139,140,141,142,143,144,145,146,147,148,
149,150,151,153,155,156,152,154,157,158,159,160,13,161,162,163,164,165,
167,168,169,170,4,171,172,174,61,173,175,176,177,78,79,82,84,178,179,180,
181,182,183,184,21,96,166,185,186,187,188,189,190,192,134,191,193,194,195,
196,197,198,199,200,201,203,202,204,205,206,207,208,209,210,211,213,214, 
215,216,219,220,221,222,223,224,225,226,227,230,231,232,233,234,235,236, 
237,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,256,257, 
258,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277, 
278,279,280,282,283,284,285,286,287,288,289,290,291,292,293,294,295,297, 
298,299,300,301,302,303,304,305,306,307,308,309,310,311)    and  
(log_xid < '66609974' and "_mobycluster".xxid_lt_snapshot(log_xid,  
'66609954:66609974:''66609954'''))    and (log_xid >= '66609811' and  
"_mobycluster".xxid_ge_snapshot(log_xid,  
'66609811:66609819:''66609811'',''66609817'''))) )
2007-11-21 00:30:11 CST ERROR  remoteWorkerThread_1: helper 1  
finished with error
2007-11-21 00:30:11 CST ERROR  remoteWorkerThread_1: SYNC aborted

I figured I'd drop the table from the set using: SET DROP TABLE
( ORIGIN = 1, ID = 18).  However, this did not help, which is why I am
sending this request for help.  I don't know why the original problem
started happening.  Replication had only been running for about 14
hours before this problem happened.  I successfully switched over
twice while replication was running so I could do maintenance on the
database servers.  Other than that, no schema changes or
application changes were made.

The master no longer shows the dropped table in sl_table, but the
slave still does.  The denyaccess trigger is still on the slave
too.  Sync events are still failing because sl_log_2 still has all
the un-replicated data for table ID 18.  I'd rather find the cause
of the problem than hack things to get them working, but at this point
I think I'm shortly going to be forced to drop the whole set, ditch
replication, or at least start over.  Does anyone have any ideas on
what might cause Slony to get into a state like this?

I'm not intimately familiar with Slony, but it seems like there should be
a better way to drop a table from a set in situations like this.
Maybe an option that clears pending updates from the log?

Anyway, I hope people have some suggestions on how to proceed.

Thanks,



Brian Hirt
bhirt at mobygames.com




