Mon Oct 23 15:27:30 PDT 2006
- Previous message: [Slony1-general] log switchover not completing with 1.2.0
- Next message: [Slony1-general] log switchover not completing with 1.2.0
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
paul cannon wrote: > Our installation of slony 1.2.0 is struggling. > > It started trying to do a log switchover several days ago. Since then, > every 10-20 minutes we get this in the logs: > > NOTICE: Slony-I: log switch to sl_log_2 still in progress - sl_log_1 not truncated > > Between each time, it looks like it's doing a DELETE on both logs and a > VACUUM on sl_log_1. > > sl_log_status on the master node is 3, which from the code means > everything should be going into sl_log_2, right? And yet /both/ logs are > continually growing, aside from the periodic DELETEs. > > What's going on? Is this expected behavior? The way I understand it, > only one log should be growing and the other should be getting cleaned > out as the events in it propogate and are deleted. > > I'm hoping that eventually a log switchover will be able to complete, > but it doesn't look like that's going to happen as long as things are > still getting put into the inactive log. > > There aren't any long-running transactions on any of the nodes. > > Ideas? > > I ran into the problem at > http://gborg.postgresql.org/project/slony1/bugs/bugupdate.php?1591, > before applying my patch as shown there. Could that have messed up the > log switch process and left it only partially switched somehow? > > A simpler patch is thus: cbbrowne at dba2:Slony-I/slony1_1_2_STABLE/src/slon> cvs diff -u remote_worker.c Monday 22:14:36 Index: remote_worker.c =================================================================== RCS file: /usr/local/cvsroot/slony1/slony1-engine/src/slon/remote_worker.c,v retrieving revision 1.124.2.1 diff -c -u -r1.124.2.1 remote_worker.c cvs diff: conflicting specifications of output style --- remote_worker.c 23 Oct 2006 21:57:01 -0000 1.124.2.1 +++ remote_worker.c 23 Oct 2006 22:14:44 -0000 @@ -5508,8 +5508,15 @@ "from %s.sl_log_1 " "where log_origin = '%s' " " and log_xid = '%s' " + " and log_actionseq = '%s' UNION ALL" + "select log_cmddata " + "from %s.sl_log_2 " + "where log_origin = '%s' " + " and log_xid = '%s' " " and log_actionseq = '%s'", rtcfg_namespace, + log_origin, log_xid, log_actionseq, + rtcfg_namespace, log_origin, log_xid, log_actionseq); res2 = PQexec(dbconn, dstring_data(&query2)); if (PQresultStatus(res2) != PGRES_TUPLES_OK) That has the merit of not requiring any variables to communicate which log is which :-). Jan may come out with something else... That data is going to both log tables seems rather curious. According to the logic in slony1_funcs.c: switch(log_status) { case 0: case 2: cs->plan_active_log = cs->plan_insert_log_1; break; case 1: case 3: cs->plan_active_log = cs->plan_insert_log_2; break; default: elog(ERROR, "Slony-I: illegal log status %d", log_status); break; } Based on that, you should indeed only be adding tuples to sl_log_2. Corresponding logic in logswitch_final: -- ---- -- status = 3: sl_log_2 active, cleanup sl_log_1 -- ---- if v_current_status = 3 then -- ---- -- The cleanup thread calls us after it did the delete and -- vacuum of both log tables. If sl_log_2 is empty now, we -- can truncate it and the log switch is done. -- ---- for v_dummy in select 1 from @NAMESPACE at .sl_log_1 loop -- ---- -- Found a row ... log switch is still in progress. -- ---- raise notice ''Slony-I: log switch to sl_log_2 still in progress - sl_log_1 not truncated''; return -1; end loop; raise notice ''Slony-I: log switch to sl_log_2 complete - truncate sl_log_1''; truncate @NAMESPACE at .sl_log_1; perform "pg_catalog".setval(''@NAMESPACE at .sl_log_status'', 1); -- Run addPartialLogIndices() to try to add indices to unused sl_log_? table perform @NAMESPACE at .addPartialLogIndices(); return 2; end if; The reason you should get the log message you're getting is if there are still live tuples in sl_log_1. That's curious; they shouldn't be getting added in by anything (notably if that sequence value is set to 3). Seems to warrant looking at what's in sl_log_1 to see if it can get associated with SYNCs to see why old entries maybe aren't getting cleaned out.
- Previous message: [Slony1-general] log switchover not completing with 1.2.0
- Next message: [Slony1-general] log switchover not completing with 1.2.0
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the Slony1-general mailing list