Tue Jan 24 01:57:11 PST 2012
- Previous message: [Slony1-general] Slave can't catch up, postgres error 'stack depth limit exceeded'
- Next message: [Slony1-general] Slave can't catch up, postgres error 'stack depth limit exceeded'
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 22 January 2012 at 17:16, Steve Singer <steve at ssinger.info> wrote:
> On Sun, 22 Jan 2012, Brian Fehrle wrote:
>
>> Hi all,
>>
>> PostgreSQL 9.1.2
>> Slony 2.1.0
>
> Set max_stack_depth in your postgresql.conf to something higher.
>
> Setting sync_group_maxsize in your slon.conf to something low (i.e. 1 or 2)
> MIGHT help, but I think the default in 2.1 is pretty low anyway (like 20).

The immediate workaround is indeed to increase max_stack_depth (but cap it
at ulimit -s minus 1 MB). Still, shouldn't Slony stay within the default
stack size? Couldn't there be an underlying bug?

>> I am having some trouble getting a slon node caught up on events. It's a
>> larger database, 350 or so gigs, and I added a node to a replication set.
>> While it was doing the initial sync, the server the slon daemons were
>> running on died. It wasn't until about 5 hours later that we got the
>> daemons running on a different node, and it restarted (I assume it
>> restarted) the initial sync.
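As a back-of-the-envelope illustration of that rule of thumb (a sketch only, not code from the thread; the 8 MB fallback for an unlimited stack is an assumption), one could derive a candidate max_stack_depth from the process stack limit:

```python
# Sketch of the "ulimit -s minus 1 MB" rule of thumb for max_stack_depth.
# NOT from Slony or PostgreSQL source; the 8 MB fallback is an assumption.
import resource

soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)  # bytes, or RLIM_INFINITY
if soft == resource.RLIM_INFINITY:
    soft = 8 * 1024 * 1024  # assume a common 8 MB default when unlimited
safe_mb = max(1, soft // (1024 * 1024) - 1)  # leave 1 MB of headroom
print(f"max_stack_depth = {safe_mb}MB")  # candidate value for postgresql.conf
```

PostgreSQL itself rejects a max_stack_depth it considers unsafe relative to the actual stack rlimit, which is why the headroom matters.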
>> From what I can tell, it finished the initial sync; however, it's now
>> unable to catch up due to the following error line (reduced in size; I
>> don't know how many elements there actually were, but the single line
>> had about 18 million characters):
>>
>> 2012-01-22 04:43:07 EST ERROR remoteWorkerThread_1: "declare LOG cursor
>> for select log_origin, log_txid, log_tableid, log_actionseq,
>> log_cmdtype, octet_length(log_cmddata), case when
>> octet_length(log_cmddata) <= 1024 then log_cmddata else null end from
>> "_myslonycluster".sl_log_1 where log_origin = 1 and log_tableid in
>> (2,3,4,5,6,7,1,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122)
>> and log_txid >= '34299501' and log_txid < '34311624' and
>> "pg_catalog".txid_visible_in_snapshot(log_txid, '34311624:34311624:')
>> and ( log_actionseq <> '2474682' and log_actionseq <> '2403310' and
>> log_actionseq <> '2427861' and
>> <SNIP, repeated many thousands of times with different numbers>
>> ' and log_actionseq <> '2520797' and log_actionseq <> '2519348'
>> and log_actionseq <> '2485828' and log_actionseq <> '2523367' and
>> log_actionseq <> '2469096' and log_actionseq <> '2520589' and
>> log_actionseq <> '2414071' and log_actionseq <> '2391417' ) order by
>> log_actionseq" PGRES_FATAL_ERROR ERROR: stack depth limit exceeded
>>
>> I found someone with a similar(ish) issue back in the day, and a
>> function called compress_actionseq was mentioned. I turned debugging up
>> to level 4 and can see that it is indeed compressing the actionseq; I
>> looked at the code, and it looks like the output above IS the
>> compressed sequence.
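To see why that predicate blows the parser stack, note that it enumerates every excluded sequence as a separate AND'ed term. A toy illustration (NOT Slony's actual compress_actionseq logic, just a sketch of the range-folding idea) of how consecutive values could collapse into far fewer terms:

```python
# Toy sketch (not Slony's compress_actionseq): fold runs of consecutive
# action sequences into ranges, so one "not between" term can replace
# thousands of AND'ed "<>" terms -- the terms that overflow the stack.
def compress_ranges(seqs):
    """Collapse a collection of ints into sorted (lo, hi) consecutive runs."""
    runs = []
    for s in sorted(seqs):
        if runs and s == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], s)   # extend the current run
        else:
            runs.append((s, s))           # start a new run
    return runs

def exclusion_predicate(seqs):
    """Render runs as a hypothetical SQL exclusion clause."""
    parts = []
    for lo, hi in compress_ranges(seqs):
        if lo == hi:
            parts.append(f"log_actionseq <> '{lo}'")
        else:
            parts.append(f"log_actionseq not between '{lo}' and '{hi}'")
    return " and ".join(parts)

# Three consecutive values fold into a single "not between" term:
print(exclusion_predicate({2403310, 2403311, 2403312, 2474682}))
```

With 18 million characters of mostly scattered (non-consecutive) sequence numbers, such folding gains little, which would explain why the "compressed" output still exceeded the stack limit.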
>> Now, this seems to be a tricky setting to tweak in Postgres, so I'd
>> rather not unless I have to. My thought was to force Slony to do smaller
>> syncs at a time. I tried reducing (and, for the heck of it, increasing)
>> the group size, desired_sync_time, sync_max_rowsize, and
>> sync_max_largemem, but nothing altered the size of the query being
>> executed on the database.
>>
>> Any thoughts or suggestions? The initial Slony sync takes about 14
>> hours, so I'd rather not drop the node and re-attach it. In fact, I have
>> two nodes with the same issue, stuck at the same event, so I'd rather
>> get them both synced up without another initial sync.
>>
>> I also toyed with the idea of forcing the slon daemon to sync only up to
>> a specific event, hoping to do blocks of, say, 500 events; however, the
>> quit_sync_finalsync parameter is not accepted correctly by Slony 2.1.0
>> (I've submitted an email to this list about this too).
>>
>> Thanks in advance,
>> - Brian F
>> _______________________________________________
>> Slony1-general mailing list
>> Slony1-general at lists.slony.info
>> http://lists.slony.info/mailman/listinfo/slony1-general

--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Development, Expertise and Training
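For reference, the slon.conf knobs mentioned in the thread look like this when set together. The values below are purely illustrative (and, per Brian's report, none of them changed the size of the generated query):

```
# slon.conf fragment -- parameter names from the thread; values
# illustrative only, see the Slony 2.1 documentation for semantics
sync_group_maxsize=2
desired_sync_time=60000
sync_max_rowsize=8192
sync_max_largemem=5242880
```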