Christopher Browne cbbrowne
Tue May 30 14:31:07 PDT 2006
Wayne Conrad <wconrad at yagni.com> writes:
> Hello,
>
> I'm making slony die, but I don't know why.
>
> I've got a roughly 60GB database that I want to replicate.
> Partway through the replication, the slony log shows that postgres
> gave a "stack depth limit exceeded" error.  The query that errors out
> has a "where log_actionseq not in (...)" clause with about 320,000
> numbers in it, which is what is making postgres cranky.
>
> Debian testing
> Slony1 1.1.5-1
> Postgresql 8.1.3-4
>
> The database is 55GB and around 60M rows.  It is being written to
> during the replication, to the tune of perhaps 300MB / 300K-rows per
> day.
>
> Replication had taken place for many hours before this error occurred.
>
> Here's the error:
>
> 2006-05-21 06:41:53 MST ERROR  remoteWorkerThread_1: "declare LOG cursor for select     log_origin, log_xid, log_tableid,     log_actionseq, log_cmdtype, log_cmddata from "_wayne_production".sl_log_1 where log_origin = 1 and (  (
>  log_tableid in (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68)
>     and (log_xid < '15254334' and "_wayne_production".xxid_lt_snapshot(log_xid, '15253431:15254334:''15254330'',''15254332'',''15253431'',''15254214'',''15254270'''))
>     and (log_xid >= '14383966' and "_wayne_production".xxid_ge_snapshot(log_xid, '14383966:15038780:''14383966''')) and log_actionseq not in ('1729615','1729616', ... approx 320,000 numbers ... ,'2048151','2048152')
> ) ) order by log_actionseq; " PGRES_FATAL_ERROR ERROR:  stack depth limit exceeded
>
> Thanks for any help you can offer.  If you need more clues or
> experiments, just ask.

Hmm.  I added a function a while back that will tend to shorten those
lists to minuscule size...

----------------------------------------------------------------------------
Revision 1.91: download - view: text, markup, annotated - select for diffs
Wed Oct 26 21:45:52 2005 UTC (7 months ago) by cbbrowne
Branches: MAIN
Diff to: previous 1.90: preferred, colored
Changes since revision 1.90: +231 -6 lines

On first SYNC, need to compress log_actionseq not in (possibly huge list)
into a set of "not (log_actionseq between a and b or log_actionseq between c and d ...)"

If the SUBSCRIBE_SET event runs for a very long time, there can be a whole lot
of events on the go between the time it starts and that first SYNC.  If so,
the list of log_actionseq values can run to thousands of items, if not
orders of magnitude more.

compress_actionseq() takes the "huge list" (a string of comma delimited values)
and has a state machine to parse out the numbers, in order.

Each number is compared to the current 'range'...
- If it's inside or extends the current range, we continue building a "between a and b" clause
- If it stands alone, it's a singleton, which adds a "log_actionseq <> x" clause

What generally happens if the event runs for a long time is that there will
be some really enormous sequences of consecutive values.  Happily, the
new function shrinks that into "between a and b" (where b = a + 50000), which
means we have a query that's 4K in size rather than 40MB in size.

Hannu Krosing added logic at the end that handles the case where there
was NOTHING in the set of log_actionseq values.
----------------------------------------------------------------------------
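For illustration, here's a rough Python sketch of the range-compression idea
described above (hypothetical names; Slony's real compress_actionseq() is C
and walks the raw comma-delimited string with a state machine rather than
materializing a list):

```python
def compress_actionseq(csv_list, column="log_actionseq"):
    """Compress a comma-delimited list of sequence numbers into a short
    SQL exclusion predicate: consecutive runs collapse into a single
    "not (... between a and b ...)" term, lone values become "<> x".

    Sketch only; not Slony's actual C implementation.
    """
    values = sorted(int(v) for v in csv_list.split(",") if v.strip())
    if not values:
        return "true"  # nothing to exclude (the empty-set case)

    # Collapse consecutive runs of values into (start, end) ranges.
    ranges = []
    start = end = values[0]
    for v in values[1:]:
        if v == end + 1:
            end = v                      # value extends the current run
        else:
            ranges.append((start, end))  # run broken; start a new one
            start = end = v
    ranges.append((start, end))

    betweens = [f"{column} between '{a}' and '{b}'"
                for a, b in ranges if a != b]
    singles = [f"{column} <> '{a}'" for a, b in ranges if a == b]

    parts = []
    if betweens:
        parts.append("not (" + " or ".join(betweens) + ")")
    parts.extend(singles)
    return " and ".join(parts)
```

So a 320,000-item NOT IN list that is mostly one consecutive run shrinks to
roughly "not (log_actionseq between '1729615' and '2048152')" plus a handful
of singleton terms, which is why the query drops from ~40MB to a few KB.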

You'll find that this helps cut the log_actionseq clause down to size...

OK, that's in HEAD, and NOT in 1.1.5.

The 1.91 patch might apply against 1.1.5, maybehaps...
-- 
output = ("cbbrowne" "@" "ca.afilias.info")
<http://dba2.int.libertyrms.com/>
Christopher Browne
(416) 673-4124 (land)


