Peter Davie Peter.Davie
Fri Dec 9 03:44:01 PST 2005
Hi Jan,

Just to add to the nightmare... Even though some queries are
cursor-based, the *postgres process* itself can still run out of memory
when performing queries (I have seen this happen with Slony).

Thanks,
Peter

Jan Wieck wrote:

> On 12/8/2005 9:31 AM, cbbrowne at ca.afilias.info wrote:
>
>>> On 12/7/2005 9:23 PM, Peter Davie wrote:
>>>
>>>> Hi All,
>>>>
>>>> Using Slony1 version 1.1.0 at a customer site, the customer has had the
>>>> slon daemons fall over on one of their slave servers (and didn't
>>>> notice!) On restarting the slon processes, there is now an error being
>>>> generated because it is attempting to malloc memory to record all of the
>>>> outstanding transactions and the slon daemon is running out of memory.
>>>> Is there any way forward to resolve this, or will I just have to
>>>> uninstall the slave and resubscribe (which is my current plan).
>>>
>>>
>>> This node must have been down for quite some time. A SYNC event in the
>>> remote_worker queue takes about 200 bytes or so. How many million events
>>> is this node behind? You could tell from looking at sl_status.
>>>
>>> And don't forget to VACUUM FULL ANALYZE that database after you've
>>> dropped that node.
>>
>>
>> Based on the symptoms, two things come to my mind:
>>
>> 1. Did the slon controlling the origin die? That would be the classic
>> way for a SYNC to encompass a Very Long Period Of Time and hence a LOT of
>> transactions.
>
>
> That's not the case and it wouldn't cause the symptom observed. Unless 
> there are large rows involved, the resulting, humungous sync chunk 
> would just take a while, but since that operation is cursor driven 
> even in 1.0, it won't cause slon to run out of memory.
>
>>
>> There's a script in ~/tools that will generate SYNCs if you run it as a
>> cron job. We run this in production so as to avoid this particular
>> problem...
>>
>> 2. Is it possible that the subscriber is trying to process a whole bunch
>> of SYNCs in one fell group?
>>
>> If you add the "-g 1" option, it'll go one SYNC at a time, which would
>> somewhat alleviate the problem.
>
>
> Would not be a problem either.
>
>
> Jan
>
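
For a rough sense of scale, Jan's figure quoted above (about 200 bytes per
SYNC event in the remote_worker queue) can be turned into a back-of-envelope
estimate. This is only a sketch: the function and constant names are
illustrative, the 200-byte figure is the approximate one from the thread, and
the number of events behind would have to come from the lag reported by
sl_status.

```python
# Rough sizing of a backlog of SYNC events held in slon's remote_worker queue.
# The ~200 bytes per event is the approximate figure quoted in the thread;
# the function and constant names here are illustrative only.

BYTES_PER_SYNC_EVENT = 200  # approximate ("about 200 bytes or so")


def estimate_queue_bytes(events_behind: int,
                         bytes_per_event: int = BYTES_PER_SYNC_EVENT) -> int:
    """Return the approximate memory needed to queue `events_behind` events."""
    if events_behind < 0:
        raise ValueError("events_behind must be non-negative")
    return events_behind * bytes_per_event


if __name__ == "__main__":
    # Sample backlog sizes, chosen arbitrarily for illustration.
    for events in (100_000, 1_000_000, 10_000_000):
        mib = estimate_queue_bytes(events) / (1024 * 1024)
        print(f"{events:>10,} events behind -> about {mib:,.0f} MiB")
```

At that per-event size, it takes a backlog in the millions of events before
the queue alone reaches hundreds of megabytes, which is why the question of
how many million events the node is behind matters.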


-- 
               Relevance... because results count

Relevance                       Phone:  +61 (0)2 6241 6400
A.B.N. 86 063 403 690           Fax:    +61 (0)2 6241 6422
Unit 11,                        Mobile: +61 (0)417 265 175
Cnr Brookes & Heffernan Sts,    E-mail: Peter.Davie at relevance.com.au
Mitchell ACT 2911               Web:    http://www.relevance.com.au
