Cyril SCETBON cscetbon.ext at orange-ftgroup.com
Wed Sep 19 10:36:31 PDT 2007

Jan Wieck wrote:
> On 9/11/2007 2:49 AM, Cyril SCETBON wrote:
>>
>> Jan Wieck wrote:
>>> On 9/10/2007 4:33 PM, Cyril SCETBON wrote:
>>>>
>>>> Cyril SCETBON wrote:
>>>>>
>>>>>
>>>>> Jan Wieck wrote:
>>>>>> On 9/7/2007 9:36 AM, Cyril SCETBON wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I got this configuration:
>>>>>>>
>>>>>>>     Node1 --> Node2 (5 seconds late)
>>>>>>>                 |
>>>>>>>                 +--> Node3 (2 hours late)
>>>>>>>
>>>>>>> Node2 is processing each SYNC from Node1 and Node3, but Node3 is 
>>>>>>> processing each SYNC from Node2 and almost none from Node1, which 
>>>>>>> is the origin of the sets:
>>>>>>>
>>>>>>> On Node3 we see  `grep processing 
>>>>>>> /var/log/slony1/node3-pns_profiles_preprod.log|awk '{print 
>>>>>>> $5}'|sort|uniq -c`
>>>>>>>      19 remoteWorkerThread_1:
>>>>>>>     963 remoteWorkerThread_2:
>>>>>>>
>>>>>>> On Node2 we see `grep processing 
>>>>>>> /var/log/slony1/node2-pns_profiles_preprod.log |awk '{print 
>>>>>>> $5}'|sort|uniq -c`
>>>>>>>    1570 remoteWorkerThread_1:
>>>>>>>     865 remoteWorkerThread_3:
>>>>>>>
>>>>>>> Why are there so many SYNCs not processed on Node3?
>>>>>>>
>>>>>>> Node3 got 22440 'queue event' and only 25 'Received event' entries 
>>>>>>> from remoteWorkerThread_1, while Node2 got 4467 'queue event' and 
>>>>>>> 1578 'Received event' entries from the same worker.
>>>>>>>
>>>>>>> Is there something I can do?
>>>>>>
>>>>>> How about looking for some error messages?
>>>>> None.
>>>> I've put slon in debug level 2
>>>>>>
>>>>>> What comes to mind would be that sl_event is grossly out of shape 
>>>>>> and that the event selection times out.
>>>>> It seems vacuuming sl_log_1 takes too much time because of 
>>>>> vacuum_cost_delay, and that selecting from this table uses a seq 
>>>>> scan. I'm investigating.
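
For the record, here is roughly how I checked the vacuum and the seq 
scan, assuming the cluster schema is _pns_profiles_preprod (adjust the 
database and schema names to your setup; the real query slon runs is 
more complex, but this shows the scan type):

    # does the planner pick a sequential scan on sl_log_1 ?
    psql -d pns_profiles_preprod \
         -c "EXPLAIN SELECT * FROM _pns_profiles_preprod.sl_log_1 WHERE log_origin = 1;"

    # vacuum it without the cost-based delay slowing it down
    echo "SET vacuum_cost_delay = 0; VACUUM ANALYZE _pns_profiles_preprod.sl_log_1;" \
        | psql -d pns_profiles_preprod
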
>>>> I forced vacuum to go faster and checked the slon logs of the 
>>>> subscribers. They have similar disk capabilities, which seem to be 
>>>> the bottleneck on all nodes (wait I/O ~= 50% in vmstat).
>>>>
>>>> I found the replication task times are different:
>>>>
>>>> On node 3 :
>>>>                      delay in seconds = 585.974ms
>>>>                      cleanupEvent in seconds = 9.25167s
>>>>
>>>> On node 2 :
>>>>                      delay in seconds = 37.6463ms
>>>>                      cleanupEvent in seconds = 0.203265s
>>>>
>>>> Could these times explain why node 3 is late compared to node 2? 
>>>> What do you think I should investigate now?
>>>
>>> Considering that node 2 can pretty well keep up but node 3 is 
>>> falling way behind, the problem cannot be caused by node 1. Neither 
>>> can it be caused by the event selection of node 3, so that leaves us 
>>> with either the log selection done by node 3 against the data 
>>> provider node 2, or the actual speed of node 3 itself.
>>>
>>> In debug level 2, what does node 3's slon usually report as "delay 
>>> for first row" when processing SYNC events?
>> that's what I gave as 'delay in seconds' above
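
For completeness, the two figures above are pulled out with something 
like this, assuming those lines appear verbatim in the log I quoted 
them from:

    grep -E 'delay in seconds|cleanupEvent in seconds' \
         /var/log/slony1/node3-pns_profiles_preprod.log | tail
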
>
> OK, so the origin can provide log rows almost instantaneously, while 
> node 2 has apparently some issues with the same. Although half a 
> second isn't a catastrophe, it indicates that there are some 
> performance issues handling the overall workload already on that system.
>
> Now when it comes to node 3, this means that it is not doing any 
> actual replication work for 500 ms per sync group, which should not 
> pose a real problem. So my guess is that node 3 is simply too slow to 
> keep up with the write load of the origin, or that the network 
> connection is too slow to actually deliver the log data fast enough. 
> If this is a WAN connection (which by itself can explain 500ms for the 
> first FETCH of 100 log rows), you might want to try using an ssh 
> tunnel with compression.
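For reference, the compressed tunnel I already use between node 3 and 
its provider looks roughly like this (user name and local port are just 
examples):

    # on node 3: compressed ssh tunnel to PostgreSQL on the provider (node 2)
    ssh -N -C -L 5433:localhost:5432 slony@node2 &

    # the path from node 3 to node 2 (the STORE PATH conninfo) then points
    # at host=localhost port=5433 instead of node2 directly
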
Even with that SSH compression it's not better. On the provider there 
are about 400 writes/s; do you think it would be worth increasing 
SLON_DATA_FETCH_SIZE to 500 or 1000 in the remote_worker to improve 
performance? Network latency is clearly a factor (18 ms for a ping to 
the other geographical site vs 0.2 ms within the same site).
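
If bumping SLON_DATA_FETCH_SIZE is a reasonable thing to try, I would 
do something like this (untested; assuming the define still lives in 
src/slon/remote_worker.c with a default of 100, as in my source tree, 
and knowing it is a compile-time constant, so slon has to be rebuilt 
and restarted):

    # raise the number of log rows fetched per FETCH from 100 to 500
    sed -i 's/#define SLON_DATA_FETCH_SIZE[[:space:]]*100/#define SLON_DATA_FETCH_SIZE 500/' \
        src/slon/remote_worker.c
    make && make install   # then restart the slon daemons
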
I have 1024 tables spread across 64 sets. I was also thinking that 
splitting these 64 sets into 2 databases on the same host might improve 
performance by using 2 different Slony clusters on the same machine: 
smaller sl_log_? tables and 2 different slon daemons (one per cluster) 
to take care of the replication, as sketched below.
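
Roughly what I have in mind, with made-up cluster and database names; 
on each node there would be one slon per cluster, each pointing at its 
own local database:

    # two independent Slony clusters, each served by its own slon daemon
    slon profiles_a "dbname=profiles_a host=localhost user=slony" &
    slon profiles_b "dbname=profiles_b host=localhost user=slony" &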

>
> The other thing to check is to make sure all databases are tuned.
Both hosts can serve more than 500 writes/s.
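
For what it's worth, a quick way to dump the usual tuning suspects on 
both hosts (just a sketch, the list of parameters is not exhaustive):

    psql -d pns_profiles_preprod -c "
      SELECT name, setting
        FROM pg_settings
       WHERE name IN ('shared_buffers', 'wal_buffers', 'checkpoint_segments',
                      'effective_cache_size', 'maintenance_work_mem',
                      'vacuum_cost_delay');"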

-- 
Cyril SCETBON

