Jeff Frost jeff at pgexperts.com
Wed Apr 2 13:05:56 PDT 2014
First, I have a question.  The docs here: http://main.slony.info/documentation/2.1/deploymentconcerns.html#AEN2499

indicate that 0.978 seconds until close cursor means that processing took 0.978 seconds against the provider.

Does that mean this is how long it took to iterate over the entire cursor or that it took this long to iterate over the cursor and perform the actual inserts/updates/deletes on the local node?  I'm guessing it's the latter, but I want to make sure.

Also, I'm not quite clear on how to interpret the two numbers in the large tuples log output.

I ask because we have a 1.2TB database which is being replicated with slony 2.1 and while it usually cruises along just fine, occasionally we will see ever increasing lag on busy days, that takes forever to catch up.

The servers in question are connected via gigabit and bandwidth utilization is quite low on the link. 

The provider has 256G of RAM and the subscriber has 80G of RAM.

sync_group_maxsize is the only non-default value in the slon config and is currently set at 500, though we have tried various higher and lower values with seemingly little effect, probably because once you get over 100 it doesn't make as much difference.

Anyone have further tuning suggestions?

If we look at our sync events in the logs, they look like this:

2014-04-02 11:00:18 PDT DEBUG1 remoteHelperThread_2_2: 0.987 seconds delay for first row
2014-04-02 11:16:59 PDT DEBUG1 remoteHelperThread_2_2: 1001.387 seconds until close cursor
2014-04-02 11:16:59 PDT DEBUG1 remoteHelperThread_2_2: inserts=44690 updates=133632 deletes=23148 truncates=0
2014-04-02 11:16:59 PDT DEBUG1 remoteWorkerThread_2: sync_helper timing:  pqexec (s/count)- provider 2.048/406 - subscriber 0.000/406
2014-04-02 11:16:59 PDT DEBUG1 remoteWorkerThread_2: sync_helper timing:  large tuples 1.315/1787
2014-04-02 11:17:07 PDT INFO   remoteWorkerThread_2: SYNC 5015991733 done in 1009.672 seconds
2014-04-02 11:17:07 PDT DEBUG1 remoteWorkerThread_2: SYNC 5015991733 sync_event timing:  pqexec (s/count)- provider 0.002/2 - subscriber 0.004/2 - IUD 1008.625/40297

2014-04-02 11:17:08 PDT DEBUG1 remoteHelperThread_2_2: 1.094 seconds delay for first row
2014-04-02 11:56:01 PDT DEBUG1 remoteHelperThread_2_2: 2334.126 seconds until close cursor
2014-04-02 11:56:01 PDT DEBUG1 remoteHelperThread_2_2: inserts=83628 updates=122325 deletes=25012 truncates=0
2014-04-02 11:56:01 PDT DEBUG1 remoteWorkerThread_2: sync_helper timing:  pqexec (s/count)- provider 2.213/465 - subscriber 0.000/465
2014-04-02 11:56:01 PDT DEBUG1 remoteWorkerThread_2: sync_helper timing:  large tuples 0.756/965
2014-04-02 11:56:04 PDT INFO   remoteWorkerThread_2: SYNC 5015992233 done in 2337.211 seconds
2014-04-02 11:56:04 PDT DEBUG1 remoteWorkerThread_2: SYNC 5015992233 sync_event timing:  pqexec (s/count)- provider 0.010/2 - subscriber 0.005/2 - IUD 2336.036/46197

2014-04-02 11:56:06 PDT DEBUG1 remoteHelperThread_2_2: 1.520 seconds delay for first row
2014-04-02 12:16:37 PDT DEBUG1 remoteHelperThread_2_2: 1232.814 seconds until close cursor
2014-04-02 12:16:37 PDT DEBUG1 remoteHelperThread_2_2: inserts=109353 updates=124460 deletes=41581 truncates=0
2014-04-02 12:16:37 PDT DEBUG1 remoteWorkerThread_2: sync_helper timing:  pqexec (s/count)- provider 2.663/554 - subscriber 0.000/554
2014-04-02 12:16:37 PDT DEBUG1 remoteWorkerThread_2: sync_helper timing:  large tuples 3.342/4818
2014-04-02 12:16:42 PDT INFO   remoteWorkerThread_2: SYNC 5015992733 done in 1237.954 seconds
2014-04-02 12:16:42 PDT DEBUG1 remoteWorkerThread_2: SYNC 5015992733 sync_event timing:  pqexec (s/count)- provider 0.002/2 - subscriber 0.005/2 - IUD 1236.358/55084

2014-04-02 12:16:44 PDT DEBUG1 remoteHelperThread_2_2: 1.339 seconds delay for first row
2014-04-02 12:25:28 PDT DEBUG1 remoteHelperThread_2_2: 525.458 seconds until close cursor
2014-04-02 12:25:28 PDT DEBUG1 remoteHelperThread_2_2: inserts=82400 updates=132333 deletes=61684 truncates=0
2014-04-02 12:25:28 PDT DEBUG1 remoteWorkerThread_2: sync_helper timing:  pqexec (s/count)- provider 2.351/556 - subscriber 0.000/556
2014-04-02 12:25:28 PDT DEBUG1 remoteWorkerThread_2: sync_helper timing:  large tuples 3.322/5009
2014-04-02 12:25:31 PDT INFO   remoteWorkerThread_2: SYNC 5015993233 done in 528.010 seconds
2014-04-02 12:25:31 PDT DEBUG1 remoteWorkerThread_2: SYNC 5015993233 sync_event timing:  pqexec (s/count)- provider 0.001/2 - subscriber 0.004/2 - IUD 526.604/55287


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 208 bytes
Desc: Message signed with OpenPGP using GPGMail
Url : http://lists.slony.info/pipermail/slony1-general/attachments/20140402/ed25c03f/attachment.pgp 


More information about the Slony1-general mailing list