Wed Jan 25 06:13:28 PST 2006
- Previous message: [Slony1-general] sl_log_1 filling
- Next message: [Slony1-general] sl_log_1 filling
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Forgot to mention I did find an error in the logs. This morning I found the
slon for the original cluster was down, with this error in the logs:

2006-01-24 21:39:59 AST DEBUG2 localListenThread: Received event 1,1042636 SYNC
2006-01-24 21:40:00 AST DEBUG2 syncThread: new sl_action_seq 7947330 - SYNC 1042637
2006-01-24 21:40:02 AST DEBUG2 syncThread: new sl_action_seq 7947334 - SYNC 1042638
2006-01-24 21:40:03 AST DEBUG2 localListenThread: Received event 1,1042637 SYNC
2006-01-24 21:40:03 AST FATAL  syncThread: "commit transaction;" - ERROR: simple_heap_update: tuple concurrently updated
2006-01-24 21:40:03 AST DEBUG1 slon: shutdown requested
2006-01-24 21:40:04 AST DEBUG2 slon: notify worker process to shutdown
2006-01-24 21:40:04 AST DEBUG2 slon: wait for worker process to shutdown ....

I restarted it just fine, but my data is still at Nov 28.

-----Original Message-----
From: Christopher Browne [mailto:cbbrowne at ca.afilias.info]
Sent: Tuesday, January 24, 2006 4:44 PM
To: Robert Littlejohn
Cc: 'slony1-general at gborg.postgresql.org'
Subject: Re: [Slony1-general] sl_log_1 filling

Robert Littlejohn <Robert.Littlejohn at resolvecorporation.com> writes:
> Great, thanks for the info. I've been meaning to get back to this, but
> I've been out of the office a bit.
>
> I had already been through the FAQ (that's why I did the vacuums), but so
> far I still can't find anything. The query you posted returned no results.
> I'm looking at test_slony_state-dbi.pl now, but so far it only tells me
> that sl_seqlog and sl_log_1 both exceed 200000. Some of the tests fail
> with Perl errors, so I'll try to get those tests run.
>
> I've also looked quite a bit at the logs and have found nothing.
> The cleanupThread is starting every 5 - 15 minutes and reports things
> like:
>
> 2006-01-06 12:25:47 AST DEBUG1 cleanupThread: 5.849 seconds for cleanupEvent()
> 2006-01-06 12:26:20 AST DEBUG3 cleanupThread: minxid: 199383042
> 2006-01-06 12:26:20 AST DEBUG4 cleanupThread: xid 199383042 still active - analyze instead
> 2006-01-06 12:39:49 AST DEBUG1 cleanupThread: 3.261 seconds for cleanupEvent()
> 2006-01-06 12:40:08 AST DEBUG1 cleanupThread: 18.638 seconds for delete logs
>
> Except for the "xid 199383042 still active - analyze instead", nothing
> really jumps out at me.

Well, the "analyze instead" part ought to be a reasonably useful
optimization. Essentially, if a transaction is running now that was also
running the last time the cleanup thread ran, then it's futile to try to
VACUUM the tables, as no data will get cleaned out; that fairly-old
transaction will hold onto the data.

I'm not sure that you necessarily have any problem going on right now. If
you have some long-running transactions (and you certainly do), and see
hundreds or thousands of database updates per minute, it would be pretty
easy for sl_log_1 and sl_seqlog to grow to ~200K rows.

Consider: sl_log_1 contains a row for each tuple that is updated. If you
do a transaction per second, each of which involves 10 table updates, that
would add, to this table...

  60 x 10 = 600 rows per minute

If you had *no* long-running transactions, then you'd expect the cleanup
thread, after 10 minutes, to find, and leave alone,

  600 x 10 = 6000 rows

that are relevant to the last 10 minutes of activity. If you have some
transaction that's been running for an hour, then that leads to growth to
36000 rows. If you're doing about 5 transactions per second, rather than
1, that easily gets you to ~200K rows in sl_log_1.

sl_seqlog gets, for each SYNC, a row for each sequence that you replicate.
If you have 50 sequences, that can grow pretty big pretty easily...
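[Editor's note: the arithmetic above can be sketched as a small model. This is just the napkin math restated in code, not anything Slony computes itself; the rates and intervals are the hypothetical figures from the example.]

```python
# Rough model of sl_log_1 backlog: rows the cleanup thread cannot yet
# delete.  All inputs are the hypothetical numbers from the discussion
# above, not values from any real cluster.

def sl_log_1_backlog(tps, tuples_per_txn, retention_seconds):
    """Rows retained in sl_log_1 when nothing newer than
    `retention_seconds` ago can be cleaned up."""
    return tps * tuples_per_txn * retention_seconds

# 1 txn/sec, 10 tuple updates each -> 600 rows per minute
per_minute = sl_log_1_backlog(tps=1, tuples_per_txn=10, retention_seconds=60)

# No long-running transactions: ~10 minutes of rows survive cleanup
ten_minutes = sl_log_1_backlog(1, 10, 600)    # 6000 rows

# One transaction held open for an hour keeps an hour of rows alive
one_hour = sl_log_1_backlog(1, 10, 3600)      # 36000 rows

# At 5 txn/sec, that hour-long backlog is in the ~200K range seen here
busy_hour = sl_log_1_backlog(5, 10, 3600)     # 180000 rows

print(per_minute, ten_minutes, one_hour, busy_hour)
```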
I'm sort of doing some "back of the napkin" estimates here, just to
suggest how you might check whether the numbers are reasonable or not...

If you ever see replication fall behind, if a WAN connection slows up, it
would be fully natural to see sl_log_1 grow to:

  (period of time in seconds, plus 10 minutes)
    times expected transactions per second
    times expected tuples updated per transaction

Does that help?
--
"cbbrowne","@","ca.afilias.info" <http://dba2.int.libertyrms.com/>
Christopher Browne
(416) 673-4124 (land)
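[Editor's note: the napkin formula can be written out directly. The function name and default 10-minute cleanup window are illustrative assumptions, mirroring the estimate in the message.]

```python
def expected_sl_log_1_rows(lag_seconds, tps, tuples_per_txn,
                           cleanup_window_seconds=600):
    """Back-of-the-napkin bound on sl_log_1 size during a replication
    stall: (lag + ~10-minute cleanup window) * transaction rate
    * tuples updated per transaction."""
    return (lag_seconds + cleanup_window_seconds) * tps * tuples_per_txn

# e.g. a 30-minute WAN stall at 5 txn/sec, 10 updates per transaction:
print(expected_sl_log_1_rows(1800, 5, 10))  # 120000 rows
```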