Tue Jan 16 07:41:14 PST 2007
- Previous message: [Slony1-general] sl_log_1 not cleaning out
- Next message: [Slony1-general] sl_log_1 not cleaning out
Thomas Pundt <mlists at rp-online.de> writes:
> | A node with id = 3 lags 17 days behind the origin. While that node
> | is lagging, all events are kept in the log.
> |
> | So you need to either let that node catch up or remove it from the
> | cluster. When that is completed, the log will be cleaned after about
> | 20 minutes.
>
> yes, that's clear; what I don't know is what I can do to let the node
> catch up with the rest. I suspect that somehow maybe a confirmation
> event got lost. Node 3 seems to be up to date itself:
>
> This is the master (node 1):
> [pg81 at pgmaster:~] echo 'select count(1) from "_RPO".sl_event;' | psql RPONLINE -p5481
>  count
> --------
>  935314
> (1 row)
>
> This is from node 3:
> [pg81 at pgmaster:~] echo 'select count(1) from "_RPO".sl_event;' | psql RPONLINE -p5481 -hpgdb2
>  count
> -------
>   1050
> (1 row)
>
> What I'd like to know is: is there anything (except rebuilding node 3)
> I can do to get rid of the old events?

It's not at all clear from this whether node 3 is behind. On the origin node, run the query:

  select * from "_RPO".sl_status;

If *that* reports node 3 as being way behind, then that's a pretty good indication that node 3 is way behind.

One possibility comes to mind: we've got one environment where the network is a little flaky; occasionally a node will cease to successfully pass back confirmations. It's replicating fine; it's just not reporting that it is. Restarting the slon processes cleared out the old connections and brought back sanity.

It's easy enough (and safe enough) to stop and restart slon processes; you might try doing that and see if it clears some blockage of confirmation events.

Otherwise, if node #3 is honest-to-goodness Way Behind, then the thing to investigate is what's up with that. The lag is causing sl_log_1 to build up in size as a side effect; don't worry about the size, though; that'll fix itself after you fix what's broken about node #3.
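The "is node 3 actually behind?" check above boils down to reading the lag columns of sl_status. Here is a minimal sketch of that interpretation in Python; the column layout (st_origin, st_received, st_lag_num_events, st_lag_time) follows the Slony-I sl_status view, but the `lagging_nodes` helper and the sample rows are illustrative assumptions, not part of Slony:

```python
from datetime import timedelta

# Hypothetical helper: given rows shaped like sl_status output
# (st_origin, st_received, st_lag_num_events, st_lag_time),
# return the receiver node ids whose lag time exceeds a threshold.
def lagging_nodes(rows, max_lag=timedelta(minutes=20)):
    """Node ids from sl_status-like rows that look stuck."""
    return [received
            for (_origin, received, _num_events, lag_time) in rows
            if lag_time > max_lag]

# Sample rows (made up): node 2 is healthy, node 3 mirrors the
# 17-days-behind situation described in the thread.
rows = [
    (1, 2, 0,      timedelta(seconds=30)),
    (1, 3, 935000, timedelta(days=17)),
]
print(lagging_nodes(rows))  # -> [3]
```

A node flagged this way is either genuinely behind or, as noted above, replicating fine but failing to pass confirmations back; restarting its slon distinguishes the two cases cheaply.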
Maybe you need to drop node 3 or unsubscribe its sets and rebuild it; that's not obvious at this point. Check the logs for node 3 to see what *is* up with it...

--
(reverse (concatenate 'string "ofni.sailifa.ac" "@" "enworbbc"))
<http://dba2.int.libertyrms.com/>
Christopher Browne
(416) 673-4124 (land)