Fri Jun 22 13:58:09 PDT 2007
- Previous message: [Slony1-general] Huge database remote sync issue. Ideas?
- Next message: [Slony1-general] Huge database remote sync issue. Ideas?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 6/22/07, Shaun Thomas <sthomas at leapfrogonline.com> wrote:
> Howdy folks,
>
> We're in the middle of a migration / upgrade, and I've got a giant slony
> set in place, and I get no errors on anything, and syncing starts up
> just great. But something seems to be weird here:
>
> 2007-06-21 19:08:44 CDT FATAL cleanupThread: "delete
> from "_replication".sl_log_1 where log_origin = '10' and log_xid
> < '757377'; delete from "_replication".sl_log_2 where log_origin = '10'
> and log_xid < '757377'; delete from "_replication".sl_seqlog where
> seql_origin = '10' and seql_ev_seqno < '2';
> select "_replication".logswitch_finish(); " - server closed the
> connection unexpectedly
>
> After it copies a huge amount, say 15-17GB of our 40-45GB total, the
> pace slows from about 300MB per minute to 5MB / minute, then to almost
> nothing. The remote system we're mirroring to has an idle disconnect
> which is likely killing the connection in question, causing a giant
> rollback of current progress. The FATAL error above, tells me it's
> doing a log switch on Node 10, which makes no sense, since Node 10 is a
> slave, and should have no events. This is also the same error I get,
> every single time, even though the log_xid number itself may change.
>
> So my questions:
>
> 1. Why is log switching on node 10, instead of node 1, which is
> providing the data?
>
> 2. Why is this mysterious log switch stalling the data copy, so our idle
> timer slaughters the initial table COPY commands mid-progress.

Why do you have an "idle timer" running during a subscribe? Killing
slons during subscribe is an outright bad idea.

> 3. Is there some way the initial copy can *not* be an "all or nothing"
> proposition? 45GB seems an awfully huge first-bite, and it seems
> unfair that not a single error or disconnect may occur during the
> entire process of copying that much data. Checkpoints? Something?
> Maybe a configuration for a heartbeat, anything I missed?

Slony doesn't replicate databases. It replicates sets of tables and
sequences. To smooth your initial subscribe, have you considered
breaking it into a number of small sets (say with a single table and
related sequences in each set) and subscribing them piece by piece?
Otherwise, the answer is no.

Also, you will want to VACUUM (if not TRUNCATE) those tables to get
those dead rows out before restarting your slon to try again.

> 4. Is it possible to somehow... bootstrap the mirror? Make an exact
> data copy of the current database and have slony only copy updates
> after a certain point? I mean, I could probably do a dump/restore and
> let slony keep everything up to date, before our systems launch the
> nightly insert jobs.

No. Unless you intend to use log shipped replication.

> 5. Something else I didn't consider?
>
> Thanks in advance. This is driving me nuts and I've scanned through
> various documentation without much luck. We're working with our vendor
> to temporarily disable to idle kickoff, but there's a chance that may
> not be the issue, considering that weird error I pasted always having
> the same contents; I'd think the error would be different if it were
> just an idle disconnect.

Mixing vendors and replication is like putting de lime in de coconut.
Good luck with that.

Andrew
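
The suggestion above, one small set per table subscribed piece by piece, could be scripted roughly as follows. This is only a sketch under stated assumptions, not something from the original thread: the table names and the starting set id are made up for illustration, the origin and receiver node numbers (1 and 10) are taken from the question, and the slonik preamble (cluster name and admin conninfo lines) is left out and would have to be prepended. Sequences belonging to each table would need their own "set add sequence" lines, which are also omitted here.

```python
def slonik_for_tables(tables, origin=1, receiver=10, first_set_id=100):
    """Build slonik commands that put each table in its own set and
    subscribe that set on the receiver node.

    Assumptions (hypothetical, for illustration only): set ids and table
    ids simply count up from first_set_id, and the table names passed in
    are fully qualified (schema.table).
    """
    lines = []
    for offset, table in enumerate(tables):
        set_id = first_set_id + offset
        table_id = first_set_id + offset
        # One replication set holding a single table.
        lines.append(
            f"create set (id = {set_id}, origin = {origin}, "
            f"comment = 'single-table set for {table}');"
        )
        lines.append(
            f"set add table (set id = {set_id}, origin = {origin}, "
            f"id = {table_id}, fully qualified name = '{table}', "
            f"comment = '{table}');"
        )
        # Subscribing the small set keeps each initial COPY short.
        lines.append(
            f"subscribe set (id = {set_id}, provider = {origin}, "
            f"receiver = {receiver}, forward = no);"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical table names; substitute the real ones.
    print(slonik_for_tables(["public.orders", "public.customers"]))
```

In practice each subscription should probably be allowed to finish before the next one is submitted, so a failed copy only costs one table's worth of progress; that sequencing is not handled by this sketch.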