[Slony1-general] Figuring out replication is finished / replicas are same
Tue Feb 8 15:42:34 PST 2005
Postgres Learner wrote:
> Hi all!
> I am a relatively new user to Slony, though I do have some experience
> with Postgres. I need to set Slony up in a special production
> environment. Basically our production database remains idle for most
> of the day and does some extremely heavy processing in batches for
> around 30 mins/day.
>
> We need to set up a replicated database to fail over to in case
> something goes wrong, so I was thinking about using Slony. Since we
> need to process our batches REALLY REALLY fast, I was thinking of
> stopping the Slony daemon while we process the batches and restarting
> it after we are done. This works fine - I set it up, and Slony seems
> to continue replication from where it left off when I shut down the
> daemons. Please tell me if this is wrong.
>
> Now the problem is that I can't figure out how to measure the time
> taken to replicate/resync after the Slony daemon restarts. Basically
> I want to know the window after which one server can safely go down
> without causing problems. I looked into the documentation and even
> searched the net, but couldn't find anything. Please point me in the
> right direction.

Vivek has suggested some useful ideas for determining whether things
are back up to date. I'll suggest another: inject a change right at
what you consider to be "the end," and then check whether that change
has propagated yet. Since changes are applied in order, once it has,
all the previous changes have made it through, and you should be safe
to shut things down.

What we have traditionally done in our environment, to get a sort of
end-to-end test that replication is working, is to add a
"replication_test" table, replicate it, update it periodically, and
check that the updates make it through (the first sketch at the end of
this message shows one way to automate that). Alternatively, if you
have some sort of transaction ID or batch ID, you might check on the
subscriber to see whether the last one found on the origin has made it
through.

The other issue worth thinking about is whether this usage "abuses"
the replication system badly enough to be considered "poor usage." I
do have a couple of thoughts...

1. You should leave the slon running against the origin node all the
time, including while it is undergoing the heavy processing. If you
shut it off, you'll discover that all of the changes during that
30-minute period get treated as one really big SYNC, and things will
behave badly when you turn the "subscriber" slon back on and it tries
to grab all the data at once. If you instead leave it on, there will
likely be hundreds to thousands of SYNCs during those 30 minutes, and
turning replication back on will "play better."

An alternative to running that slon is to look in CVS HEAD for
"generate_syncs.sh" and run it as often as possible during the peak
time - I'd do so at least once per minute, and would prefer more often
than that (the second sketch below shows one way). It won't cost much,
performance-wise, and will cut down on the grief at the end of the 30
minutes.

2. You should try to keep the slons running most of the time, so that
the systems are largely kept in sync and there is not a large buildup
of rows in sl_log_1 and sl_seqlog. If you shut off replication for
days at a time, those tables will often build up, and performance will
be questionable (the third sketch below shows a quick way to watch for
that).
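To make the "inject a change and check for it" idea concrete, here is
a minimal sketch. Everything in it is an assumption on my part: it
presumes a replicated table created as replication_test (id serial
primary key, ts timestamptz), and the hosts/databases in ORIGIN and
SUBSCRIBER are placeholders for your own connection options:

  #!/bin/sh
  # Placeholder connection options - substitute your own.
  ORIGIN="-h origin-host -d proddb"
  SUBSCRIBER="-h standby-host -d proddb"

  # Inject a marker row on the origin and remember its id.  psql -c
  # runs both statements in one transaction and prints only the result
  # of the last one.
  MARKER=`psql $ORIGIN -qtA -c \
      "insert into replication_test (ts) values (now());
       select max(id) from replication_test;"`

  # Poll the subscriber until the marker row shows up; once it does,
  # everything committed before it has been replicated as well.
  while true; do
      FOUND=`psql $SUBSCRIBER -qtA -c \
          "select count(*) from replication_test where id = $MARKER;"`
      if [ "$FOUND" = "1" ]; then
          echo "marker $MARKER replicated; subscriber is caught up"
          break
      fi
      sleep 5
  done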
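Second, a trivial way to keep SYNCs flowing during the batch window.
This assumes generate_syncs.sh from the CVS tree is on your PATH and
already set up to reach the origin node - check the script itself for
whatever arguments it actually wants, as I'm not showing them here:

  #!/bin/sh
  # Force a SYNC event every 15 seconds while the batch runs; kill the
  # loop once the batch is done.
  while true; do
      generate_syncs.sh
      sleep 15
  done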
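Finally, a quick way to watch for the buildup mentioned in point 2.
The cluster schema name ("_mycluster") and connection options are
placeholders for your own:

  #!/bin/sh
  # Steadily growing counts here mean log rows are piling up faster
  # than the subscribers confirm them and cleanup can trim them.
  psql -h origin-host -d proddb -c \
      'select count(*) from "_mycluster".sl_log_1;'
  psql -h origin-host -d proddb -c \
      'select count(*) from "_mycluster".sl_seqlog;'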