Thu Sep 8 10:43:29 PDT 2011
- Previous message: [Slony1-general] pg_dump and replication lag in 2.0.7
- Next message: [Slony1-general] pg_dump and replication lag in 2.0.7
On 11-09-08 11:43 AM, Glyn Astill wrote:
>
> SELECT st_origin, st_received, st_lag_num_events, round(extract(epoch from st_lag_time))
> FROM "<my_replication_cluster>".sl_status;
>
> A graph for the weeks leading up to and after the upgrade is attached. I upgraded on the night of the 25th/26th and ignoring any other downtime where I was obviously fiddling with things, you can see the syncs going out after that date. As you can imagine, I'm massively embarrassed that it took me 3 months to notice it happening.
>

st_lag_time is a measure of the difference between now() and the last unconfirmed event. The pg_dump locks sl_event, which prevents the SYNCs from being created, so there might not be any unconfirmed events to be measured by this check.

Sometime between 2.0.4 and 2.0.6 we fixed a bug that prevented SYNC events from being generated from pure slaves. I suspect your check is now measuring the other half of replication (if you do your select from sl_status you should see at least two rows; it isn't clear if you're graphing both of them or just one).

If now()-st_last_event_ts gets too high it means that SYNC events are not being generated. You might want to alert on both SYNC events not being generated and events not being confirmed.

>
> Glyn
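The two alert conditions suggested above could be sketched roughly as follows. This is only an illustration in Python, not anything shipped with Slony: the function name and both thresholds are hypothetical, and it assumes each sl_status row has already been fetched by the monitoring script, with st_last_event_ts as a datetime and st_lag_time as a timedelta.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds (assumptions); tune them for your own cluster.
MAX_SYNC_AGE = timedelta(minutes=5)
MAX_CONFIRM_LAG = timedelta(minutes=10)


def check_status_row(now, st_last_event_ts, st_lag_time,
                     max_sync_age=MAX_SYNC_AGE,
                     max_confirm_lag=MAX_CONFIRM_LAG):
    """Return a list of alert strings for one row of sl_status.

    now              -- current time (datetime)
    st_last_event_ts -- timestamp of the last event generated (datetime)
    st_lag_time      -- age of the last unconfirmed event (timedelta)
    """
    alerts = []
    # now() - st_last_event_ts too high: SYNCs are not being generated.
    if now - st_last_event_ts > max_sync_age:
        alerts.append("SYNC events not being generated")
    # st_lag_time too high: events are not being confirmed.
    if st_lag_time > max_confirm_lag:
        alerts.append("events not being confirmed")
    return alerts


if __name__ == "__main__":
    now = datetime(2011, 9, 8, 10, 45, 0)
    # Last SYNC was 15 minutes ago, but confirmations are only 20s behind.
    print(check_status_row(now,
                           datetime(2011, 9, 8, 10, 30, 0),
                           timedelta(seconds=20)))
    # prints ['SYNC events not being generated']
```

Running the check against every row returned from sl_status, rather than just one, would cover both halves of replication.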