Mon Jan 29 17:37:08 PST 2007
- Previous message: [Slony1-general] slon_kill incorrectly choosing daemons to kill?
- Next message: [Slony1-general] Quizzical merge set error
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
I recently upgraded to slony 1.2.6. We were rehearsing a database schema upgrade for a two node slony cluster and came across an error at the end. We need to avoid having the execute script do its exclusive locks with the application talking to the database, so we do it using the following dance:

1. Turn off slon daemons
2. Switch application to use the secondary (read-only of course)
3. Run the upgrade script on the primary using execute script and add tables & sequences into a new set and merge waiting for subscriptions to be confirmed (which blocks).
4. Switch the application back to the primary db
5. Turn the slon daemons back on (which unblocks #3).

All was well until step #5 when we got this error:

    <stdin>:12: PGRES_FATAL_ERROR select "_radio".mergeSet(1, 9999); - ERROR: Slony-I: set 9999 has subscriptions in progress - cannot merge

To provide more details, here's what we actually run in step 3 (minus the DDL):

    CREATE SET (ID = 9999, ORIGIN = 1, COMMENT = 'Temporary set for add and merge');
    ...Execute DDL via EXECUTE SCRIPT...
    SET ADD TABLE (SET ID = 9999, ORIGIN = 1, ID = 39, FULLY QUALIFIED NAME = 'public.new_table1');
    SET ADD SEQUENCE (SET ID = 9999, ORIGIN = 1, ID = 18, FULLY QUALIFIED NAME = 'public.new_seq');
    SET ADD TABLE (SET ID = 9999, ORIGIN = 1, ID = 40, FULLY QUALIFIED NAME = 'public.new_table2');
    WAIT FOR EVENT (ORIGIN = ALL, CONFIRMED = ALL, TIMEOUT = 0);
    SUBSCRIBE SET (ID = 9999, PROVIDER = 1, RECEIVER = 2, FORWARD = yes);
    WAIT FOR EVENT (ORIGIN = ALL, CONFIRMED = ALL, TIMEOUT = 0);
    MERGE SET (ID = 1, ADD ID = 9999, ORIGIN = 1);

The cluster we are upgrading has two nodes, id 1 (origin) and id 2. So we explicitly wait for the ADDs to complete before SUBSCRIBE and we also wait for the SUBSCRIBE to complete before the MERGE (assuming I understand WAIT properly, which I may not). So we get this error which seems to indicate that the SUBSCRIBE was not in fact complete before the MERGE was executed, does that make sense to anyone?

Note that step #3 above did indeed block as I would expect until we turned the slon daemons back on, so the WAITs were doing something. The slon logs look relatively uninteresting and predictably have trouble subsequent to this because the new tables in the unmerged set cannot be found:

    2007-01-29 16:41:56 PST ERROR remoteWorkerThread_1: Could not find table "public"."new_table1" on subscriber

Any insights are appreciated. I'll be trying to reproduce this in a bit more isolated environment too.

-Casey
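
One way to narrow this down when reproducing it might be to look at the subscription rows for set 9999 right before the MERGE SET runs. The sketch below is only a diagnostic aid under an assumption the post does not confirm: that the "subscriptions in progress" error corresponds to rows in the cluster's sl_subscribe table whose sub_active flag is still false. The function name pending_subscriptions and the dict-shaped rows are hypothetical; the rows would come from a query along the lines of the one shown in the comment.

```python
from typing import Iterable, List, Mapping

# Rows are assumed to come from something like:
#   select sub_set, sub_receiver, sub_active from "_radio".sl_subscribe;
# fetched with any PostgreSQL client and turned into dicts.


def pending_subscriptions(rows: Iterable[Mapping], set_id: int) -> List[int]:
    """Return the receiver node ids whose subscription to set_id is not yet active."""
    return sorted(
        row["sub_receiver"]
        for row in rows
        if row["sub_set"] == set_id and not row["sub_active"]
    )


if __name__ == "__main__":
    # Hypothetical snapshot: set 1 is active on node 2, set 9999 is still pending.
    snapshot = [
        {"sub_set": 1, "sub_receiver": 2, "sub_active": True},
        {"sub_set": 9999, "sub_receiver": 2, "sub_active": False},
    ]
    waiting = pending_subscriptions(snapshot, 9999)
    if waiting:
        print("set 9999 still pending on nodes:", waiting)
    else:
        print("no pending subscriptions for set 9999")
```

If the list is non-empty at the moment the second WAIT FOR EVENT returns, that would at least be consistent with the SUBSCRIBE not being finished when the MERGE is attempted.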