Mon Jan 7 09:43:53 PST 2008
- Previous message: [Slony1-general] pgpool-II & slony installation question
- Next message: [Slony1-general] Proposal: using COPY to pull sl_log_? data to subscribers
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
I have been looking at this idea for a while, and I *think* it has
enough merit to consider implementing it.

The present way that SYNCs are processed may be outlined thus:

- slon opens a LOG cursor on the provider that selects the contents
  of sl_log_1/sl_log_2 that are relevant to the SYNC set being
  processed
- slon fetches 100 entries at a time, leaving out all Large Tuples
  (based on a configurable parameter)
- For each of the 100 entries, slon generates an INSERT/UPDATE/DELETE,
  and delivers them to the worker process. Tuples that were Very Large
  get queried individually.
- If the subscriber is a logging subscriber, we then also insert the
  sl_log_? data into the appropriate sl_log_? table on the subscriber.

I propose a rather different approach, one that uses COPY and a stored
procedure:

- Rather than opening a cursor, we submit a COPY query against the
  provider of the form:
    "COPY (select * from sl_log_1 where [SYNC criteria]
     union all
     select * from sl_log_2 where [SYNC criteria]) to stdout;"
- We then set up a "COPY [subscriber's current sl_log_? table] from
  stdin;" on the subscriber.
- We loop, so long as there is data provided by the first statement,
  and COPY the data to the subscriber node.
- Once the COPY is complete, we run a stored procedure on the
  subscriber node, passing it the same [SYNC criteria] given above.
  That stored procedure runs through the data in sl_log_? on the
  subscriber in actionseq order, and for each row it will:
  - build an INSERT/UPDATE/DELETE statement
  - execute that statement

I can already see several straightforward improvements and tradeoffs
in this, as compared with what we have now. (Hopefully none of it
involves overly baroque plans with obvious fatal errors! - see below!)

1. Some processing load gets taken off the provider

The "LOG cursor" query becomes a "COPY [foo] to stdout" query, and it
is worth noting that we lose the need for an ORDER BY clause, which
eliminates a sort.
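To make the relay step concrete: the "loop, so long as there is data"
bullet above is essentially a bounded-buffer copy between the two COPY
channels. Here is a minimal sketch in Python, with in-memory streams
standing in for the provider's "COPY ... to stdout" output and the
subscriber's "COPY ... from stdin" input; the function name and chunk
size are my own, for illustration only.

```python
import io

def relay_copy(provider_out, subscriber_in, chunk_size=8192):
    """Relay COPY data from provider to subscriber in fixed-size chunks,
    so relay memory use is bounded by chunk_size regardless of how
    large any individual logged tuple is."""
    total = 0
    while True:
        chunk = provider_out.read(chunk_size)
        if not chunk:          # provider's COPY stream is exhausted
            break
        subscriber_in.write(chunk)
        total += len(chunk)
    return total

# Simulated streams: 100 KB of COPY output relayed in 8 KB pieces.
provider_out = io.BytesIO(b"x" * 100_000)
subscriber_in = io.BytesIO()
n = relay_copy(provider_out, subscriber_in)
```

In the real slon this loop would sit on top of libpq's COPY-mode
calls, but the memory-bounding property is the same: the relay never
holds more than one chunk at a time.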
2. Processing load is moved from slon to the subscriber DBMS

The present slon loads sl_log_? tuples into memory, and the C code
generates the INSERT/UPDATE/DELETE. In the proposed method, the stored
procedure does that work, on the subscriber DBMS.

An important question: will that loop lead to grossly excessive
backend memory usage in cases where Large Tuples are processed? (e.g.
where the INSERT statement is inserting a tuple consisting of 40MB of
data)

3. We use COPY to load data onto the subscriber

There are two very large benefits to this:

 i) COPY should be *WAY* faster than the INSERTs presently used;
ii) we can COPY in specific-sized-buffer chunks, which eliminates the
    somewhat-overly-baroque code that tries to limit slon memory
    usage.

There is a downside: with this approach, we no longer have the option
of configuring a subscriber node NOT to be a provider; all nodes now
load data into the sl_log_? tables.

4. The "shape" of lag times on subscribers changes somewhat

Right now, there is a sort of "linear" behaviour; if a SYNC is
half-done, that means that:

- half of the tuples needing updates have been updated;
- half of the sl_log_? copies have been loaded.

With the new behaviour, we load all of the data into sl_log_? before
doing anything about applying changes to the replicated tables. I'm
not sure whether this one matters at all.

5. There is new potential to untie loading sl_log_? from processing it

That is, we could, in principle, make those processes asynchronous,
with separate COMMITs. Desynchronizing log application from log
copying has the interesting result that we probably cut down on the
potential for data loss in case of loss of an origin node.

Let us suppose, for the sake of argument (and I'm making up numbers!),
that present processing takes 2.0s per SYNC, and that the new approach
takes 0.15s to copy sl_log_? and 1.5s to apply it.
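Returning for a moment to the apply step in point 2: the subscriber-side
stored procedure's loop amounts to the following, sketched here in
Python rather than PL/pgSQL. The row layout (actionseq, cmdtype, table
name, cmddata) and the helper names are simplifications of my own, not
the actual sl_log_? schema.

```python
# Sketch of the proposed subscriber-side apply step: walk the rows
# that COPY just loaded into sl_log_?, in actionseq order, and turn
# each one back into an INSERT/UPDATE/DELETE statement.

def build_statement(cmdtype, tablename, cmddata):
    """Rebuild a SQL statement from one log row (hypothetical layout)."""
    if cmdtype == 'I':
        return "insert into %s %s;" % (tablename, cmddata)
    elif cmdtype == 'U':
        return "update only %s set %s;" % (tablename, cmddata)
    elif cmdtype == 'D':
        return "delete from only %s where %s;" % (tablename, cmddata)
    raise ValueError("unknown log command type: %r" % cmdtype)

def apply_log(rows, execute):
    """Apply log rows in actionseq order, as the stored procedure would."""
    for _, cmdtype, tablename, cmddata in sorted(rows, key=lambda r: r[0]):
        execute(build_statement(cmdtype, tablename, cmddata))

# Example: three log rows arriving out of actionseq order.
rows = [
    (2, 'U', 'public.accounts', "balance = 10 where id = 1"),
    (1, 'I', 'public.accounts', "(id, balance) values (1, 0)"),
    (3, 'D', 'public.accounts', "id = 1"),
]
applied = []
apply_log(rows, applied.append)
```

The memory question raised above corresponds to the string that
build_statement constructs: for a 40MB tuple, the backend must
materialize a 40MB-plus statement inside that loop.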
Given those (made-up) numbers, the new approach means that the SYNC is
vulnerable to loss for only 0.15s, rather than the 2.0s of the past.

Further, if we did enough tracking to support this, it ought to be
possible to set up a replication node that consists *solely* of the
Slony-I schema. You'd have a small database whose contents would
"churn" pretty heavily, but which would likely be entirely cached.

6. Replication is applied based on subscriber node configuration

Thus, you could rename tables on a subscriber-by-subscriber basis. I
think that's more fitting...

There's enough interaction with worker threads that this would
represent a fairly significant code change to Slony-I, and the number
of changes in behaviour warrants community discussion. My belief is
that it's a worthwhile change, even considering the tradeoffs (the
notable one being #3 - you can't have "subscribe only" nodes that
don't record sl_log_? data).

Jan has observed that there is something of a downside in that we do
*not* gain, in any of this, the ability to use prepared statements (or
anything similar) against the replicated tables. He is keen on a new
approach in which the triggers capture something closer to the
underlying data structure of the tuple, and in which the action on the
subscriber updates the table nearer to the "storage level" deeper
inside PostgreSQL. I'm not sure that's actually an argument against
doing this.
--
let name="cbbrowne" and tld="acm.org" in name ^ "@" ^ tld;;
http://www3.sympatico.ca/cbbrowne/
Evil Overlords tend to get overthrown due to overly baroque plans with
obvious fatal errors. Follow the "Rules of the Evil Overlord," and you
need not fear heroic opposition, whether that hero be James Bond,
Flash Gordon, or a little hobbit named Frodo.