Mon Jan 7 09:43:53 PST 2008
I have been looking at this idea for a while, and I *think* it has
enough merit to consider implementing it.
The present way that SYNCs are processed may be outlined thus:
- slon opens a LOG cursor to the provider that selects the contents
of sl_log_1/sl_log_2 that are relevant to the SYNC set being
processed
- slon fetches 100 entries at a time, leaving out all Large Tuples
(based on a configurable parameter)
- For each of the 100 entries, the slon generates an
INSERT/UPDATE/DELETE statement and delivers it to the worker process.
Tuples that were Very Large get queried individually.
- If the subscriber is a logging subscriber, we then also insert the
sl_log_? data into the appropriate sl_log_? table on the subscriber.
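For concreteness, the cursor looks roughly like this (a sketch only;
the real query also pulls the transaction id and carries considerably
more qualifiers than the [SYNC criteria] placeholder suggests):

    declare LOG cursor for
        select log_origin, log_tableid, log_actionseq,
               log_cmdtype, log_cmddata
          from sl_log_1
         where log_origin = [origin node]
           and [SYNC criteria]
         order by log_actionseq;
    fetch 100 from LOG;

Note the ORDER BY; point 1 below comes back to it.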
I propose a rather different approach that uses COPY and a stored
procedure.
- Rather than opening a cursor, we submit a COPY query against the
provider of the form:
"COPY (select * from sl_log_1 where [SYNC criteria] union all
select * from sl_log_2 where [SYNC criteria]) to stdout;"
- We then set up a "COPY [subscriber's_current_sl_log_? table] from stdin;"
on the subscriber.
- We loop, so long as there is data provided by the first statement, and
COPY the data to the subscriber node.
- Once the COPY is complete, we run a stored procedure on the
subscriber node, telling it the same [SYNC criteria] given above, and
this stored procedure runs through the data in sl_log_? on the subscriber
in actionseq order. For each row, it will:
- build an insert/update/delete statement
- execute that statement
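To sketch what that stored procedure might look like (illustrative
PL/pgSQL, not working code: the function name is made up, the [SYNC
criteria] stay a placeholder, and it assumes log_cmddata carries the
statement tail much as the log trigger records it today):

    create or replace function logApplySync(p_origin integer)
    returns integer as $$
    declare
        r  record;
        fq text;
        n  integer := 0;
    begin
        for r in
            select l.log_actionseq, l.log_cmdtype, l.log_cmddata,
                   t.tab_nspname, t.tab_relname
              from sl_log_1 l
              join sl_table t on t.tab_id = l.log_tableid
             where l.log_origin = p_origin
               -- plus the rest of the [SYNC criteria], and the
               -- union all against sl_log_2, both elided here
             order by l.log_actionseq
        loop
            fq := quote_ident(r.tab_nspname) || '.' ||
                  quote_ident(r.tab_relname);
            -- build and run one statement per logged action
            execute case r.log_cmdtype
              when 'I' then 'insert into ' || fq || ' ' || r.log_cmddata
              when 'U' then 'update only ' || fq || ' set ' || r.log_cmddata
              when 'D' then 'delete from only ' || fq || ' where ' || r.log_cmddata
            end;
            n := n + 1;
        end loop;
        return n;
    end;
    $$ language plpgsql;

Observe that the one ORDER BY left in the system is this one, on the
subscriber; that's the flip side of point 1 below.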
I can already see several straightforward improvements and tradeoffs
in this, as compared with what we have now. (Hopefully none of it
involves overly baroque plans with obvious fatal errors! - see below!)
1. Some processing load gets taken off the provider
The "LOG cursor" query becomes a "COPY [foo] to stdout" query,
and it's worth noting that we lose the need to have an ORDER BY
clause, which eliminates a sort.
2. Processing load is moved from slon to subscriber DBMS
The present slon loads sl_log_? tuples into memory, and the C
code generates the INSERT/UPDATE/DELETE.
In the proposed method, the stored procedure does that work,
on the subscriber DBMS.
An important question: Will that loop lead to grossly excessive
backend memory usage in cases where Large Tuples are processed?
(e.g. - where the INSERT statement is inserting a tuple
consisting of 40MB of data)
3. We use COPY to load data onto the subscriber
There are two very large benefits to this:
i) COPY should be *WAY* faster than the INSERT presently used;
ii) We can COPY in specific-sized-buffer chunks, which eliminates
the somewhat-overly baroque code that tries to limit slon
memory usage.
There is a downside: with this approach, we now have no option
for a subscriber node to NOT be configured to be a provider; all
nodes now load data into the sl_log_? tables.
4. The "shape" of lag times, on subscribers, changes somewhat.
Right now, there is a sort of "linear" behaviour; if a SYNC is
half-done, then that means that
- half of the tuples needing updates have been updated;
- half of the sl_log_? copies have been loaded.
With the new behaviour, we load all of the data into sl_log_?
before doing anything about applying changes into the replicated
tables.
I'm not sure if this one matters at all.
5. There is the new potential to untie loading sl_log_? from
processing it.
That is, we could, in principle, make those processes
asynchronous, with separate COMMITs.
De-synchronizing log application from log copying (sketched below)
has the interesting consequence that we probably cut down on the
potential for data loss should an origin node be lost.
Let us suppose, for sake of argument (and I'm making up
numbers!) that present processing takes 2.0s per SYNC, and that
the new approach takes 0.15s to copy sl_log_?, and 1.5s to apply
sl_log_?.
Given those (made-up) numbers, the new approach means that that
SYNC is vulnerable to loss for only 0.15s, rather than the 2.0s
of the past.
Further, if we did enough tracking to do this, it ought to be
possible to set up a replication node which consists *solely* of
the Slony-I schema. You'd have a small database whose contents
would "churn" pretty heavily, but where it would likely all be
cached.
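To make the de-synchronization concrete, the two halves would commit
separately, roughly like this (a sketch; the bookkeeping is waved at,
and logApplySync is the illustrative function from earlier):

    -- transaction 1: land the log data only (the cheap part;
    -- 0.15s in the made-up numbers above)
    begin;
    copy sl_log_1 from stdin;  -- fed from the provider's COPY ... to stdout
    -- record, in whatever bookkeeping we invent, that this SYNC's
    -- rows are now safely stored on this node
    commit;

    -- transaction 2: apply them, possibly later, possibly in
    -- another worker
    begin;
    select logApplySync(1);    -- 1 = origin node id, for example
    commit;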
6. Replication is applied based on subscriber node configuration
Thus, you could rename tables on a subscriber-by-subscriber
basis.
I think that's more fitting...
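As a purely hypothetical illustration (nothing in Slony-I supports
this today): since the apply procedure builds its statements from the
subscriber's own sl_table entries, a per-node rename could amount to
pointing the local catalog row at a different relation:

    -- on one subscriber only; tab_id 42 is a made-up example
    update sl_table
       set tab_relname = 'accounts_local_name'
     where tab_id = 42;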
There's enough interaction with worker threads that this would
represent a fairly significant code change to Slony-I, and the number
of changes in behaviour warrant community discussion.
My belief is that it's a worthwhile change, even considering the
tradeoffs (the notable one being #3: you can't have "subscribe-only"
nodes that don't record sl_log_? data).
Jan has observed that there is something of a "downside" in that we do
*not* gain, in any of this, the ability to use prepared statements (or
similar) against the replicated tables. He's keen on creating some
new approach where the triggers capture something looking more like
the underlying data structure of the tuple, and where the action, on
the subscriber, involves updating the table down nearer to the
"storage level" deeper inside PostgreSQL. I'm not sure that's
actually an argument against doing this.
--
let name="cbbrowne" and tld="acm.org" in name ^ "@" ^ tld;;
http://www3.sympatico.ca/cbbrowne/
Evil Overlords tend to get overthrown due to overly baroque plans with
obvious fatal errors. Follow the "Rules of the Evil Overlord," and
you need not fear heroic opposition, whether that hero be James Bond,
Flash Gordon, or a little hobbit named Frodo.