Tue Jul 3 14:52:33 PDT 2007
Jeff Davis wrote:
> On Tue, 2007-07-03 at 12:33 -0400, Christopher Browne wrote:
>
>> I'll see about doing an experiment on this to see if, for the DELETE
>> case, it seems to actually help. It may be that the performance
>> effects are small to none, so that the added code complication isn't
>> worthwhile.
>>
>>
>
> In a simple test I ran, DELETE of the entire 5M record table using
> sequential scan was MUCH faster (9.41s) than 5M individual DELETE
> statements in a single transaction (552.49s).
>
> 5M records is small enough to fit into memory. I expect the difference
> would be even greater when the index and table can't both fit into
> memory and the deletes are distributed randomly over the table.
>
> I think it is worth exploring ways of solving this problem. Right now
> slony is great for small inserts, updates, and deletes. But any large
> update/delete on the origin can cause the subscribers to fall way
> behind.
>
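For concreteness, here's roughly what that comparison amounts to, as a
script generator (a sketch on my part, assuming a table foo with an
integer key id holding ids 0 through 4999999, the same naming the code
further down uses):

;; Run 1: one statement; a single sequential scan over the table.
(format t "begin; delete from foo; commit;~%")
;; Run 2: 5M single-row DELETEs in one transaction, each doing its
;; own index probe.
(format t "begin;~%")
(dotimes (i 5000000)
  (format t "delete from foo where id=~D;~%" i))
(format t "commit;~%")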
The "handy" alternative test (which would be a good "smoke test" for
whether it's worth bothering to put *any* effort into this) would be to
try to do some partial groupings to see if they'd help.
Thus, if the whole delete goes across the range id = 0 thru id =
5000000, then things to try would be (each case involving 1 transaction):
1. Delete with the 5M individual DELETE statements. (Which you found
took 552.49s)
2. Delete with 50K DELETE statements, each having a WHERE clause with
100 items in it.
3. Delete with 5K DELETE statements, each having a WHERE clause with 1K
items in it.
If 2. or 3. come *way* closer to 9.41s, then it may be worth exploring
the complexity of folding together adjacent deletes on the same table.
There could also be a case made for trying sequential versus random
orderings (e.g., in the former case, each DELETE statement takes on a
specific contiguous range of ids, whereas in the latter, each statement
selects ids more or less at random); see the sketch after the code below.
I'll see about constructing a series of tests like this; they won't
have run before I send this :-). If you have time to generate 2. and/or
3. on your system and get timings there, I'd be much obliged.
;;;; Here's some relevant code :-)
;;;; Emits one transaction of 50K DELETEs, each covering a contiguous
;;;; block of 100 ids (case 2.); for case 3., use 5K outer iterations
;;;; over blocks of 1K ids instead.
(format t "begin;~%")
(loop for i from 0 to 49999
      do (format t "delete from foo where ")
         (loop for j from 0 to 98
               do (format t "id=~D or " (+ j (* i 100))))
         (format t "id=~D;~%" (+ 99 (* i 100))))
(format t "commit;~%")
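And here's a sketch of the "random ordering" variant mentioned above
(untested, and the up-front shuffle is my own addition): shuffle the
5M ids once, then emit the same 50K batched DELETEs, so each statement
hits rows scattered all over the table rather than one contiguous block.

(let ((ids (make-array 5000000)))
  (dotimes (i 5000000)                ; fill with 0 .. 4999999
    (setf (aref ids i) i))
  (loop for i from 4999999 downto 1   ; Fisher-Yates shuffle
        do (rotatef (aref ids i) (aref ids (random (1+ i)))))
  (format t "begin;~%")
  (loop for i from 0 to 49999
        do (format t "delete from foo where ")
           (loop for j from 0 to 98
                 do (format t "id=~D or " (aref ids (+ j (* i 100)))))
           (format t "id=~D;~%" (aref ids (+ 99 (* i 100))))))
(format t "commit;~%")

Either generator spits out plain SQL on stdout, so the output can be
piped straight into psql for timing.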
I don't think we can save the full 543 seconds, but if we could save a
good portion of it, it's worth trying to pursue...
>> Remember, the thought we started with was:
>> "What if we could do something that would make mass operations less
>> expensive?"
>>
>> I don't want to introduce anything that can materially increase
>> processing costs.
>>
>> The more intelligent we try to get, the more expensive the
>> logtrigger() function gets, and if the price is high enough, then we
>> gain nothing.
>>
>> The only "win" I see is if we can opportunistically join some
>> statements together. If we have to make the log trigger function
>> universally *WAY* more expensive, well, that's a performance loss :-(.
>>
>
> Is there any way that we could detect (even without 100% confidence)
> that a transaction is "big" and we should spend more effort trying to
> optimize it?
>
>
Regrettably, no. For us to switch over to a log trigger that supports
"doing something smarter," we would have to add logic that imposes some
(definitely non-zero) cost every time it *isn't* worthwhile. And the
usual sorts of OLTP activity will fall into the category where
performance would suffer.
> I know that's a big project (in the general case), but there might be
> some simple things that would work.
>
Well, the cases I suggested (2. and 3.) would fall into "simple cases".