Thu Jul 12 11:02:29 PDT 2007
- Previous message: [Slony1-general] Soliciting ideas for v2.0
- Next message: [Slony1-general] Soliciting ideas for v2.0
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Tue, 2007-07-10 at 01:12 -0400, Jan Wieck wrote: > The speed gain in your test has a couple of sources. The parsing > overhead of 5 million update statements vs one copy is one of them, the > network latency for 5 million round trips another and the execution plan > (which only applies if all or at least a substantial part of the entire > table is updated) of course. Sure is one such mega update of the entire > table better done in a merge over the two sorted tuple sets. But the > planner might chose another strategy if you updated only 500,000 out of > 10 million rows. The time it takes to do 5M "SELECT #" calls, where # is a constant. These are run over a local socket, not a real network, so the latency is less: 305s The time it takes to do 5M "UPDATE" queries on a 5M record table, and none of the WHERE clauses match any tuples (so only an index lookup is done), in one transaction: 623s The time it takes to do 5M of the same update queries, but prepared first: 398s I know tests involving this overhead have been done before, but this is on my machine, and it could be a reference point for the other numbers I stated earlier. > application). Using prepared statements to apply the changes will deal > with some part of the parsing and planning overhead. It will still force > your case to do 5 million index scans instead of two sorts and a merge. > But that comparison really only applies to cases where you update a very > substantial part of the whole table. Yes, my test was designed to show that there are areas of potential improvement without moving to statement-based replication. > The log is selected by actually doing a COPY over a SELECT (the usual > log select). That COPY result is fed into the current log table on the > subscriber. A trigger on that log table will suppress the actual insert > operation if the subscriber is not in forwarding mode and the operation > is for a subscribed table (the node might be origin to something else). > What it also does is doing the actual leg work of applying the changes > via prepared SPI statements or maybe even direct heap and index updates. > This method cuts down on the parsing and planning, as well as on the > network round trips. > Interesting idea. I don't think I understand the implications of doing direct heap/index updates. Would you have to log the visibility information as well? Regards, Jeff Davis
- Previous message: [Slony1-general] Soliciting ideas for v2.0
- Next message: [Slony1-general] Soliciting ideas for v2.0
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the Slony1-general mailing list