Simon Riggs simon at 2ndquadrant.com
Fri Oct 26 08:58:15 PDT 2007
On Fri, 2007-10-26 at 10:44 -0400, Christopher Browne wrote:

> >> > It would be straightforward to remove the (col1, col2, col3) text from
> >> > each INSERT statement since that is optional. That would reduce the
> >> > overhead of each INSERT row and reduce the parsing time on the other
> >> > side also.
> >> 
> >> Very unsafe.  What if the subscriber decides to consider the columns
> >> to be in a different order?  Do we need to go back and ask for the
> >> "reordering columns" feature that periodically pops up?
> 
> FYI, there is a further downside to the removal: It means you cannot
> have any case where you "hack" a subscriber to have an extra column.
> It also breaks the case where you use log shipping to generate a
> temporal database, where there are additional temporal columns.
> 
> >> I see a much bigger win in Jan's idea to use COPY to get sl_log_n
> >> data to the subscriber As Fast As Parsing Allows, and then use rules
> >> on sl_log_n to generate INSERT/UPDATE/DELETE requests on the
> >> subscriber to do the work.  That would take a lot of the load off
> >> the provider, and COPY seems likely to be way faster than other
> >> rewritings.

If removing the list of columns is a problem, how would the COPY idea
work? Or would you supply the list of columns as well as the data? I
take the point that the table definitions may differ.

We already have a list of the tables we replicate, so why not store the
list of columns being replicated also? We need only use the list when
the definitions differ.
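
To make that concrete, here is a rough sketch of the idea in Python. It is not Slony-I code: the REPLICATED_COLUMNS catalogue, the build_insert helper and the example table are all hypothetical names made up for illustration, and the literal quoting is deliberately simplistic. The column list is stored once per replicated table and is only written into the statement when the subscriber's definition differs from the origin's.

```python
# Hypothetical sketch only: none of these names exist in Slony-I.

# Column list stored once per replicated table (assumed catalogue).
REPLICATED_COLUMNS = {
    "public.accounts": ["id", "name", "balance"],
}


def quote_literal(value):
    """Very simplified SQL literal quoting, for illustration only."""
    if value is None:
        return "NULL"
    return "'" + str(value).replace("'", "''") + "'"


def build_insert(table, values, subscriber_columns):
    """Build an INSERT, adding the column list only when it is needed."""
    origin_columns = REPLICATED_COLUMNS[table]
    literals = ", ".join(quote_literal(v) for v in values)
    if subscriber_columns == origin_columns:
        # Same columns in the same order: the list is redundant.
        return f"INSERT INTO {table} VALUES ({literals});"
    # Definitions differ (extra or reordered columns): name the columns.
    column_list = ", ".join(origin_columns)
    return f"INSERT INTO {table} ({column_list}) VALUES ({literals});"
```

A subscriber that has been "hacked" to carry an extra temporal column would take the second branch and keep working, while an identical subscriber gets the shorter statement.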

Shipping the list of columns with every row means each name has to be
checked on the other side, which adds parse time as well as extra log
table space and network bandwidth.
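
As a rough illustration of the size overhead only (it says nothing about parse time), the following Python sketch compares the text of one INSERT with and without the column list. The table name t, the column names and the column_list_overhead helper are made up for the example.

```python
def column_list_overhead(table, columns, values_sql):
    """Return the number of extra characters the column list adds to one INSERT."""
    with_columns = (
        f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({values_sql});"
    )
    without_columns = f"INSERT INTO {table} VALUES ({values_sql});"
    return len(with_columns) - len(without_columns)


# Three short column names, as in the (col1, col2, col3) example.
extra_chars = column_list_overhead("t", ["col1", "col2", "col3"], "1, 2, 3")
```

Here extra_chars is 19, the length of "(col1, col2, col3) ", and that cost is repeated for every replicated row. Real tables with longer names and more columns would carry proportionally more.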

So perhaps we should remove the column names as a standalone
optimisation, as well as a precursor to the COPY suggestion.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


