Simon Riggs simon at 2ndquadrant.com
Mon Jan 28 08:36:07 PST 2008
On Mon, 2008-01-28 at 10:41 -0500, Christopher Browne wrote:
> I don't see a benefit in dividing this process up; if it's a
>       "win" to do it on the subscriber, for the heavily updated
>       table(s), then should it not also be an improvement for the
>       lightly-updated tables?

The logic for doing 1 table-in then 1 table-out is very simple and Slony
already does exactly that during copy_set(). The data is reused
immediately without needing to unbuffer/rebuffer before sending, as you
would need to do if everything was in one stream. 

So I was thinking about refactoring the copy_set() logic slightly to
allow a 1-table-in/1-table-out function that could then be called both
from copy_set() and during a normal sync.

ISTM doing 1-table at a time would be simpler and faster, and would
probably address the main cause of wanting to use COPY in the first
place: a few INSERT-heavy tables.

Thinking slightly more off-the-wall, maybe Slony could read large tables
directly, rather than logging them first. That would avoid lots of
additional code to support the "per-table log tables" described above.
Avoiding the logging table completely would also be a great performance
boost for Slony.

The sl_log columns are
	log_origin			int4,
	log_xid				@NAMESPACE@.xxid,
	log_tableid			int4,
	log_actionseq		int8,
	log_cmdtype			char,
	log_cmddata			text
If we do this for a single table then tableid and origin are known and
rows already have xmin on them. We do this *only* for INSERTs so cmdtype
is known and the rest of the row is the cmddata. So all we would need to
do is add an actionseq column onto a table to allow it to be its own log
table. We can then COPY the table directly, minus the actionseq column.
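As a rough sketch, the per-table change could look like this (table,
sequence, and column names are invented for illustration; they are not
Slony's actual schema):

```sql
-- Sketch only: make an INSERT-heavy table its own log table by
-- adding an action-sequence column.  All names are hypothetical.
CREATE SEQUENCE my_table_actionseq;

ALTER TABLE my_big_table
    ADD COLUMN log_actionseq int8
        NOT NULL DEFAULT nextval('my_table_actionseq');

-- Ship the data directly, minus the bookkeeping column:
COPY my_big_table (id, payload) TO STDOUT;
```

Note that on current PostgreSQL releases, adding a column with a
non-null default rewrites the whole table, so this would presumably be
done once at subscribe time rather than on a live table.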

That way we just need to tell Slony which tables are "self-logging",
possibly by adding a new status column to sl_table. You'd need to build
an index on the table columns (xmin, actionseq), but that's fine if you
use the right btree operator class, as we already do for sl_log.

Deletes and Updates would go through the normal route. We would use a
normal logging table if the data is being relayed.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com



More information about the Slony1-general mailing list