[Slony1-general] primary key for slony

Tue Apr 4 14:19:50 PDT 2006

On Tue, Apr 04, 2006 at 05:07:55PM -0400, Christopher Browne wrote:
> 
> If horarios_general is a table of millions of entries, where thousands
> get updated each hour, there is sense in agonizing over whether or not
> to add a serial column as a surrogate primary key.
> 
> If, on the other hand, it's going to have a few rows modified per day,
> it isn't worth thinking too hard about it, and the difference between
> a single serial PK and a composite of his six columns will be
> immaterial.

Right.  Which is what I guess I said, tersely, the first time: you
might find it'll be faster.  It's something worth testing, under some
circumstances, if you understand what the consequences are.

The important thing to realise here is that this is a function of
several variables:

1.	The cost of (additional) indexes and fields.

2.	The cost of a lookup in a multi-column index (particularly if
	the early-listed columns have poor selectivity, in my
	experience).

3.	The cost of shipping the index criterion on the network.

Many people may not realise the import of (3): when you update or
delete a row to be replicated by Slony, what Slony actually sends on
the wire is a query altering the relevant columns (or deleting the
row) and selecting by the primary key.  If you have widely
distributed nodes with lots of fast disk, but very slow networks
between them, this could be a factor that comes into play.  It's not
likely, I expect, but it's worth considering in your thinking.

In most cases, the multi-column key is an excellent choice.

A

-- 
Andrew Sullivan  | ajs at crankycanuck.ca
I remember when computers were frustrating because they *did* exactly what 
you told them to.  That actually seems sort of quaint now.
		--J.D. Baldwin