Christopher Browne cbbrowne
Tue Apr 4 14:07:55 PDT 2006
Scott Marlowe <smarlowe at g2switchworks.com> writes:

> On Tue, 2006-04-04 at 15:32, Andrew Sullivan wrote:
>> On Tue, Apr 04, 2006 at 04:16:12PM -0400, Jan Wieck wrote:
>> > int4/int8 column one. Sure, that scan will be more expensive. But I 
>> > think Andrew forgot that using an artificial serial column means that 
>> > your origin and subscribers must maintain an additional index on all 
>> > insert/update operations (and vacuum must clean it up after deletes).
>> > 
>> > Andrew, do you have actual experience that it will perform better, or 
>> > was that an assumption?
>> 
>> No, I didn't forget; but I probably should have made this clearer. 
>> The real question is what the use patterns are.
>> 
>> For cases where you're likely to change _all_ the columns at once
>> (think INSERT only, for instance), the overhead for the additional
>> index is likely not worth it.
>> 
>> For cases where you insert once, but then some (but not all) of the
>> values change over time, then depending on your hardware, use
>> patterns, and network speeds, the single column might be worth it.  A
>> six-column key where 5 the items are longish pieces of text and one
>> is a status value, for instance, would be not only expensive to look
>> up: it will also be expensive to ship on the network.
>
> Also, if the text values are large enough, they'll overrun the 1/3 of a
> block size limit and your insert / update will fail.

We're getting very excited about the issues surrounding how this works
in the most general, pathological cases.

Of course, Miguel showed us an example which involved a table
consisting of 4 int4 values and two time values.  Which sounds to me
like something where tuples will consist of somewhere around 25-30
bytes of data, so that the index entries oughtn't cause any massive
problems.

If horarios_general is a table of millions of entries, where thousands
get updated each hour, there is sense in agonizing over whether or not
to add a serial column as a surrogate primary key.

If, on the other hand, it's going to have a few rows modified per day,
it isn't worth thinking too hard about it, and the difference between
a single serial PK and a composite of his six columns will be
immaterial.
-- 
let name="cbbrowne" and tld="ca.afilias.info" in String.concat "@" [name;tld];;
<http://dba2.int.libertyrms.com/>
Christopher Browne
(416) 673-4124 (land)



More information about the Slony1-general mailing list