hannu at skype.net hannu
Wed Sep 29 07:16:26 PDT 2004
> On 9/28/2004 6:45 PM, hannu at skype.net wrote:
>> I've got a problem:
>>
>> One of my 24/7 databases has access requirements / performance
>> constraints, which make it impossible for me to do the standard
>> initial hot-join (copy) on some multi-million row tables as it messes
>> up pg and os caches in a way that makes standard production queries
>> too slow.
>
> Whow, this is the first time I see somebody complaining that the initial
> COPY is working too fast :-)
>
> Anyhow, I do see your problem ... your DB suffers from cache eviction
> due to sequential scan. This should go away once you are on 8.0 with a
> properly configured ARC, but I assume you don't have 8.0BETA in
> production.

I was hoping for 8.0 to come out before this problem hit us too hard, but
obviously it did not ;(

I even thought of applying the dev patches for this problem to 7.4.

>> Is there any way I can either
>>
>> 1) make the initial copy less aggressive - can I somehow throttle
>> it or perform it in smaller chunks ?
>
> The code that does the COPY is in src/slon/remote_worker.c in the
> function copy_set(). What it does is to execute a COPY table TO STDOUT
> on the provider and a COPY table FROM STDIN on the subscriber. Then it
> loops and copies over the data stream. You could try and insert some
> usleep() into that loop.

Thanks, I may try this.
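
Roughly what I have in mind is sketched below. This is not the real
copy_set() code; throttled_copy() and its parameters are hypothetical
names, and plain stdio streams stand in for the provider and subscriber
connections. It only shows the idea of sleeping after each chunk so the
copy cannot monopolize the disks and caches.

```c
#define _DEFAULT_SOURCE   /* exposes usleep() on glibc */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for the copy loop: read from `in`, write to `out`,
 * and sleep `pause_us` microseconds after every chunk.
 * `pause_us` should stay below 1000000, as POSIX requires for usleep().
 * Returns the number of bytes copied, or -1 on an I/O error.
 */
long throttled_copy(FILE *in, FILE *out, size_t chunk_size,
                    unsigned int pause_us)
{
    char   buf[8192];
    long   total = 0;
    size_t n;

    assert(in != NULL && out != NULL);

    /* Clamp the chunk size to the buffer we actually have. */
    if (chunk_size == 0 || chunk_size > sizeof(buf))
        chunk_size = sizeof(buf);

    while ((n = fread(buf, 1, chunk_size, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;
        total += (long) n;

        /* The throttle: give production queries some room to breathe. */
        if (pause_us > 0)
            usleep(pause_us);
    }

    return ferror(in) ? -1 : total;
}
```

With 8192-byte chunks and a 10 ms pause this caps the stream at about
800 kB/s (8192 bytes / 0.01 s), ignoring the time spent in the I/O
itself, so the pause would have to be tuned against how long the
initial copy is allowed to take.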

>> or
>>
>> 2) avoid it altogether - could I load regular backup and start
>> replication from there - somehow pg_dump seems to do fine ?
>
> No, you can't avoid it. And this is the part of your email that I don't
> understand. How can pg_dump do fine?

I think the (daily) pg_dump is ok because it is run at the time of least
activity, and because it runs fast enough to finish entirely within that
window.

> It should even be worse because it
> doesn't get slowed down by the subscriber building indexes on the fly.

Could we somehow speed up our initial copy by dropping the PK and other
indexes during COPY and rebuilding them later ? I tried to do it manually
after setting up replication but before subscribing, but the subscription
also checked for an existing PK.

> It is not that both databases are on the same physical box, right?

No, they are on two separate boxes.

------------
Hannu



More information about the Slony1-general mailing list