Thu Jan 2 12:33:41 PST 2014
[Slony1-general] 6 hours to replicate single table, 12 hours to replicate DB
On Thu, Jan 2, 2014 at 2:35 PM, Tory M Blue <tmblue at gmail.com> wrote:

> Wondering what settings I need to speed this up. To do a rebuild of a db
> it takes a long time, 6 hours for a single table. No I/O issues, no load,
> just slon and postgres taking their sweet old time. I would like to use
> the resources available to speed this up.
>
> The table is:
>
> 2013-12-21 19:17:58 PST CONFIG remoteWorkerThread_1: Begin COPY of table "impressions"
> 2013-12-21 19:37:03 PST CONFIG remoteWorkerThread_1: 12657163552 bytes copied for table "impressions"
> 2013-12-22 01:40:22 PST CONFIG remoteWorkerThread_1: 22944.144 seconds to copy table "impressions"  <-- 6 hours
>
> Postgres 9.2.4, Slony 2.1.3
>
> This is a larger table, but because of bloat etc. we need to do ground-up
> rebuilds to clean it out every so often (vacuums don't do it).
>
> Slony config, pretty much at default other than the sync interval:
>
> # Check for updates at least this often in milliseconds.
> # Range: [10-60000], default 2000
> sync_interval=1000
> #sync_interval_timeout=10000
> # apply every single SYNC by itself.
> # Range: [0,100], default: 6
> #sync_group_maxsize=6
> #sync_max_rowsize=8192
> #sync_max_largemem=5242880
>
> I either need some advanced settings for when we are doing a rebuild, to
> speed up the process, or some configuration that stays in place during
> normal workloads as well. Under normal workloads things are replicated
> and kept in sync; it's just the rebuild portion. I would like to see it
> actually stressing my boxen :)

The one place where it might be worth modifying configuration is the amount of memory used for sorts. The step *after* the COPY of table "impressions" is to reindex it, and that step likely accounts for most of the remaining 6h of the subscription process. If the reindex takes 4h, raising the maintenance_work_mem GUC might reduce that 4h significantly (a hedged sketch follows below).

http://www.postgresql.org/docs/9.2/static/runtime-config-resource.html#GUC-MAINTENANCE-WORK-MEM

Aside from that, there's not much else to improve in practice, and not terribly much even in theory. We've had discussions (not lately; I don't think anything relevant has changed much since) about multi-threading the SUBSCRIBE/COPY process. The notable idea was to split the REINDEX work off to a series of extra threads so that it could run in parallel. Given that you have one Very Very Large Table, only a limited benefit could come from that: if the biggest table involves 6h of COPY and 4h of REINDEX, then the fastest, in theory, that the subscription could complete is 10h, which probably isn't enough improvement to be worth the added code and fragility that would result.

The last time the concept was discussed, the conclusion was not to implement parallel processing of REINDEX, on the basis that:

a) it would be a fair bit of code, adding potential for new bugs;

b) it would add more potential for lock and deadlock conditions;

c) in the common case where one table's data dominates the database (which sure seems to be the case for you), there's little benefit: the theoretical minimum subscription time is still, at minimum,

   time to COPY the biggest table + time to REINDEX the biggest table,

which makes the optimization pretty futile.

Your case doesn't seem to invalidate that reasoning.

You could get *some* benefit from splitting things into two sets, one consisting of the table "impressions" and the other consisting of the rest. I'd hazard the guess that there's not much need to clean up the tables other than "impressions"; you could drop the 'outage' to the theoretical minimum if you don't bother re-subscribing the other tables.
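Two hedged sketches of the ideas above. First, the maintenance_work_mem change: a per-role setting is one way to reach the slon connections on the subscriber; the role name "slony" and the 2GB value are assumptions, and you could equally set it cluster-wide in postgresql.conf (the 9.2 default is 16MB) and reload:

    -- On the subscriber, assuming the slon daemons connect as role "slony"
    -- (adjust to your setup). New slon sessions pick this up, and the
    -- setting only affects maintenance work such as post-COPY index builds.
    ALTER ROLE slony SET maintenance_work_mem = '2GB';

Second, the two-set split. A rough slonik sketch; the cluster name, conninfo strings, and set/table IDs are all invented, and note that a table already replicated in an existing set would be moved with SET MOVE TABLE rather than re-added:

    # Set 2 holds everything *except* "impressions" and stays subscribed
    # across rebuilds; set 1 ("impressions" alone) is the only set that
    # would ever get dropped and re-subscribed.
    cluster name = replcluster;
    node 1 admin conninfo = 'dbname=mydb host=origin user=slony';
    node 2 admin conninfo = 'dbname=mydb host=subscriber user=slony';

    create set (id = 2, origin = 1, comment = 'all tables except impressions');
    set add table (set id = 2, origin = 1, id = 101,
                   fully qualified name = 'public.some_small_table');
    # ... repeat for the remaining small tables ...
    subscribe set (id = 2, provider = 1, receiver = 2, forward = no);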
I wonder if there's merit to trying to split "impressions" into several tables: one containing the "hot", heavily-updated data, and others containing "cooked"/complete data that doesn't need to be reorganized. It may not be easy to distinguish between data that is complete and data that is still changing a lot, though.
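If such a line can be drawn, a minimal SQL sketch of the shape, assuming rows stop changing after some age; the table names, the created_at column, and the 7-day cutoff are all invented for illustration:

    -- "Hot" rows live in a small table that gets the periodic ground-up
    -- rebuild; rows that have gone immutable migrate to an archive table
    -- that never needs one.
    CREATE TABLE impressions_hot (LIKE impressions INCLUDING ALL);
    CREATE TABLE impressions_archive (LIKE impressions INCLUDING ALL);

    -- Periodically move rows that have stopped changing:
    WITH moved AS (
        DELETE FROM impressions_hot
         WHERE created_at < now() - interval '7 days'
        RETURNING *
    )
    INSERT INTO impressions_archive SELECT * FROM moved;

    -- A view keeps readers unaware of the split:
    CREATE VIEW impressions_all AS
        SELECT * FROM impressions_hot
        UNION ALL SELECT * FROM impressions_archive;

With that shape, "impressions_archive" could live in the replication set that never gets re-subscribed, and only the small hot table would ever need the rebuild treatment.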