[Slony1-general] data copy for set 1 failed 3 times

Thu Nov 29 13:39:38 PST 2012

On Thu, Nov 29, 2012 at 1:31 PM, Scott Marlowe <scott.marlowe at gmail.com> wrote:

> On Thu, Nov 29, 2012 at 2:27 PM, Tory M Blue <tmblue at gmail.com> wrote:
>
> >>>      > From: Tory M Blue <tmblue at gmail.com <mailto:tmblue at gmail.com>>
> >>>
> >>>      >To: slony1-general <slony1-general at lists.slony.info
> >>>     <mailto:slony1-general at lists.slony.info>>
> >>>      >Sent: Wednesday, 28 November 2012, 18:35
> >>>      >Subject: [Slony1-general] data copy for set 1 failed 3 times -
> >>>     sleep 60 seconds
> >>>      >
> >>>      >
> >>>      >Greetings
> >>>      >
> >>>      >I've just brought up a replication node across the state. We have
> >>>     gig circuits, but traffic still goes over the net through a VPN
> >>>     tunnel, so there is some delay.
> >>>      >
> >>>      >I'm getting these errors and can't get the initial replication to
> >>>     finish:
> >>>      >
> >>>      >4273435:2012-11-27 16:45:17 PST WARN   remoteWorkerThread_1: data copy for set 1 failed 3 times - sleep 60 seconds
> >>>      >
> >>>
> >>>     Is that the only line you get in your logs?  If so use a higher
> >>>     setting for log_level (like log_level=4).
> >>>
> >>>     Also, is there anything in the postgresql log to indicate the
> >>>     problem?
> >>>
> >>>
> >>> I get the following
> >>>
> >>> 2012-11-29 08:34:38 PST CONFIG remoteWorkerThread_1: 3858.988 seconds to copy table "cls"."listings"
> >>> 2012-11-29 08:34:38 PST CONFIG remoteWorkerThread_1: copy table "cls"."customers"
> >>> 2012-11-29 08:34:38 PST CONFIG remoteWorkerThread_1: Begin COPY of table "cls"."customers"
> >>> 2012-11-29 08:34:38 PST ERROR  remoteWorkerThread_1: "select "_admissioncls".copyFields(8);"
> >>>
> >>
> >> What is the structure of the table with table_id=8 (i.e. set add
> >> table(id=8, fully qualified name=?????))?
> >>
> >> If you manually run
> >> select "_admissioncls".copyFields(8);
> >> from psql, what does it come back with?
> >>
> >>
> > So I ran it again with debug = 4: same failure, same spot. It shouldn't be
> > dying due to the logswitch, but I guess it could be?
> >
> > Failed again, this time with heavier debug (level 4), but it's not showing
> > me much:
> > 1235574-2012-11-29 12:22:12 PST CONFIG remoteWorkerThread_1: Begin COPY of table "cls"."customers"
> > 1235665-2012-11-29 12:22:12 PST ERROR  remoteWorkerThread_1: "select "_admissioncls".copyFields(8);"
> > 1235759:2012-11-29 12:22:12 PST WARN   remoteWorkerThread_1: data copy for set 1 failed 1 times - sleep 15 seconds
> > Followed sometime later by this
> > 2012-11-29 12:22:28 PST DEBUG2 remoteWorkerThread_2: forward confirm 3,5001168772 received by 4
> > 2012-11-29 12:22:28 PST INFO   copy_set 1 - omit=f - bool=0
> > 2012-11-29 12:22:28 PST INFO   omit is FALSE
> > And it starts all over again.
> >
> > Postgres logs
> >
> > 2012-11-29 12:19:40 PST    HINT:  Consider increasing the configuration parameter "checkpoint_segments".
> > 2012-11-29 12:22:13 PST admissionclsdb postgres [local] NOTICE:  Slony-I: Logswitch to sl_log_2 initiated
> > 2012-11-29 12:22:13 PST admissionclsdb postgres [local] CONTEXT:  SQL statement "SELECT "_admissioncls".logswitch_start()"
> >     PL/pgSQL function "cleanupevent" line 96 at PERFORM
> >
> > Nothing in /var/log/messages or dmesg; this doesn't appear to be a
> > system-level problem. A Slony configuration setting, the sync rate, or
> > something similar is putting the kibosh on this.
>
> Is there anything useful in the postgresql logs from the same time?
> Like maybe lost connections or crashing backends?
>

Nothing. The postgres logs are clean other than the checkpoint warnings;
there is really nothing else. This appears to be Slony-centric.
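
Since slon reports the failure at the copyFields(8) call, the manual check
suggested earlier in the thread can be run from psql roughly as sketched
below. This assumes the cluster schema is "_admissioncls" and the table id is
8, as the log lines show; connecting to the node the subscriber copies from
(the provider) is also an assumption, so adjust as needed.

```sql
-- Assumption: run in psql against the provider database for set 1.
-- Look up which table Slony has registered under id 8.
SELECT tab_id, tab_nspname, tab_relname
  FROM "_admissioncls".sl_table
 WHERE tab_id = 8;

-- Re-run the exact call that slon logged as failing, so that any
-- error text is shown directly instead of being lost in the slon log.
SELECT "_admissioncls".copyFields(8);
```

If the second statement returns a column list without error when run by hand,
the function itself is probably fine and the connection between the nodes
becomes the more likely suspect.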

The only other thought on this is that the remote db is 9.1.6 and the
local (master) is 9.1.4, but both run the same Slony version, 2.1.1. However,
I and the rest of the world seem to upgrade our databases using this method,
so that can't be it.

I don't know; I'm not sure whether there is some Slony tuning I should be
doing to account for the extra delay.

Tory