Mon Dec 10 15:26:57 PST 2012
- Previous message: [Slony1-general] data copy for set 1 failed 3 times - sleep 60 seconds
- Next message: [Slony1-general] data copy for set 1 failed 3 times - sleep 60 seconds
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Mon, Dec 10, 2012 at 2:55 PM, Jan Wieck <JanWieck at yahoo.com> wrote:
> On 12/10/2012 5:28 PM, Tory M Blue wrote:
>>
>> I'm back to it with debug of 4 on the source and the destination nodes.
>>
>> Still failing.
>>
>> Destination:
>>
>> Slon log:
>>
>> 2012-12-10 12:04:49 PST CONFIG remoteWorkerThread_1: 6525.754 seconds to copy table "cls"."listings"
>> 2012-12-10 12:04:49 PST CONFIG remoteWorkerThread_1: copy table "cls"."customers"
>> 2012-12-10 12:04:49 PST CONFIG remoteWorkerThread_1: Begin COPY of table "cls"."customers"
>> 2012-12-10 12:04:49 PST ERROR remoteWorkerThread_1: "select "_admissioncls".copyFields(8);"
>> 2012-12-10 12:04:49 PST WARN remoteWorkerThread_1: data copy for set 1 failed 1 times - sleep 15 seconds
>> 2012-12-10 12:04:51 PST INFO cleanupThread: 5961.364 seconds for cleanupEvent()
>> 2012-12-10 12:05:06 PST INFO copy_set 1 - omit=f - bool=0
>> 2012-12-10 12:05:06 PST INFO omit is FALSE
>> 2012-12-10 12:05:06 PST CONFIG version for "dbname=clsdb host=server user=postgres password=SECURED" is 90104
>> 2012-12-10 12:05:07 PST DEBUG1 copy_set_1 "dbname=clsdb host=server user=postgres password=SECURED": backend pid = 17092
>> 2012-12-10 12:05:07 PST CONFIG remoteWorkerThread_1: connected to provider DB
>>
>> Postgres logs:
>>
>> 2012-12-10 12:04:51 PST admissionclsdb postgres [local] NOTICE: Slony-I: Logswitch to sl_log_2 initiated
>> 2012-12-10 12:04:51 PST admissionclsdb postgres [local] CONTEXT: SQL statement "SELECT "_admissioncls".logswitch_start()"
>> 2012-12-10 12:05:12 PST admissionclsdb postgres [local] LOG: sending cancel to blocking autovacuum PID 18620
>> 2012-12-10 12:05:12 PST admissionclsdb postgres [local] DETAIL: Process 18299 waits for AccessExclusiveLock on relation 17410 of database 16385.
>> 2012-12-10 12:05:12 PST admissionclsdb postgres [local] STATEMENT: lock table "cls"."listings";
>> 2012-12-10 12:05:12 PST ERROR: canceling autovacuum task
>> 2012-12-10 12:05:12 PST CONTEXT: automatic vacuum of table "admissionclsdb.cls.listings"
>> 2012-12-10 12:05:12 PST admissionclsdb postgres [local] NOTICE: truncate of "cls"."autobodystyle" succeeded
>
> What is the output of
>
>     select "_admissioncls".copyFields(8);
>
> on node 1? It is important to do this on node 1, because that is where the
> remote worker processing the copy_set is doing this.

On node 1? In my configuration that's the primary master, which this node is
not talking to. This node is talking to node 2 (our slave):
1 -> 2 -> 10 (this offsite node), and 1 -> 3/4.

But on node 1:

admissionclsdb=# select "_admissioncls".copyFields(8);
 copyfields
------------------------------------------------------------------------------
 ("cust_seq_id","customer_id","customer_name","disabled","account","default_status","impression_credits","cost_fractional_cpm","acct_list","pricing_type","pricing_model","subscription_price","monthly_price","monthly_minimum","account_type")
(1 row)

Node 2:

admissionclsdb=# select "_admissioncls".copyFields(8);
 copyfields
------------------------------------------------------------------------------
 ("cust_seq_id","customer_id","customer_name","disabled","account","default_status","impression_credits","cost_fractional_cpm","acct_list","pricing_type","pricing_model","subscription_price","monthly_price","monthly_minimum","account_type")
(1 row)

And note: when I try to skip set 1 and just go to set 2, that set has a table
that is 4x as large, and the resulting field number is 4 vs. the 8. So it's
not this particular table. I can sync set 3 just fine (but it's small and
finishes within 30 minutes).

>>
>> Source:
>>
>> There is nothing in the slon log that would tell me it's aware of this
>> client-side restart or failure. Nothing in the postgres logs about an
>> EOF or anything. Nothing in the postgres log either that says there is
>> an issue.
>
> Please specify which postgres log you are talking about. There are at
> least two in this case: the one on the new node and the one on the data
> provider (node 1). The one from the data provider is where to expect any
> error messages.

Sorry, I was trying to be clear with "destination" and "source". Source is
node 2; nothing in today's main postgresql log nor in the slon log.
Destination is node 10; nothing in the postgresql log, and I've posted the
relevant bits from the slon log.

Thanks for not giving up on me,
Tory
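Jan's check (run copyFields() on each node and compare) can be scripted so the comparison is mechanical rather than eyeballed. A minimal sketch, with assumptions clearly marked: the `node1`/`node2` hostnames and the `/tmp` file paths are placeholders (not from the thread), and the live psql invocations are shown as comments since they need running servers; the field list used for illustration is the one posted above.

```shell
#!/bin/sh
# Sketch: check whether two Slony nodes agree on the copyFields() output.
# A mismatch in the field list between nodes would break the set COPY.
#
# On a live cluster you would capture the output from each node, e.g.
# (hostnames are placeholders; -A = unaligned, -t = tuples only):
#
#   psql -h node1 -U postgres -d admissionclsdb -Atc \
#       'select "_admissioncls".copyFields(8);' > /tmp/node1.fields
#   psql -h node2 -U postgres -d admissionclsdb -Atc \
#       'select "_admissioncls".copyFields(8);' > /tmp/node2.fields
#
# For illustration, populate both files with the output posted in the thread:
fields='("cust_seq_id","customer_id","customer_name","disabled","account","default_status","impression_credits","cost_fractional_cpm","acct_list","pricing_type","pricing_model","subscription_price","monthly_price","monthly_minimum","account_type")'
printf '%s\n' "$fields" > /tmp/node1.fields
printf '%s\n' "$fields" > /tmp/node2.fields

# diff -q is silent and exits 0 only when the two captures are identical
if diff -q /tmp/node1.fields /tmp/node2.fields >/dev/null; then
    echo "copyFields output matches"
else
    echo "copyFields output differs"
fi
```

Since both nodes in the thread returned the identical field list, a diff like this would come back clean, which supports Tory's point that the failure is not a schema mismatch on this table.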