Martijn van Oosterhout kleptog at gmail.com
Tue Jan 3 09:31:24 PST 2012
Hoi,

> What version of Slony are you using?
> What version of Postgresql are you using?

Sorry, I should have mentioned: slony-I 2.0.7, PostgreSQL 9.0.3

> I suspect you hit bug #167 (fixed in 2.1.0) that makes sync operations slow
> when sl_log_x gets big.

Well, the symptoms look about right. Might see if I can push to get an
upgrade to at least that version.

> unsubscribe(id=1, receiver=2);
> wait for event(origin=2, confirmed=all, wait on=2);
> subscribe set(set id=1, provider=1,receiver=2,forward=yes);
>
> In 2.1 and above slonik should automatically perform the 'wait for event'.

Ok, that's good to know. Currently we're doing:

slonik_unsubscribe_set -c /etc/slony_tools.conf 1 2 |slonik

which is quite possibly not the recommended way. But the output is
just an unsubscribe with a try .. on error .. block around it.

>> --- unsubscribe happens here ---
>> 2012-01-01 14:37:58 CETCONFIG unsubscribeSet: sub_set=1
>> 2012-01-01 14:37:58 CETCONFIG storeListen: li_origin=1 li_receiver=2
>> li_provider=1
>> --- this is the weird error ---
>> 2012-01-01 14:44:09 CETERROR  remoteWorkerThread_1: "update
>> "_pp_config_rep".sl_setsync set     ssy_seqno = '5000990613',
>> ssy_snapshot =  '20587190:20590885:20587190',     ssy_action_list = ''
>> where ssy_setid in (1) and ssy_seqno<  '5000990613'; " ERROR:  could
>> not serialize access due to concurrent update
>> 2012-01-01 14:44:09 CETERROR  remoteWorkerThread_1: SYNC aborted
>
> If the receivers remoteWorkerThread_1 is aborted because of the unsubscribe
> then I don't see this as an issue.

Given the timing I wonder if it's the subscribe that's actually failing...

>> 2012-01-01 14:44:21 CETCONFIG remoteWorkerThread_1: update provider
>> configuration
>> 2012-01-01 14:44:21 CETCONFIG remoteWorkerThread_1: helper thread for
>> provider 1 terminated
>> 2012-01-01 14:44:21 CETCONFIG remoteWorkerThread_1: disconnecting from
>> data provider 1
>> 2012-01-01 14:44:21 CETCONFIG storeSubscribe: sub_set=1 sub_provider=1
>> sub_forward='t'
>> 2012-01-01 14:44:21 CETCONFIG storeListen: li_origin=1 li_receiver=2
>> li_provider=1
>> 2012-01-01 14:44:21 CETCONFIG remoteWorkerThread_1: update provider
>> configuration
>> --- and here it appears slony thinks it is properly subscribed, but
>> the database has not been resynced ---
>
> What do you base this statement on?   I do not see anything in the logs you
> pasted showing me that slon2 has received the SUBSCRIBE_SET or
> ENABLE_SUBSCRIPTION event.

Well, I'm basing it on that the logs repeated this for two days:

2012-01-02 12:31:47 CETINFO   cleanupThread:    0.004 seconds for cleanupEvent()
NOTICE:  Slony-I: log switch to sl_log_2 complete - truncate sl_log_1
CONTEXT:  PL/pgSQL function "cleanupevent" line 99 at assignment
2012-01-02 12:43:15 CETINFO   cleanupThread:    0.079 seconds for cleanupEvent()
2012-01-02 12:43:15 CETINFO   cleanupThread:    0.006 seconds for vacuuming
NOTICE:  Slony-I: Logswitch to sl_log_1 initiated
CONTEXT:  SQL statement "SELECT "_pp_config_rep".logswitch_start()"
PL/pgSQL function "cleanupevent" line 101 at PERFORM
2012-01-02 12:54:30 CETINFO   cleanupThread:    0.004 seconds for cleanupEvent()
NOTICE:  Slony-I: log switch to sl_log_1 complete - truncate sl_log_2
CONTEXT:  PL/pgSQL function "cleanupevent" line 99 at assignment
2012-01-02 13:05:14 CETINFO   cleanupThread:    0.096 seconds for cleanupEvent()
NOTICE:  Slony-I: Logswitch to sl_log_2 initiated
CONTEXT:  SQL statement "SELECT "_pp_config_rep".logswitch_start()"

So it was doing something, but now you mention it, there are no SYNCs,
so perhaps it wasn't subscribed.

This is the log for the slave, and it says there "storeSubscribe".
doesn't that mean it received a SUBSCRIBE_SET?

> What is sl_subscribe on node 2 at this point in time?
> Does node 2 have a record in sl_event of it having processed those two
> events?

Unfortunately I don't have that information available, but it looks
like pilot error (the user claimed there were no errors) after all:
the subscribe didn't work.

Have a nice day,
-- 
Martijn van Oosterhout <kleptog at gmail.com> http://svana.org/kleptog/


More information about the Slony1-general mailing list