[Slony1-general] Strange behavior adding a new node, very, VERY slow

Tue Aug 5 01:31:57 PDT 2008

This is where it starts getting weird.

There are no errors anywhere in any log.

In pg_stat_activity I just see 3 transactions sitting idle (one to each
of the three other nodes: master, node 2 and node 3).
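
A minimal sketch of how those sessions could be listed and told apart,
assuming the PostgreSQL 8.2 layout of pg_stat_activity (columns procpid
and current_query, with idle backends reported as '<IDLE>' or
'<IDLE> in transaction'). The helper names classify and summarize are
hypothetical, and the rows are assumed to come from whatever database
driver is in use. The distinction may matter here because the initial
copy runs inside one transaction.

```python
# Query text for an 8.2-era server; later releases renamed these columns.
ACTIVITY_SQL = """
SELECT procpid, current_query
FROM pg_stat_activity
ORDER BY procpid
"""


def classify(current_query):
    """Map a current_query value to a coarse session state (hypothetical helper)."""
    text = (current_query or "").strip()
    if not text:
        return "unknown"
    if text == "<IDLE> in transaction":
        return "idle in transaction"
    if text == "<IDLE>":
        return "idle"
    return "active"


def summarize(rows):
    """Count sessions per state; rows are (procpid, current_query) pairs."""
    counts = {}
    for _procpid, current_query in rows:
        state = classify(current_query)
        counts[state] = counts.get(state, 0) + 1
    return counts
```

Feeding the result rows of ACTIVITY_SQL to summarize() gives a per-state
count, which makes it easier to see whether anything is actually running
a statement at a given moment.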

It is using exactly the same schema as I used for node 2 (it's on the
same machine, so I used the same one!).

And it's not restarting the copy; it is just being very, very slow, and
about 99% of the time it's doing nothing...
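
For scale, a rough back-of-the-envelope sketch of the slowdown, using
the figures from the original post quoted below (about 3.5 gigs of
replicated data, 20 minutes for node 2, 22 hours for one third of
node 4). It assumes "1/3rd done" refers to data volume and that both
syncs copy the same amount of data; the function name is hypothetical.

```python
def copy_rate_mb_per_s(megabytes, seconds):
    """Average copy rate in MB/s over the given duration."""
    return megabytes / seconds


DATA_MB = 3.5 * 1000  # roughly 3.5 gigs of replicated data

# Node 2: full sync in 20 minutes.
node2_rate = copy_rate_mb_per_s(DATA_MB, 20 * 60)

# Node 4: about one third of the data after 22 hours (assumption: by volume).
node4_rate = copy_rate_mb_per_s(DATA_MB / 3, 22 * 3600)

slowdown = node2_rate / node4_rate   # how many times slower node 4 is
projected_hours = 22 * 3             # full sync time at the current pace

if __name__ == "__main__":
    print("node 2: %.2f MB/s" % node2_rate)
    print("node 4: %.4f MB/s" % node4_rate)
    print("slowdown: about %.0fx" % slowdown)
    print("projected total: %d hours" % projected_hours)
```

Under those assumptions node 4 is copying at well under 0.1 MB/s, which
is far below what the network or disks on either machine should allow.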

It's kind of freakish, actually...

:/

Stéphane A. Schildknecht wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Martin Eriksson wrote:
>   
>> Sorry,
>>
>> I should mention that this is Postgres 8.2.4, and Slony 1.2.14
>>
>> Martin Eriksson wrote:
>>     
>>> Hi everyone.
>>>
>>> I've been using Slony for a while now and feel pretty confident with
>>> what I'm doing, but I cannot understand what is going on now!
>>>
>>> current setup:
>>> 1 Master
>>> 2 slave1 (provider = 1)
>>> 3 slave2 (provider = 1)
>>>
>>> adding a new node 4  (provider = 1)
>>>
>>> The machines are on the same hardware; all are pretty nice machines
>>> with 8 gigs of RAM each. The master has 6 gigs allocated to Postgres,
>>> the slave machines have 3.2 gigs allocated. All are running 64-bit
>>> Ubuntu.
>>>
>>> The database is a total of 7.9 gigs (including the Slony schema; the
>>> data that needs to be replicated is around 3.5 gigs).
>>>
>>> master and slave 1 are sitting next to each other connected with a 1
>>> GB/s line on a separate interface.
>>>
>>> Now for node 4: I created a new Postgres installation on the slave 1
>>> machine, running on a different port with the same memory allocation
>>> (3.2 gigs), so total memory usage on that machine by the two Postgres
>>> servers is 6.4 gigs (still 1.4 gigs free).
>>>
>>> On Saturday I synced up node 2 from scratch and it took a total of
>>> 20 minutes.
>>>
>>> On Sunday afternoon the database was put into production and is being
>>> used. It's not an overly busy database: around 18000 Slony events per
>>> 24h, with a total of 2000-3000 db commits on the master per 24h.
>>>
>>> So yesterday morning I started to sync node 4, and now, 22h later, it
>>> is still running!!! And it's only 1/3 done!!!
>>>
>>> Does anyone have a good explanation for this?
>>>
>>> Looking at the slave 2 machine: load is 0.2-0.4, memory is available,
>>> it is only using a fraction of the bandwidth, and I/O stats are low.
>>> It is more or less the same for the master: low CPU load, low I/O
>>> load, and low bandwidth usage.
>>>
>>> Looking at the db, it appears that it's trying to do EVERYTHING in a
>>> single transaction, as tables that have already been copied still
>>> show count(*) = 0. Is there a way to not do everything in a single
>>> transaction??
>>>
>>> Or does anyone have some other idea??
>>>
>>>       
>
> Do you have any error messages?
> As you noticed, the first synchronisation is done in a single transaction.
> That's why any failure (network failure, schema not exactly the same on both
> nodes...) will interrupt replication and make it start from scratch again and
> again.
>
> On further reading, I don't think it can be a network problem.
> How did you get the schema for that new slave?
>
> A quick look at pg_stat_activity may tell you which table is being synchronized.
>
> Regards,
> - --
> Stéphane Schildknecht
> PostgreSQLFr : http://www.postgresql.fr
>
> Come and meet us on 4 October at the largest French-speaking
> PostgreSQL event: http://www.pgday.fr
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.6 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iD8DBQFImAZaA+REPKWGI0ERAn1dAJ0VA5GY04W5Bl96pEk1GcuFHAkf2gCfQQdk
> y12rN2fShxthch5cMtJn5Ek=
> =qIRa
> -----END PGP SIGNATURE-----