Michael Crozier crozierm
Fri Sep 2 01:57:18 PDT 2005
I was having this same problem this morning, after the completed copy of a XX 
million row table from a 7.3.8 database to a 7.4.8.  The copy took 20 hours 
and there were hundreds of thousands of transactions during that time.

As you describe (but far worse on my old hardware), queries of sl_log_1 were 
taking nearly ten minutes.  No vacuuming or analyzing made a difference and 
the queries were "de-optimized" into sequential scans.  In novice's 
desperation, I tried adding a new index on only the log_xid column.  The 
queries began using this index in favor of sequential scans and the 
subscriber is now catching up.

This solution makes little sense, but it certainly worked for my particular 
scenario.

Here is the planner's explanation, after the index:
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=363713.01..363752.42 rows=15762 width=255)
   Sort Key: log_actionseq
   ->  Index Scan using sl_log_1_idx2, sl_log_1_idx2 on sl_log_1  
(cost=0.00..362614.05 rows=15762 width=255)
         Index Cond: (((log_xid < '1750854484'::xxid) AND (log_xid >= 
'1750179272'::xxid)) OR ((log_xid < '1750854484'::xxid) AND (log_xid >= 
'1750179272'::xxid)))
         Filter: ((log_origin = 1) AND
                  (((log_tableid = 93) AND (log_xid < '1750854484'::xxid) AND 
xxid_lt_snapshot(log_xid,'1750179272:1750854484:1750854483,1750804298,1750179272,1750851104,1750433433,1750853878,1750854481'::xxid_snapshot) 
AND (log_xid >= '1750179272'::xxid) AND xxid_ge_snapshot(log_xid, 
'1750179272:1750854278:1750854275,1750854250,1750854274,1750804298,1750179272,1750851104,1750433433,1750853878,1750854276'::xxid_snapshot)) 
OR ((<many many many table id's>) AND (log_xid < '1750854484'::xxid) AND 
xxid_lt_snapshot(log_xid,'1750179272:1750854484:1750854483,1750804298,1750179272,1750851104,1750433433,1750853878,1750854481'::xxid_snapshot) 
AND (log_xid >= '1750179272'::xxid) AND xxid_ge_snapshot(log_xid, 
'1750179272:1750854278:1750854275,1750854250,1750854274,1750804298,1750179272,1750851104,1750433433,1750853878,1750854276'::xxid_snapshot))))
(5 rows)



> I have currently some performance problems, where slony generates a
> query that postgres 7.4.5 does not optimise well
>
> the query generated by slony when it replicates 4 sets from node 1 loocs
> like this (really it is in "declare cursor")
>
> select     log_origin, log_xid, log_tableid,     log_actionseq,
> log_cmdtype, log_cmddata
>   from "_bbb_cluster".sl_log_1
>  where log_origin = 1
> and (
> ( log_tableid in (3,9008,9007,9005,9004,2002,2001)
> and (log_xid < '1312955843'
>      and "_bbb_cluster".xxid_lt_snapshot(log_xid,
> '1312950014:1312955843:''1312955836'',''1312955783'',''1312955806'',''13129
>50014'',''1312952044'''))
>
> and (log_xid >= '1312942023'
>      and "_bbb_cluster".xxid_ge_snapshot(log_xid,
> '1312942023:1312947935:''1312947917'',''1312947924'',''1312942023'',''13129
>46242''')) )
> or
> ( log_tableid in
> (1002,1003,1013,1041,1037,1028,1026,1023,1031,1012,1048,1050,1046,1021,1019
>,1024,1027,1029,1025,1035,1011,1009,1010,1016,1032,1018,1030,1138)
>
> and (log_xid < '1312955843'
>      and "_bbb_cluster".xxid_lt_snapshot(log_xid,
> '1312950014:1312955843:''1312955836'',''1312955783'',''1312955806'',''13129
>50014'',''1312952044'''))
>
> and (log_xid >= '1312942023'
>      and "_bbb_cluster".xxid_ge_snapshot(log_xid,
> '1312942023:1312947935:''1312947917'',''1312947924'',''1312942023'',''13129
>46242''')) )
> or
> ( log_tableid in
> (7001,7008,7007,7004,7039,7002,7030,7018,7038,7003,7005,7006,7009,7011,7012
>,7013,7016,7021,
> 7022,7025,7026,7027,7028,7029,7031,7033,7034,7035,7036,7037,1075,9009,9011,
>9012,9013,9014,
> 9015,9016,9017,1051,1052,1053,1054,1055,1056,1057,1058,1059,1060,1061,1062,
>1063,1064,1065, 1066,1067,1068,1070,1071,1072,1073,1074,1076,1077,1078)
> and (log_xid < '1312955843'
>      and "_bbb_cluster".xxid_lt_snapshot(log_xid,
> '1312950014:1312955843:''1312955836'',''1312955783'',''1312955806'',''13129
>50014'',''1312952044''')) and (log_xid >= '1312942023'
>      and "_bbb_cluster".xxid_ge_snapshot(log_xid,
> '1312942023:1312947935:''1312947917'',''1312947924'',''1312942023'',''13129
>46242''')) )
> or
> ( log_tableid in
> (7051,7050,7052,7053,7054,7055,7056,7057,7058,7059,7060,7061,7062,7063,7064
>,7065,7066,7067,7068,7069,7070,7071,7072,7073,7074) and (log_xid <
> '1312955843'
>      and "_bbb_cluster".xxid_lt_snapshot(log_xid,
> '1312950014:1312955843:''1312955836'',''1312955783'',''1312955806'',''13129
>50014'',''1312952044''')) and (log_xid >= '1312942023'
>      and "_bbb_cluster".xxid_ge_snapshot(log_xid,
> '1312942023:1312947935:''1312947917'',''1312947924'',''1312942023'',''13129
>46242''')) ) )
> order by log_actionseq;
>
> And this is done by postgres a seqscan, fro which the first FETCH runs
> about 60 sec :(
>
> as everything other than "log_tableid in (n,n,n) is the same in all OR'd
> parts, the query  could actually be done as
>
> select     log_origin, log_xid, log_tableid,     log_actionseq,
> log_cmdtype, log_cmddata
>   from "_bbb_cluster".sl_log_1
>  where log_origin = 1
>    and (log_xid >= '1312942023' and "_bbb_cluster".xxid_ge_snapshot
> (log_xid,
> '1312942023:1312947935:''1312947917'',''1312947924'',''1312942023'',''13129
>46242''')) and (log_xid < '1312955843'      and
> "_bbb_cluster".xxid_lt_snapshot (log_xid,
> '1312950014:1312955843:''1312955836'',''1312955783'',''1312955806'',''13129
>50014'',''1312952044''')) and log_tableid in
> (3,9008,9007,9005,9004,2002,2001,1002,1003,1013,1041,1037,1028,1026,1023,10
>31,1012,1048,1050,1046,
> 1021,1019,1024,1027,1029,1025,1035,1011,1009,1010,1016,1032,1018,1030,1138,
>7001,7008,7007,7004,7039,
> 7002,7030,7018,7038,7003,7005,7006,7009,7011,7012,7013,7016,7021,7022,7025,
>7026,7027,7028,7029,7031,
> 7033,7034,7035,7036,7037,1075,9009,9011,9012,9013,9014,9015,9016,9017,1051,
>1052,1053,1054,1055,1056,
> 1057,1058,1059,1060,1061,1062,1063,1064,1065,1066,1067,1068,1070,1071,1072,
>1073,1074,1076,1077,1078,
> 7051,7050,7052,7053,7054,7055,7056,7057,7058,7059,7060,7061,7062,7063,7064,
>7065,7066,7067,7068,7069, 7070,7071,7072,7073,7074)
> order by log_actionseq;
>
> which is done as index scan and runs in 0.4 sec (as measured by explain
> analyse)
>
> So I'd propose that if there are several table sets subscribed from the
> same master, their id lists should be merged before generating the query
> instead of generating a similar query for each set and then OR'ing
> these.

-- 

Conducive Technology Corporation
 Taking air cargo information 
   to the next level

     http://www.conducivetech.com
      http://www.pathfinder-web.com

        Michael Crozier crozierm at conducivetech.com

          Voice: 503.445.4233
            Fax:   503.274.0939



More information about the Slony1-general mailing list