[Slony1-general] really inefficient queries with slony 1.1 and postgres 7.4

Fri Sep 2 20:07:44 PDT 2005

Michael Crozier wrote:

>>If there's any way of generating some replicable test scenarios that do
>>indeed exercise the above form of slow query, then it would be great to do
>>some testing with some extra indices to see which make a difference.
>>    
>>
>
>Would the optimizer even use tableid in an index for a query that uses 
>"tableid in (many tableid's)"?  Given that sets often have dozens of tables, 
>I doubt it.
>  
>
I agree, it seems unlikely that the table ID column would be of any use
in an index intended for improving search efficiency.

>Especially on my slower machine, combining the two "set parts" of the query 
>produce the best results**.  By combining the tableid filter statements to 
>include the list of the tables from all subscribed sets, query time is 
>reduced about 60%.  I think this indicates that  a significant amount of time 
>(probably CPU time) is spent filtering the rows for tableid's.  This would 
>also explain why the problem is greatest on my old Sun 420, but lessened on a 
>new Opteron, when the two have similar disk configurations.
>
>
>
>** When I say combining the "set parts", mean what was described earlier in 
>this thread, manually reducing the common substatements of the query in ways 
>that the optimizer doesn't seem able to:
>
>Instead of this:
>========================================================
> select [columns] from .sl_log_1
> where 
>   log_origin = <origin>
>   and
>     (   
>        (   log_tableid in (  [tableid's from set 1]  and  <xid filter> )
>        or
>        (   log_tableid in (  [tableid's from set 2]  and  <xid filter> )
>      )
>     ) order by log_actionseq;
>
>This:
>========================================================
> select [columns] from .sl_log_1
> where 
>   log_origin = <origin>
>   and  log_tableid in  [tableid's from sets 1 & 2]
>   and  <xid filter> 
> order by log_actionseq;
>  
>
The thing is, the second query /isn't/ equivalent to the first one.

I expect that the results of the queries will be the same, but in order
for it to be appropriate to actually change the query, we need to verify
that the results are "provably" the same.

I'll have to have a chat with Jan about that; 'tis on my list...

But in any case, it seems to me that we get the Big Gain from adding in
an index on the XID column.  If that gets us 80% of the improvement, it
may not be worth improving the query.