Christopher Browne cbbrowne
Mon May 9 16:05:25 PDT 2005
Hannu Krosing wrote:

>On P, 2005-05-08 at 18:03 -0400, Christopher Browne wrote:
>  
>
>>Here is a patch to remote_worker.c which comes very near to dropping
>>and restoring indices on tables during the COPY_SET event.
>>
>>The notion: Build a query (it's pretty big, but that cuts way down on
>>the C "string slinging") that calculates two things for each table:
>>
>>1.  A query to drop each PK constraint/other index
>>
>>2.  A query to restore each PK constraint/other index
>>
>>Do the "drops" just before copying all the data, and then generate the
>>indices from scratch afterwards, and this should improve performance
>>of COPY_SET quite a bit.
>>
>>It finds the queries, but I'm not yet sure why it doesn't work.
>>
>>On "pgbench" samples, it gripes, upon recreating the primary key
>>constraint, that there already is one.
>>
>>I'm seeing in logs that the PK _was_ just dropped, via the statement:
>>  "alter table public.accounts drop constraint accounts_pkey;
>>
>>It is quite unclear why it then breaks :-(.
>>    
>>
>
>Are you sure that the drop and restore are in the same transaction ?
>
>  
>
Quite sure.

COPY_SET runs as one transaction so that it either all occurs or all
rolls back as one change.

>Or maybe it is just this ?:
>
>hannu=# alter table a drop constraint a_pkey;
>NOTICE:  constraint $1 on table b depends on index a_pkey
>ERROR:  cannot drop constraint a_pkey on table a because other objects
>depend on it
>HINT:  Use DROP ... CASCADE to drop the dependent objects too.
>
>This is similar to why you cant use TRUNCATE and probably something that
>requires some new features in backend. 
>
>  
>
2005-05-09 10:13:42 EDT ERROR  remoteWorkerThread_11: " alter table
public.accounts add  PRIMARY KEY (aid);" PGRES_FATAL_ERROR ERROR: 
multiple primary keys for table "accounts" are not allowed

The problem isn't taking place at "drop constraint" time; it is
occurring when I try to recreate it.

Ah.  I think I see it now...

        res2 = PQexec(loc_dbconn, dstring_data(&indexregenquery));
        if (query_execute(node, loc_dbconn, &indexregenquery) < 0)
        {
            PQclear(res2);
            slon_disconnectdb(pro_conn);
            dstring_free(&indexregenquery);
            return -1;
        }

That executes the "regenerate indices" query twice, which would
presumably be the problem.  If I comment out the first line there,
everything's "green."

I have added a config variable (drop_indices) to allow the use of this
to be configured, and have added it in, with default being to use it. 

It is triply useful when TRUNCATE cannot be used in that:
 1.  It means the DELETE statement doesn't need to do any work on indices
 2.  It means the COPY statement doesn't touch the indices
 3.  Even if there are a lot of dead tuples in the table, only the live
ones wind up expressed in the indices that get regenerated.

There are probably some further "locality of reference" benefits in that
all the dead tuples in 3. will be at the physical start of the table
whereas the new, live tuples will mostly be in a somewhat sequential
order at the physical end of the table. 

It's just too bad that we can't do a nested VACUUM (7.3 doesn't support
that) to clean out the dead tuples at the start of this process...


More information about the Slony1-general mailing list