Christopher Browne cbbrowne at ca.afilias.info
Fri Nov 30 08:39:08 PST 2007
Brian Hirt <bhirt at mobygames.com> writes:
> I'm not really intimate with slony, but it seems like there should be
> a better way to drop a table from a set for situations like this.
> Maybe an option that clears pending updates from the log?

An "easy answer" would be to determine the table ID, and delete all
tuples relating to that table from sl_log_1 and sl_log_2.

Thus, ...

   delete from _mobycluster.sl_log_1 where log_tableid = 18;
   delete from _mobycluster.sl_log_2 where log_tableid = 18;

I would be inclined to run these two queries against *all* nodes, in
order to "nip this" completely.

That would indeed address your issue *very* quickly, as that would
drop out all replication data relating to "table 18."
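
If the same pair of deletes has to go to several nodes, it may be less
error-prone to generate them than to retype them.  A minimal sketch in
Python, where purge_statements() is a hypothetical helper of my own (it
is not part of Slony-I), and the cluster name and table ID are just the
ones from the example above:

```python
def purge_statements(cluster, table_id):
    """Build the DELETEs that clear one table's pending rows from both logs.

    Hypothetical helper, not part of Slony-I.  `cluster` is the cluster
    name without the leading underscore; `table_id` is the Slony table ID.
    """
    table_id = int(table_id)  # refuse anything that is not a number
    if not cluster.replace("_", "").isalnum():
        raise ValueError("unexpected cluster name: %r" % cluster)
    return [
        "delete from _%s.%s where log_tableid = %d;" % (cluster, log, table_id)
        for log in ("sl_log_1", "sl_log_2")
    ]


# Print the statements so they can be fed to psql on each node in turn.
for stmt in purge_statements("mobycluster", 18):
    print(stmt)
```

This only builds the text of the statements; connecting to each node
and running them is left out on purpose.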

How to apply that to "SET DROP TABLE" seems rather more controversial
to me...  I agree that you have described a "use case" for something
useful, but I wouldn't change existing functionality without a fair
bit more discussion.

There are essentially two ways we might have SET DROP TABLE work:

1.  As you suggest, where it is treated as an event to process as
early as possible, perhaps *before* events that were submitted ahead
of it.  The intent is to expunge the table from replication without
caring about the consistency of the contents of the table.

In effect, "Oops, this table's blown!  Let's drop it out of
replication ASAP, and we don't care what has or hasn't been replicated
in this table."

2.  The existing way, where SET DROP TABLE is processed in sequence
with other events, so that the table drops out of replication at a
consistent point on all nodes.

The expectation is that the dropped table will contain identical data
across all subscribers at the time that it is dropped from
replication.

There are, indeed, uses for both of these approaches.

There is an item on my work list to create a "CANCEL SUBSCRIPTION"
command which would be somewhat similar to this.

-> Like your form of "SET DROP TABLE," it would be intended to be
   invoked with high priority, ASAP.

-> It would try to clean things up vis-a-vis the configuration
   of the subscription, but would not make any particular
   attempt to encourage consistency of data within the replicated
   tables.

Actually, I think I would want the handling of what you're asking
for to be consistent with "CANCEL SUBSCRIPTION" in a couple of ways:

  - It should use "CANCEL TABLE", and be entirely distinct from SET
    DROP TABLE

  - There might be further "CANCEL" commands; all of them should be
    consistent in that they are processed with high priority
    (i.e., before all non-CANCEL events), and in that their
    immediacy overrides data consistency.

I think that we only really need "CANCEL SUBSCRIPTION," "CANCEL
TABLE," and, *maybe*[1], "CANCEL SEQUENCE."

Further thought suggests that there *are* more events that we should
consider processing with heightened priority.  It seems to me that
STORE PATH is quite likely to get used in an attempt to fix
configuration and that it may be preferable to make sure STORE PATH
gets run before any SYNC event goes through.

That points to a notion of evaluating our events to see which are
"high priority," and always processing them first.
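
To make that notion concrete, here is a rough sketch in Python of what
"high priority first" ordering could look like.  Everything in it is
an assumption for illustration: the CANCEL events do not exist yet, the
event type strings and the contents of HIGH_PRIORITY are placeholders,
and this is not how slon actually schedules events.

```python
# Assumption: which event types count as "high priority" is illustrative only.
HIGH_PRIORITY = {
    "CANCEL_SUBSCRIPTION",
    "CANCEL_TABLE",
    "CANCEL_SEQUENCE",
    "STORE_PATH",
}


def processing_order(events):
    """Return events with high-priority ones first.

    `events` is a list of (sequence_number, event_type) pairs.  Within
    each group (high priority, everything else) the original submission
    order, as given by the sequence number, is preserved.
    """
    return sorted(
        events,
        key=lambda ev: (0 if ev[1] in HIGH_PRIORITY else 1, ev[0]),
    )


pending = [
    (101, "SYNC"),
    (102, "SYNC"),
    (103, "SET_DROP_TABLE"),
    (104, "STORE_PATH"),
    (105, "SYNC"),
    (106, "CANCEL_TABLE"),
]

for seqno, ev_type in processing_order(pending):
    print(seqno, ev_type)
```

With the sample queue above, STORE_PATH (104) and CANCEL_TABLE (106)
come out ahead of the SYNCs, while SET_DROP_TABLE (103) keeps its place
among the ordinary events, which matches the "consistent place"
behaviour it has today.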

Footnotes: 

[1] Only "maybe," because there's never more than 1 tuple to update,
so it's not a terribly important event to be able to support at "high
priority."
-- 
select 'cbbrowne' || '@' || 'cbbrowne.com';
http://www3.sympatico.ca/cbbrowne/linuxxian.html
Real Programmers are surprised when the odometers in their cars don't
turn from 99999 to A0000.  (Competent ones, when it's not 9999A.)

