7. Defining Slony-I Replication Sets

Defining the nodes indicated the shape of the cluster of database servers; it is now time to determine what data is to be copied between them. The groups of data that are copied are defined as "replication sets."

A replication set consists of the following:

  • Keys on tables that are to be replicated but have no suitable primary key

  • Tables that are to be replicated

  • Sequences that are to be replicated
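
For illustration, here is a minimal slonik sketch that defines such a set. The cluster name, connection information, object names, and numeric IDs are all hypothetical and would need to be adjusted for a real cluster:

    # Hypothetical cluster and connection details
    cluster name = mycluster;
    node 1 admin conninfo = 'dbname=mydb host=db1 user=slony';

    # The set itself, originating on node 1
    create set (id = 1, origin = 1, comment = 'core application tables');

    # public.accounts has a true primary key, so no key option is needed
    set add table (set id = 1, origin = 1, id = 1,
                   full qualified name = 'public.accounts',
                   comment = 'accounts table');

    # Sequences are added to the set separately from the tables that use them
    set add sequence (set id = 1, origin = 1, id = 1,
                      full qualified name = 'public.accounts_id_seq',
                      comment = 'sequence behind accounts.id');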

7.1. Primary Keys

Slony-I needs to have a primary key or candidate thereof on each table that is replicated. PK values are used as the primary identifier for each tuple that is modified in the source system. Note that they can be composite keys composed of multiple NOT NULL columns; they don't need to consist of single fields. There are three ways that you can get Slony-I to use a primary key:

  • If the table has a formally identified primary key, SLONIK SET ADD TABLE can be used without any need to reference the primary key. Slony-I can automatically pick up that there is a primary key, and use it.

  • If the table hasn't got a primary key, but has some candidate primary key, that is, some index on a combination of fields that is both UNIQUE and NOT NULL, then you can specify that key, as shown in the following example.

    SET ADD TABLE (set id = 1, origin = 1, id = 42, 
                   full qualified name = 'public.this_table', 
                   key = 'this_by_that', 
         comment='this_table has this_by_that as a candidate primary key');

    However, once you have come this far, there is little reason not to just declare some suitable index to be a primary key, which requires that the columns involved are NOT NULL, and which will establish a unique index. Here is an example of this:

    DROP INDEX my_table_name_col1_col2_uniq_idx;
    ALTER TABLE my_table_name ADD PRIMARY KEY (col1, col2);

    If your application is not somehow referencing the index by name, this should not lose you anything, and it gives you the clear design benefit that a primary key has been declared for the table.

    Notice that while you need to specify the namespace for the table, you must not specify the namespace for the key, as Slony-I infers the namespace from the table.

  • If the table hasn't even got a candidate primary key, you might ask Slony-I to provide one using SLONIK TABLE ADD KEY.

    Warning

    SLONIK TABLE ADD KEY was always considered a "kludge", at best, and as of version 2.0, it is considered such a misfeature that it is being removed.

It is not terribly important whether you pick a "true" primary key or a mere "candidate primary key"; it is, however, strongly recommended that you have one of those instead of having Slony-I populate the PK column for you. If you don't have a suitable primary key, that means that the table hasn't got any mechanism, from your application's standpoint, for keeping values unique. Slony-I may, therefore, introduce a new failure mode for your application, and it also implies that your application already allowed confusing, duplicate data to be entered into the database.

7.2. Grouping tables into sets

It will be vital to group tables together into a single set if those tables are related via foreign key constraints. If tables that are thus related are not replicated together, you'll find yourself in trouble if you switch the "master provider" from one node to another, and discover that the new "master" can't be updated properly because it is missing the contents of dependent tables.

There are also several reasons why you might not want to have all of the tables in one replication set:

  • The initial COPY_SET event for a large set leads to a long-running transaction on the provider node. The FAQ outlines a number of problems resulting from long-running transactions that can injure system performance.

    If you can split such a large set into several smaller pieces, that will shorten the length of each of the transactions, lessening the degree of the "injury" to performance. The pieces can later be recombined with SLONIK MERGE SET, as noted after the sketch at the end of this list.

    Another issue comes up particularly frequently when replicating across a WAN; sometimes the network connection is a little bit unstable, such that there is a risk that a connection held open for several hours will lead to CONNECTION TIMEOUT. If that happens when 95% done copying a 50-table replication set consisting of 250GB of data, that could ruin your whole day. If the tables were, instead, associated with separate replication sets, that failure at the 95% point might only interrupt, temporarily, the copying of one of those tables.

    These "negative effects" tend to emerge when the database being subscribed to is many gigabytes in size and where it takes many hours or even days for the subscriber to complete the initial data copy. For relatively small databases, this shouldn't be a material factor.

  • Any time you invoke SLONIK EXECUTE SCRIPT, this requests a lock on every single table in the replication set, first on the origin node, and then, as the event propagates to other nodes, on each subscriber node.

    There have been reports "in the field" of this leading to deadlocks such that the SLONIK EXECUTE SCRIPT request had to be submitted many times in order for it to actually complete successfully.

    The more tables you have in a set, the more tables need to be locked, and the greater the chances of deadlocks.

    By the same token, if a particular DDL script only needs to affect a couple of tables, you might use SLONIK SET MOVE TABLE to move them temporarily to a new replication set. By diminishing the number of locks needed, this should make it easier to get the DDL change into place; a sketch of this appears after this list.

    There is a further discussion of locking which outlines where Slony-I requires locks likely to intrude on your applications.
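
As a sketch of the temporary-set technique mentioned above, the following slonik fragment moves two tables out of a large set 1 into a small set 99, applies a DDL script against only that small set, and then folds the tables back in. All object names, numeric IDs, and the script filename are hypothetical; it assumes that set 1 originates on node 1 and is subscribed only by node 2, and note that SET MOVE TABLE and MERGE SET both require the two sets involved to share the same origin and the same set of subscribers.

    # Hypothetical cluster and connection details
    cluster name = mycluster;
    node 1 admin conninfo = 'dbname=mydb host=db1 user=slony';
    node 2 admin conninfo = 'dbname=mydb host=db2 user=slony';

    # A small, temporary set with the same origin as set 1
    create set (id = 99, origin = 1, comment = 'temporary set for DDL');

    # Subscribe the (still empty) set to the same receiver as set 1,
    # since moving tables between sets requires identical subscriptions
    subscribe set (id = 99, provider = 1, receiver = 2, forward = no);

    # Move only the tables the DDL script touches (hypothetical table IDs 4 and 7)
    set move table (origin = 1, id = 4, new set = 99);
    set move table (origin = 1, id = 7, new set = 99);

    # Apply the change; only the two tables in set 99 need to be locked
    execute script (set id = 99, filename = '/tmp/add_column.sql', event node = 1);

    # Fold the tables back into the main set (the merge disposes of set 99)
    merge set (id = 1, add id = 99, origin = 1);

The same MERGE SET step is what makes it practical to break up a large initial subscription: subscribe several smaller sets one at a time, and once each has completed its initial COPY_SET, merge them back into a single set for easier administration.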

7.3. The Pathology of Sequences

Each time a SYNC is processed, values are recorded for all of the sequences in the set. If there are a lot of sequences, this can cause sl_seqlog to grow rather large.
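
If sl_seqlog does grow uncomfortably, a query along the following lines can show how much of that growth each replicated sequence is responsible for. It assumes, hypothetically, that the cluster is named mycluster, so that the replication schema is _mycluster:

    -- Count the values logged per replicated sequence
    -- (hypothetical cluster name "mycluster", hence schema "_mycluster")
    SELECT seql_seqid, count(*) AS logged_values
      FROM _mycluster.sl_seqlog
     GROUP BY seql_seqid
     ORDER BY logged_values DESC;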

This points to an important difference between tables and sequences: if you add additional tables that do not see much/any activity, this does not add any material load to the work being done by replication. For a replicated sequence, values must regularly be propagated to subscribers. Consider the effects:

  • A replicated table that is never updated does not introduce much work to the system.

    If it is not updated, the trigger on the table on the origin never fires, and no entries are added to sl_log_1/sl_log_2. The table never appears in any of the further replication queries (e.g. in the FETCH 100 FROM LOG queries used to find replicable data) as they only look for tables for which there are entries in sl_log_1/sl_log_2.

  • In contrast, a fixed amount of work is introduced to each SYNC by each sequence that is replicated.

    Replicate 300 sequences, and 300 rows need to be added to sl_seqlog on a regular basis, at least until the 2.0 branch, where a value is only recorded when the sequence in question is seen to change.

    If the value of a particular sequence hasn't changed since it was last checked, the same value arguably need not be stored over and over; some thought needs to go into how to do that safely, however.

  • Bug #1226 indicates an error condition that can come up if you have a replication set that consists solely of sequences.

    This is documented further in the FAQ; the long and short of it is that having a replication set consisting only of sequences is not a particularly good idea.