4.5. Log Shipping - Slony-I with Files

Slony-I has the ability to serialize the updates to go out into log files that can be kept in a spool directory.

The spool files can then be transferred via whatever means is desired to a "slave system," whether that be via FTP, rsync, or perhaps even by pushing them onto a 1GB "USB key" to be sent to the destination by clipping it to the ankle of some sort of "avian transport" system.
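
For instance, a minimal transfer sketch using rsync (the host name and paths here are placeholders, not anything Slony-I mandates):

    # Push newly generated spool files to the standby host.
    rsync -av /var/spool/slony/archive/ standby-host:/var/spool/slony/archive/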

There are plenty of neat things you can do with a data stream in this form, including:

  • Replicating to nodes that aren't securable

  • Replicating to destinations where it is not possible to set up bidirectional communications

  • If some disaster strikes, you can look at the logs of queries in detail

    This makes log shipping potentially useful even though you might not intend to actually create a log-shipped node.

  • The .SQL log files could be used to generate write activity for doing tests

  • You may apply triggers on the "disconnected node" to do additional processing on the data. For instance, you might take a fairly "stateful" database and turn it into a "temporal" one by use of triggers that implement the techniques described in [Developing Time-Oriented Database Applications in SQL] by Richard T. Snodgrass.

4.5.1. Setting up Log Shipping

Setting up log shipping requires that you already have a replication cluster set up with at least two nodes, and that the tables and sequences that you want to ship data for are part of replication sets that the subscriber is subscribed to.

  • Create your log shipping target database, including your application's tables. It can be created from your application's DDL sources or with a pg_dump -s from a slave node. If you are using a pg_dump from a slave node, you need to exclude the Slony-I log triggers from this dump, and possibly some of your application triggers. The script tools/find-triggers-to-deactivate.sh might help with this. (These steps are sketched in the example after this list.)

  • Stop the slon daemon for the slave node that you want to generate the log shipping files from. Log shipping files are generated by the slon daemon for a slave node and include the changes that are applied to that slave for all sets that the slave is subscribed to.

  • Run the script tools/slony1_dump.sh. This shell script dumps the present state (data) of the subscriber node. The script requires psql to be in your PATH and the PGHOST, PGDATABASE and PGPORT variables to be set (if required) so that psql can connect to the database you are dumping from. The name of your Slony-I cluster should be passed as an argument to slony1_dump.sh.

  • The dump generated by slony1_dump.sh on standard output should then be restored (via psql) on the log shipping target node. This also creates a set of Slony-I log shipping tracking tables that the log shipping daemon will use to track which events have been processed on the log shipping target node.

  • Restart the slon daemon on the slave node, passing the -a option followed by the directory that the log shipping files should be placed in. Alternatively, you can specify an archive_dir option in your slon.conf file.
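
The sketch below ties these steps together. It is only an outline with hypothetical host, database, cluster and directory names; adjust the connection settings, and the trigger cleanup in step 1, for your own environment.

    # 1. Create the target database schema from a slave node (schema only).
    #    Remember to strip the Slony-I log triggers from the dump; the
    #    tools/find-triggers-to-deactivate.sh script can help identify them.
    pg_dump -s -h slave-host appdb > schema.sql
    psql -h target-host -d logship_db -f schema.sql

    # 2. With the slave's slon stopped, dump the subscriber's current state.
    #    "mycluster" is the Slony-I cluster name.
    export PGHOST=slave-host PGPORT=5432 PGDATABASE=appdb
    ./tools/slony1_dump.sh mycluster > subscriber_state.sql

    # 3. Restore that dump on the log shipping target node.
    psql -h target-host -d logship_db -f subscriber_state.sql

    # 4. Restart the slave's slon with an archive directory ...
    slon -a /var/spool/slony/archive mycluster "dbname=appdb host=slave-host"

    # ... or, equivalently, set archive_dir in the slon.conf passed with -f:
    #     archive_dir='/var/spool/slony/archive'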

Slon will generate a file in the archive directory for each SYNC event that it processes. These files contain COPY statements that insert data into the sl_log_archive table. A trigger installed by slony1_dump.sh will intercept these inserts and make the appropriate changes to your replicated tables.

Slon will generate log shipping files that contain data for all tables in any replication set that the node is subscribed to. There is no support in slon for including only some of the replication sets in the log shipping files. If slon is unable to create the log shipping files, for example because you have run out of disk space, then slon will stop processing new SYNC events and replication for that subscriber will fall behind.

4.5.2. Applying Log Files

The .SQL files that slon places in the archive directory contain SQL commands that should be executed on the log shipping target. These files need to be applied in the proper order. The file name of the next SQL file to apply is based on information contained in the sl_archive_tracking table. slony_logshipping is a daemon that will monitor an archive directory and apply the updates in the proper order.

Each .SQL file contains a SQL COPY command that copies the data into the sl_log_archive table, where a trigger will instead perform the proper action on the target database. The slony1_dump.sh script will create the sl_log_archive table and set up the trigger.
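
If you apply the files by hand rather than with slony_logshipping, a minimal sketch looks like the following. It assumes the archive file names sort in apply order, which you should verify for your installation; the daemon is the safer choice because it consults sl_archive_tracking to pick the next file.

    # Apply each archive file in (assumed) name order, each in a single
    # transaction, stopping at the first failure.
    for f in /var/spool/slony/archive/*.sql; do
        psql -d logship_db -1 -f "$f" || break
    done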

4.5.3. Converting SQL commands from COPY to INSERT/UPDATE/DELETE

Prior to Slony-I 2.2, the SQL files generated for log shipping contained INSERT/UPDATE/DELETE statements. As of Slony-I 2.2, the log shipping files contain COPY statements, which should result in better performance. If you need the old style of SQL files with INSERT/UPDATE/DELETE, the script tools/logshipping_toinsert.pl can be used to convert the COPY style log shipping files into INSERT/UPDATE/DELETE statements. INSERT/UPDATE/DELETE statements should be easier to apply against databases other than PostgreSQL, or in environments where you can't create the sl_log_archive table.
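
A hedged sketch of that conversion step follows; treating logshipping_toinsert.pl as a filter over a single COPY-style archive file is an assumption here (check the script's own usage notes), and the file names are placeholders.

    # Assumption: the script reads one COPY-style archive file and emits the
    # equivalent INSERT/UPDATE/DELETE statements on standard output.
    perl tools/logshipping_toinsert.pl archive_file.sql > archive_file.insert.sql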

4.5.4. Cluster Reconfiguration Events

Most cluster reconfiguration events, such as adding or dropping nodes or paths, do not affect the nodes receiving the log files and are ignored by log shipping. However, the following cluster configuration events are supported.

  • UNSUBSCRIBE_SET is supported such that when the subscriber node generating the .SQL files is unsubscribed from a particular set, any new SQL files generated will not contain updates to the tables in that set. SET_DROP_TABLE and SET_DROP_SEQUENCE are handled in a similar fashion.

  • MOVE_SET - If a set's origin is moved from another node to the node generating the SQL files, then changes to the tables in that set will no longer be included in the SQL files, because slon only includes changes in the .SQL files for sets that the node is a subscriber to. In other words, a MOVE_SET that makes the subscriber an origin means that changes to those tables will no longer appear in the generated SQL files.

  • SUBSCRIBE_SET events are handled properly. This means that when the node generating the SQL files subscribes to a new set, the data in the tables in that set will be included as part of the SQL files with a COPY command.