24. Using Slony-I for PostgreSQL Upgrades

A number of people have found Slony-I useful for performing upgrades between major PostgreSQL releases (i.e. those which require running initdb to create a new database instance) without substantial downtime.

The "simple" way that one might imagine doing such an upgrade would involve running pg_dump on the database running the older version, and then submit the results into a psql session connected to a database instance running the newer version. Unfortunately, the time consumed from start to end, for this approach, may be prohibitive. For a database containing 40GB of data with numerous indices, the process required involves:

  • Stop all applications that might modify the data

  • Start the pg_dump, and load that into the new database

  • Wait 40 hours for the dump/load to complete

  • Point "write" applications to the new database

Note that this approach amounts to a 40 hour outage.
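
In shell terms, that dump-and-reload step is essentially a single pipeline along the following lines (a sketch; the ports and database name are purely illustrative):

        # Dump from the old instance and load it straight into the new one
        pg_dump -p 5432 mydb | psql -p 5433 mydb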

Slony-I offers an opportunity to replace that long outage with one lasting as little as a few seconds. The approach is to create a Slony-I replica running the new version. It may well take considerably longer than 40 hours to create that replica; however, establishing the replica requires no outage, and once it is there, it can be kept very nearly up to date.

When it comes time to switch over to the new database, the portion of the procedure that requires an application "outage" is a lot less time consuming:

  • Stop all applications that might modify the data

  • Lock the set against client application updates using SLONIK LOCK SET

  • Submit the Slonik command SLONIK MOVE SET to shift the origin from the old database to the new one

  • Point the applications to the new database

This procedure should only need to take a very short time, likely based more on how much time is required to reconfigure your applications than anything else. If you can automate all of these steps, the outage may conceivably be a second or less. If manual handling is necessary, then it is likely to take somewhere between a few seconds and a few minutes.
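
In slonik terms, the lock-and-move portion of that switchover boils down to something like the following (a sketch; the cluster name, node numbers, and conninfo strings are assumptions that must match your own configuration):

        cluster name = replication;
         node 1 admin conninfo = 'host=oldhost dbname=mydb user=postgres port=5432';
         node 2 admin conninfo = 'host=newhost dbname=mydb user=postgres port=5433';

        # Block client updates against the set, then shift the origin to the new node
        lock set (id = 1, origin = 1);
        move set (id = 1, old origin = 1, new origin = 2);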

Note that after the origin has been shifted, updates are replicated back into the old database. If you discover that due to some unforeseen, untested condition, your application is somehow unhappy connecting to the new database, you may readily use SLONIK MOVE SET again to reverse the process to shift the origin back to the old database.

If you consider it particularly vital to be able to shift back to the old database in its state at the time of the changeover, so as to be able to roll the whole change back outright, and would also like the ability to shift back to the old version (with any updates made since the changeover), the following steps accomplish that:

  • Prepare two Slony-I replicas of the database:

    • One running the new version of PostgreSQL

    • One running the old version of PostgreSQL

    Thus, you have three nodes, one running the new version of PostgreSQL, and the other two the old version.

    Note that this imposes a need to have Slony-I built against both databases (e.g. - at the very least, the binaries for the stored procedures need to have been compiled against both versions of PostgreSQL).

  • Once they are roughly "in sync", stop all applications that might modify the data

  • Allow them to get in sync, then stop the slon daemon that has been feeding the subscriber running the old version of PostgreSQL

    You may want to use SLONIK UNINSTALL NODE to decommission this node, making it into a standalone database, or merely kill the slon, depending on how permanent you want this all to be.

  • Then use SLONIK MOVE SET to shift the origin, as before.
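
In slonik terms, the decommissioning and origin shift might look roughly like the following (a sketch, assuming node 1 is the old origin, node 2 runs the new PostgreSQL version, node 3 is the old-version subscriber being retired, and set 1 is the replicated set; the usual cluster name/conninfo preamble is omitted):

        # Optionally turn the retired old-version subscriber into a standalone database
        uninstall node (id = 3);

        # Shift the origin of the set from the old database to the new one
        lock set (id = 1, origin = 1);
        move set (id = 1, old origin = 1, new origin = 2);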

Supposing a "small" disaster strikes, you might recover back to the node running the old database that has been seeing updates come through; if you run into larger problems, you would have to abandon the two nodes and go to the one which had been shut off.

This isn't to say that it is routine to have the sorts of problems that would mandate such a "paranoid" procedure; but for people worried about process risk assessments, it is reassuring to have choices like this.

Note: Slony-I does not support versions of PostgreSQL older than 7.3.3 because it needs namespace support that did not solidify until that time. Rod Taylor "hacked up" a version of Slony-I to work on 7.2 by allowing the Slony-I objects to live in the global schema. He found it pretty fiddly, and that some queries weren't very efficient (the PostgreSQL query optimizer has improved considerably since 7.2), but that this was more workable for him than other replication systems such as eRServer. PostgreSQL 7.2 will never be supported by any official Slony-I release.

24.1. Example: Upgrading a single database with no existing replication

This example uses concrete names, IP addresses, ports, etc. to describe in detail what is going on.

24.1.1. The Environment

		Database machine:
			name = rome 
			ip = 192.168.1.23
			OS: Ubuntu 6.06 LTS
			postgres user = postgres, group postgres
			
		Current PostgreSQL 
			Version = 8.2.3 
			Port 5432
			Installed at: /data/pgsql-8.2.3
			Data directory: /data/pgsql-8.2.3/data
			Database to be moved: mydb
			
		New PostgreSQL installation
			Version = 8.3.3
			Port 5433
			Installed at: /data/pgsql-8.3.3
			Data directory: /data/pgsql-8.3.3/data
			
		Slony Version to be used = 1.2.14
    

24.1.2. Installing Slony-I

How to install Slony-I is covered quite well in other parts of the documentation (Section 4); we will just provide a quick guide here.

       wget http://main.slony.info/downloads/1.2/source/slony1-1.2.14.tar.bz2
      

Unpack and build as root with

		tar xjf slony1-1.2.14.tar.bz2
		cd slony1-1.2.14
		./configure --prefix=/data/pgsql-8.2.3 --with-perltools=/data/pgsql-8.2.3/slony --with-pgconfigdir=/data/pgsql-8.2.3/bin
		make clean
		make
		make install
		chown -R postgres:postgres /data/pgsql-8.2.3
		mkdir /var/log/slony
		chown -R postgres:postgres /var/log/slony
      

Then repeat this for the 8.3.3 build. A very important step is the make clean; it is not so important the first time, but when building the second time it is essential to clean out the old binaries, otherwise the binaries will not match the PostgreSQL 8.3.3 build, and Slony-I will not work there.
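
The second pass is the same apart from the paths, roughly (a sketch mirroring the commands above):

        cd slony1-1.2.14
        ./configure --prefix=/data/pgsql-8.3.3 --with-perltools=/data/pgsql-8.3.3/slony --with-pgconfigdir=/data/pgsql-8.3.3/bin
        make clean
        make
        make install
        chown -R postgres:postgres /data/pgsql-8.3.3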

24.1.3. Creating the slon_tools.conf

slon_tools.conf is the configuration file. It contains all the configuration information, such as:

  1. All the nodes and their details (IPs, ports, db, user, password)

  2. All the tables to be replicated

  3. All the sequences to be replicated

  4. How the tables and sequences are arranged in sets

Copy /data/pgsql-8.2.3/etc/slon_tools.conf-sample to slon_tools.conf and open it. The comments in this file are fairly self explanatory. Since this is a one-time replication, you will generally not need to split the tables into multiple sets. On a production machine with 500 tables and 100 sequences, putting them all in a single set has worked fine.

  1. In our case we only need two nodes, so delete the add_node entries for nodes 3 and 4.

  2. The pkeyedtables entry needs to be updated with your tables that have a primary key. If your tables are spread across multiple schemas, you need to qualify the table name with the schema (schema.tablename).

  3. The keyedtables entry needs to be updated with any tables that match the comment (with good schema design, there should not be any).

  4. serialtables (if you have any; as it says, it is wise to avoid this).

  5. The sequences entry needs to be updated with your sequences.

  6. Remove the whole set2 entry (as we are only using set1).

This is what it looks like with all comments stripped out:

$CLUSTER_NAME = 'replication';
$LOGDIR = '/var/log/slony';
$MASTERNODE = 1;

    add_node(node     => 1,
	     host     => 'rome',
	     dbname   => 'mydb',
	     port     => 5432,
	     user     => 'postgres',
         password => '');

    add_node(node     => 2,
	     host     => 'rome',
	     dbname   => 'mydb',
	     port     => 5433,
	     user     => 'postgres',
         password => '');

$SLONY_SETS = {
    "set1" => {
	"set_id" => 1,
	"table_id"    => 1,
	"sequence_id" => 1,
        "pkeyedtables" => [
			   'mytable1',
			   'mytable2',
			   'otherschema.mytable3',
			   'otherschema.mytable4',
			   'otherschema.mytable5',
			   'mytable6',
			   'mytable7',
			   'mytable8',
			   ],

		"sequences" => [
			   'mytable1_sequence1',
   			   'mytable1_sequence2',
			   'otherschema.mytable3_sequence1',
   			   'mytable6_sequence1',
   			   'mytable7_sequence1',
   			   'mytable7_sequence2',
			],
    },

};

1;
      

As can be seen, this database is pretty small, with only 8 tables and 6 sequences. Now copy your slon_tools.conf into /data/pgsql-8.2.3/etc/ and /data/pgsql-8.3.3/etc/.

24.1.4. Preparing the new PostgreSQL instance

You now have a fresh second instance of PostgreSQL running on port 5433 on the same machine. Now it is time to prepare it to receive Slony-I replication data.

  1. Slony-I does not replicate roles, so first create all of the users/groups on the new instance so that it is identical to the old one in terms of roles (one common way to do this is shown in the sketch after this list)

  2. Create your database in the same encoding as the original; in this case UTF8: /data/pgsql-8.3.3/bin/createdb -E UNICODE -p5433 mydb

  3. Slony-I replicates data, not schemas, so take a dump of your schema: /data/pgsql-8.2.3/bin/pg_dump -s mydb > /tmp/mydb.schema and then import it on the new instance: cat /tmp/mydb.schema | /data/pgsql-8.3.3/bin/psql -p5433 mydb
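
Put together, the preparation might look like this (a sketch; pg_dumpall -g is one common way to carry roles across, and the paths and encoding are assumed to match the environment above):

        # Copy roles/groups (global objects only) from the old instance to the new one
        /data/pgsql-8.2.3/bin/pg_dumpall -g -p 5432 | /data/pgsql-8.3.3/bin/psql -p5433 postgres

        # Create the database with the same encoding as the original
        /data/pgsql-8.3.3/bin/createdb -E UNICODE -p5433 mydb

        # Copy the schema definitions (no data) to the new instance
        /data/pgsql-8.2.3/bin/pg_dump -s mydb > /tmp/mydb.schema
        cat /tmp/mydb.schema | /data/pgsql-8.3.3/bin/psql -p5433 mydb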

The new database is now ready to start receiving replication data.

24.1.5. Initiating Slony-I Replication

This is the point where we start changing your current production database, by adding a new schema to it that contains all of the Slony-I replication information.

The first thing to do is to initialize the Slony-I schema. Do the following as the postgres user (in this example).

Note: The scripts whose names start with slonik_ do not do anything themselves; they only generate command output that can be interpreted by the slonik binary. So running any of the slonik_ scripts will not, by itself, change your database. Also, by default the slonik_ scripts look for slon_tools.conf in the etc directory of the PostgreSQL installation; in this case /data/pgsql-8.x.x/etc, depending on which installation you are working with.

/data/pgsql-8.2.3/slony/slonik_init_cluster > /tmp/init.txt

Open /tmp/init.txt; it should look something like this:

# INIT CLUSTER
cluster name = replication;
 node 1 admin conninfo='host=rome dbname=mydb user=postgres port=5432';
 node 2 admin conninfo='host=rome dbname=mydb user=postgres port=5433';
  init cluster (id = 1, comment = 'Node 1 - mydb@rome');

# STORE NODE
  store node (id = 2, event node = 1, comment = 'Node 2 - mydb@rome');
  echo 'Set up replication nodes';

# STORE PATH
  echo 'Next: configure paths for each node/origin';
  store path (server = 1, client = 2, conninfo = 'host=rome dbname=mydb user=postgres port=5432');
  store path (server = 2, client = 1, conninfo = 'host=rome dbname=mydb user=postgres port=5433');
  echo 'Replication nodes prepared';
  echo 'Please start a slon replication daemon for each node';
     
    

The first section defines the node connection information and initializes the cluster; it then adds the second node to the cluster, and finally stores the communication paths for both nodes in the Slony-I schema.

Now it is time to execute the command: cat /tmp/init.txt | /data/pgsql-8.2.3/bin/slonik

This will run pretty quickly and give you some output to indicate success.

If things do fail, the most likely reasons are database permissions, pg_hba.conf settings, or typos in slon_tools.conf. Look over your problem and solve it. If the Slony-I schemas were created but it still failed, you can use the slonik_uninstall_nodes script to clean things up. In the worst case you can connect to each database and issue drop schema _replication cascade; to clean up.
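
If you do need to start over, the cleanup might look roughly like this (a sketch; as with the other slonik_ scripts, the generated commands still have to be piped to slonik, and the _replication schema name follows from the cluster name in slon_tools.conf):

        # Generate and execute the uninstall commands for all nodes
        /data/pgsql-8.2.3/slony/slonik_uninstall_nodes | /data/pgsql-8.2.3/bin/slonik

        # Worst case: remove the Slony-I schema from each database by hand
        /data/pgsql-8.2.3/bin/psql -p5432 mydb -c 'drop schema _replication cascade;'
        /data/pgsql-8.3.3/bin/psql -p5433 mydb -c 'drop schema _replication cascade;'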

24.1.6. The slon daemon

As the output from the last command told us, we should now start a slon replication daemon for each node. The slon daemon is what makes the replication work: all transfers and all work are done by the slon daemons, and one is needed for each node. So in our case we need one for the 8.2.3 installation and one for the 8.3.3 installation.

To start one for 8.2.3 you would run: /data/pgsql-8.2.3/slony/slon_start 1 --nowatchdog This starts the daemon for node 1. The --nowatchdog option is used because, for a small replication setup like this, we do not need a watchdog keeping an eye on whether the slon process stays up.

If it says it started successfully, have a look at the log file under /var/log/slony/slony1/node1/; it will show that the process started OK.

We need to start one for 8.3.3 as well: /data/pgsql-8.3.3/slony/slon_start 2 --nowatchdog

If it says it started successfully, have a look at the log file under /var/log/slony/slony1/node2/; it will show that the process started OK.

24.1.7. Adding the replication set

We now need to tell Slony-I which tables and sequences it is to replicate, by creating the replication set.

Issue the following: /data/pgsql-8.2.3/slony/slonik_create_set set1 > /tmp/createset.txt

/tmp/createset.txt may be quite lengthy, depending on how many tables you have; in any case, take a quick look at it and it should make sense, as it defines all the tables and sequences to be replicated.
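
For reference, each table appears in that file as a set add table command, roughly of this shape (a sketch using table names from the slon_tools.conf above; the actual ids and comments are generated for you):

        create set (id = 1, origin = 1, comment = 'Set 1 for replication');
        set add table (set id = 1, origin = 1, id = 1,
                       fully qualified name = 'public.mytable1', comment = 'Table public.mytable1');
        set add sequence (set id = 1, origin = 1, id = 1,
                          fully qualified name = 'public.mytable1_sequence1', comment = 'Sequence public.mytable1_sequence1');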

If you are happy with the result, send the file to slonik for execution: cat /tmp/createset.txt | /data/pgsql-8.2.3/bin/slonik You will see quite a lot of output roll by, one entry for each table.

You have now defined what is to be replicated.

24.1.8. Subscribing all the data

The final step is to get all the data over to the new database. This is simply done using the subscribe script: /data/pgsql-8.2.3/slony/slonik_subscribe_set 1 2 > /tmp/subscribe.txt The first argument is the ID of the set, the second is the node that is to subscribe.

/tmp/subscribe.txt will look something like this:

 cluster name = replication;
 node 1 admin conninfo='host=rome dbname=mydb user=postgres port=5432';
 node 2 admin conninfo='host=rome dbname=mydb user=postgres port=5433';
  try {
    subscribe set (id = 1, provider = 1, receiver = 2, forward = yes);
  }
  on error {
    exit 1;
  }
  echo 'Subscribed nodes to set 1';
     

Send it to slonik: cat /tmp/subscribe.txt | /data/pgsql-8.2.3/bin/slonik

The replication will now start. It copies everything in the tables and sequences that are in the set. Understandably, this can take quite some time, depending on the size of the database and the power of the machine.

One way to keep track of the progress is to do the following: tail -f /var/log/slony/slony1/node2/log | grep -i copy The Slony-I logging is pretty verbose, and following it this way will let you know how the copying is going. At some point it will report that the copy completed successfully, and how many seconds it took; when you see this, it is done!

Once this is done, it will start trying to catch up with all the data that has come in since the replication was started. You can easily view the progress of this in the database: go to the master database, where the replication schema contains a view called sl_status. It is pretty self explanatory. The field of most interest is st_lag_num_events, which shows how many Slony-I events behind the node is; 0 is best, but it all depends on how active your database is. The field next to it, st_lag_time, is an estimate of how far behind the node is lagging in time. Take this with a grain of salt; the event count is a more accurate measure of lag.
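
From the shell, a quick check might look like this (a sketch; the _replication schema name follows from the cluster name used in slon_tools.conf):

        /data/pgsql-8.2.3/bin/psql -p5432 mydb -c 'select st_lag_num_events, st_lag_time from _replication.sl_status;'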

You now have a fully replicated database.

24.1.9. Switching over

Our database is fully replicated and it is keeping up. There are a few different options for doing the actual switchover; which one to use depends on how much time you have to work with and the trade-off between downtime and data loss. The most brute-force, fast way of doing it would be:

  1. First modify the postgresql.conf file for the 8.3.3 installation to use port 5432, so that it is ready for the restart

  2. From this point you will have downtime: shut down the 8.2.3 PostgreSQL installation

  3. Restart the 8.3.3 PostgreSQL installation. It should come up OK.

  4. Drop all of the Slony-I objects from the 8.3.3 installation: log in with psql to the 8.3.3 instance and issue drop schema _replication cascade; (a shell sketch of these last steps follows this list)
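
Strung together, the switchover might look roughly like this (a sketch; the pg_ctl invocations and shutdown mode are assumptions, and the port change in postgresql.conf is presumed to be already in place):

        # Stop the old 8.2.3 instance (downtime starts here)
        /data/pgsql-8.2.3/bin/pg_ctl -D /data/pgsql-8.2.3/data stop -m fast

        # Restart the 8.3.3 instance, which now listens on port 5432
        /data/pgsql-8.3.3/bin/pg_ctl -D /data/pgsql-8.3.3/data restart

        # Remove the Slony-I schema from the new production database
        /data/pgsql-8.3.3/bin/psql -p5432 mydb -c 'drop schema _replication cascade;'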

You have now upgraded to 8.3.3 with, hopefully, minimal downtime. This procedure represents roughly the simplest way to do this.