[Slony1-commit] By cbbrowne: Adding in a document to discuss "care and feeding" of

Wed Feb 2 23:09:03 PST 2005

Log Message:
-----------
Adding in a document to discuss "care and feeding" of embedding slonik
scripts

Modified Files:
--------------
    slony1-engine/doc/adminguide:
        slony.sgml (r1.10 -> r1.11)

Added Files:
-----------
    slony1-engine/doc/adminguide:
        usingslonik.sgml (r1.1)

-------------- next part --------------
Index: slony.sgml
===================================================================
RCS file: /usr/local/cvsroot/slony1/slony1-engine/doc/adminguide/slony.sgml,v
retrieving revision 1.10
retrieving revision 1.11
diff -Ldoc/adminguide/slony.sgml -Ldoc/adminguide/slony.sgml -u -w -r1.10 -r1.11

--- doc/adminguide/slony.sgml
+++ doc/adminguide/slony.sgml
@@ -66,6 +66,7 @@
 	<title><productname>Slony-I</productname> binaries</title>
 	&slon;
 	&slonik;
+        &usingslonik;
      </chapter>
      &slonikref;
   </part>
--- /dev/null
+++ doc/adminguide/usingslonik.sgml
@@ -0,0 +1,374 @@
+<!-- $Id: usingslonik.sgml,v 1.1 2005/02/02 23:08:50 cbbrowne Exp $ -->
+<sect1 id="usingslonik"> <title>Using Slonik</title>
+
+<para> It's a bit of a pain writing <application>Slonik</application>
+scripts by hand, particularly as you start working with
+<productname>Slony-I</productname> clusters that may be comprised of
+increasing numbers of nodes and sets.  Some problems that have been
+noticed include the following:
+
+<itemizedlist>
+
+<listitem><para> If you are using <productname>Slony-I</productname>
+as a <quote>master/slave</quote> replication system with one
+<quote>master</quote> node and one <quote>slave</quote> node, it may
+be sufficiently mnemonic to call the <quote>master</quote> node 1 and
+the <quote>slave</quote> node 2.
+
+<para> Unfortunately, as the number of nodes increases, the mapping of
+IDs to nodes becomes way less obvious, particularly if you have a
+cluster where the origin might shift from node to node over
+time.</para></listitem>
+
+<listitem><para> Similarly, if there is only one replication set, it's
+fine for that to be <quote>set 1,</quote> but if there are a
+multiplicity of sets, the numbering involved in using set numbers may
+grow decreasingly intuitive.</para></listitem>
+
+<listitem><para> People have observed that
+<application>Slonik</application> does not provide any notion of
+iteration.  It is common to want to create a set of similar <link
+linkend="stmtstorepath"> <command>STORE PATH</command</link> entries,
+since, in most cases, hosts will likely access a particular server via
+the same host name or IP address.</para></listitem>
+
+<listitem><para> Users seem interested in wrapping everything possible
+in <command>TRY</command> blocks, which is regrettably
+<emphasis>less</emphasis> useful than might be imagined...
+
+</itemizedlist>
+
+<para> These have assortedly pointed to requests for such enhancements
+as:
+
+<itemizedlist>
+<listitem><para> Named nodes, named sets</para>
+
+<para> Unfortunately, the use of naming nearly turns into a need for
+an <quote>ESP protocol</quote>, as <application>slonik</application>
+would need to <emphasis>start</emphasis> by determining the (possibly
+in flux) set of mappings between node names and node
+IDs. </para></listitem>
+
+<listitem><para> Looping and control constructs</para>
+
+<para> It seems to make little sense to create a fullscale parser as
+Yet Another Little Language grows into a rather larger one.  There are
+plenty of scripting languages out there that can be used to construct
+Slonik scripts; it is unattractive to force yet another one on people.
+</para></listitem>
+
+</itemizedlist>
+
+<para> There are several ways to work around these issues that have
+been seen <quote>in the wild</quote>:
+
+<itemizedlist>
+
+<listitem><para> Some sort of text rewriting system such as M4 may be
+used to map mnemonic object names onto the perhaps-less-intuitive
+numeric arrangement.
+
+<listitem><para> Embedding generation of slonik inside shell
+scripts</para>
+
+<para> The test bed found in the <filename>src/ducttape</filename>
+directory takes this approach.</para></listitem>
+
+<listitem><para> The <link linkend="altperl"> altperl admin scripts
+</link> use Perl code to generate Slonik scripts.</para>
+
+<para> You define the cluster as a set of Perl objects; each script
+walks through the Perl objects as needed to satisfy whatever it is
+supposed to do.  </para></listitem>
+
+</itemizedlist>
+
+<sect2><title> Using m4 to rewrite slonik scripts </title>
+
+<para> This needs to be prefaced with something of a warning, from the
+GNU M4 documentation:
+
+<warning><para> Some people found `m4' to be fairly addictive. They
+first use `m4' for simple problems, then take bigger and bigger
+challenges, learning how to write complex `m4' sets of macros along
+the way. Once really addicted, users pursue writing of sophisticated
+`m4' applications even to solve simple problems, devoting more time
+debugging their `m4' scripts than doing real work. Beware that `m4'
+may be dangerous for the health of compulsive
+programmers.</para></warning>
+
+<para> This being said, <application>m4</application> has three
+significant merits over other text rewriting systems (such as
+<application>cpp</application>, the C preprocessor):
+
+<itemizedlist>
+
+<listitem><para> Like slonik, m4 uses <quote>#</quote> to indicate
+comments, with the result that it may be quietly used to do rewrites
+on slonik scripts.
+
+<para> Using <application>cpp</application> would require changing
+over to use C or C++ style comments.
+
+<listitem><para> <application> m4 </application> is reasonably
+ubiquitous, being available in environments like
+<productname>Solaris</productname> and <productname>AIX</productname>
+even when they do not have compiler tools for C available.  Its
+presence is commonly mandated by the presence of
+<productname>Sendmail</productname>.
+</para></listitem>
+
+<listitem><para> A <emphasis>potential</emphasis> merit over
+<application>cpp</application> is that <application> m4</application>
+can do more than just rewrite symbols.  It has control structures, can
+store data in variables, and can loop.
+
+<para> Of course, down that road lies the addictions warned of above,
+as well as the complexity challenges of
+<filename>sendmail.cf</filename>.  As soon as you discover you need
+things like loops and variables, it is quite likely that you want to
+write a slonik generator in your favorite scripting language, whether
+that be Bourne Shell, Perl, or Tcl.  Fans of more esoteric languages like
+Icon, Snobol, or Scheme will have to fight their own battles to get
+those deemed to be reasonable choices for <quote>best
+practices.</quote>
+</itemizedlist>
+
+<sect3><title> An m4 example </title>
+
+<para> Without further ado, here is an example where you set up a
+central file, <filename> cluster.m4 </filename> containing some M4
+rewrite rules:
+<programlisting>
+define(`node_srvrds005', `1')
+define(`node_srvrds004', `4')
+define(`node_srvrds003', `3')
+define(`node_srvrds007', `78')
+define(`ds501', `501')
+</programlisting>
+
+<para> In view of those node name definitions, you may write a Slonik
+script to initialize the cluster as follows, <filename>setup_cluster.slonik</filename>:
+
+<programlisting>
+node node_srvrds005 admin conninfo 'dsn=foo';
+node node_srvrds004 admin conninfo 'dsn=bar';
+node node_srvrds003 admin conninfo 'dsn=foo';
+node node_srvrds007 admin conninfo 'dsn=foo';
+node ds501 admin conninfo 'dsn=foo';
+
+create cluster	info (id=node_srvrds005);
+store node (id=node_srvrds004, comment='Node on ds004', spool='f');
+store node (id=node_srvrds003, comment='Node on ds003', spool='f');
+store node (id=node_srvrds007, comment='Node on ds007', spool='t');
+store node (id=ds501, comment='Node on ds-501', spool='f');
+</programlisting>
+
+<para> You then run the rewrite rules on the script, thus:
+<para>
+<command> % m4 cluster.m4 setup_cluster.slonik </command>
+<para> And receive the following output:
+<programlisting>
+node 1 admin conninfo 'dsn=foo';
+node 4 admin conninfo 'dsn=bar';
+node 3 admin conninfo 'dsn=foo';
+node 78 admin conninfo 'dsn=foo';
+node 501 admin conninfo 'dsn=foo';
+
+create cluster  info (id=1);
+store node (id=4, comment='Node on ds004', spool='f');
+store node (id=3, comment='Node on ds003', spool='f');
+store node (id=78, comment='Node on ds007', spool='t');
+store node (id=501, comment='Node on ds-501', spool='f');
+</programlisting>
+
+<para> This makes no attempt to do anything <quote>smarter</quote>,
+such as to try to create the nodes via a loop that maps across a list
+of nodes.  As mentioned earlier, if you wish to do such things, it is
+highly preferable to do this by using a scripting language like the
+Bourne Shell or Perl.
+
+<sect2><title> Embedding Slonik in Shell Scripts </title>
+
+<para> As mentioned earlier, there are numerous
+<productname>Slony-I</productname> test scripts in
+<filename>src/ducttape</filename> that embed the generation of Slonik
+inside the shell script.
+
+<para> They mostly <emphasis> don't </emphasis> do this in a terribly
+sophisticated way.  Typically, they use the following sort of
+structure:
+
+<programlisting>
+DB1=slony_test1
+DB2=slony_test2
+slonik <<_EOF_
+	cluster name = T1;
+	node 1 admin conninfo = 'dbname=$DB1';
+	node 2 admin conninfo = 'dbname=$DB2';
+
+	try {
+		table add key (node id = 1, fully qualified name = 'public.history');
+	}
+	on error {
+		exit 1;
+	}
+
+	try {
+		create set (id = 1, origin = 1, comment = 'Set 1 - pgbench tables');
+		set add table (set id = 1, origin = 1,
+			id = 1, fully qualified name = 'public.accounts',
+			comment = 'Table accounts');
+		set add table (set id = 1, origin = 1,
+			id = 2, fully qualified name = 'public.branches',
+			comment = 'Table branches');
+		set add table (set id = 1, origin = 1,
+			id = 3, fully qualified name = 'public.tellers',
+			comment = 'Table tellers');
+		set add table (set id = 1, origin = 1,
+			id = 4, fully qualified name = 'public.history',
+			key = serial, comment = 'Table accounts');
+	}
+	on error {
+		exit 1;
+	}
+_EOF_
+</programlisting>
+
+<para> A more sophisticated approach might involve defining some
+common components, notably the <quote>preamble</quote> that consists
+of the <command><link linkend="clustername">CLUSTER
+NAME</link></command> <command><link linkend="admconninfo">ADMIN
+CONNINFO</link></command> commands that are common to every Slonik
+script, thus:
+<programlisting>
+CLUSTER=T1
+DB1=slony_test1
+DB2=slony_test2
+PREAMBLE="cluster name = $CLUSTER
+node 1 admin conninfo = 'dbname=$DB1';
+node 2 admin conninfo = 'dbname=$DB2';
+"
+</programlisting>
+
+<para> The <envar>PREAMBLE</envar> value could then be reused over and
+over again if the shell script invokes <command>slonik</command>
+multiple times. </para>
+
+<para> It also becomes simple to assign names to sets and nodes:
+
+<programlisting>
+origin=1
+subscriber=2
+mainset=1
+slonik <<_EOF_
+$PREAMBLE
+try {
+    table add key (node id = $origin, fully qualified name = 'public.history');
+} on error {
+    exit 1;
+}
+try {
+	create set (id = $mainset, origin = $origin, comment = 'Set $mainset - pgbench tables');
+	set add table (set id = $mainset, origin = $origin,
+		id = 1, fully qualified name = 'public.accounts',
+		comment = 'Table accounts');
+	set add table (set id = $mainset, origin = $origin,
+		id = 2, fully qualified name = 'public.branches',
+		comment = 'Table branches');
+	set add table (set id = $mainset, origin = $origin,
+		id = 3, fully qualified name = 'public.tellers',
+		comment = 'Table tellers');
+	set add table (set id = $mainset, origin = $origin,
+		id = 4, fully qualified name = 'public.history',
+		key = serial, comment = 'Table accounts');
+} on error {
+	exit 1;
+}
+_EOF_
+</programlisting>
+
+<para> The script might be further enhanced to loop through the list
+of tables as follows:
+
+<programlisting>
+# Basic configuration
+origin=1
+subscriber=2
+mainset=1
+# List of tables to replicate
+TABLES="accounts branches tellers history"
+ADDTABLES=""
+tnum=1
+for table in `echo $TABLES`; do
+  ADDTABLES="$ADDTABLES
+   set add table ($set id = $mainset, origin = $origin,
+   id = $tnum, fully qualified name = 'public.$table',
+   comment = 'Table $tname');
+"
+  let "tnum=tnum+1"
+done
+slonik <<_EOF_
+$PREAMBLE
+try {
+    table add key (node id = $origin, fully qualified name = 'public.history');
+} on error {
+    exit 1;
+}
+try {
+	create set (id = $mainset, origin = $origin, comment = 'Set $mainset - pgbench tables');
+$ADDTABLES
+} on error {
+	exit 1;
+}
+_EOF_
+</programlisting>
+
+<para> That is of somewhat dubious value if you only have 4 tables,
+but eliminating errors resulting from enumerating large lists of
+configuration by hand will make this pretty valuable for the larger
+examples you'll find in <quote>real life.</quote>
+
+<para> You can do even more sophisticated things than this if your
+scripting language supports things like:
+
+<itemizedlist>
+<listitem><para> <quote>Record</quote> data structures that allow assigning things in parallel
+<listitem><para> Functions, procedures, or subroutines, allowing you to implement
+useful functionality once, and then refer to it multiple times within a script
+<listitem><para> Some sort of <quote>module import</quote> system so that common functionality
+can be shared across many scripts
+</itemizedlist>
+
+<para> If you can depend on having <ulink
+url="http://www.gnu.org/software/bash/bash.html"> Bash</ulink>, <ulink
+url="http://www.zsh.org/"> zsh</ulink>, or <ulink
+url="http://www.kornshell.com/"> Korn shell</ulink> available, well,
+those are all shells with extensions supporting reasonably
+sophisticated data structures and module systems.  On Linux, Bash is
+fairly ubiquitous; on commercial <trademark/UNIX/, Korn shell is
+fairly ubiquitous; on BSD, <quote>sophisticated</quote> shells are an
+option rather than a default.</para>
+
+<para> At that point, it makes sense to start looking at other
+scripting languages, of which Perl is the most ubiquitous, being
+widely available on Linux, <trademark/UNIX/, and BSD.
+
+</sect1>
+<!-- Keep this comment at the end of the file
+Local variables:
+mode:sgml
+sgml-omittag:nil
+sgml-shorttag:t
+sgml-minimize-attributes:nil
+sgml-always-quote-attributes:t
+sgml-indent-step:1
+sgml-indent-data:t
+sgml-parent-document:nil
+sgml-default-dtd-file:"./reference.ced"
+sgml-exposed-tags:nil
+sgml-local-catalogs:("/usr/lib/sgml/catalog")
+sgml-local-ecat-files:nil
+End:
+-->