[Slony1-general] Per-row origin with SlonyI?

Tue Nov 2 01:36:54 PST 2004

Greetings.

SlonyI is great and I like where it's headed with SlonyII but I need
some multi-master support now.

Basically I would like the ability for a large replicated table to
have distributed origins where each row has a specific origin which is
allowed to modify the row and propagates changes to all slaves.  Each
row only needs to have a single origin/master however (i.e. I don't need
full multi-master support).

I'm hoping some of the developers can comment on:

(a) How difficult would it be for me to add this to SlonyI?  Is it even
    feasible?

(b) If I were to produce a patch that implements this, would it be
    considered for inclusion into SlonyI at all, or do you intend to
    restrict these types of features to SlonyII?

Here's more detail on what I need to accomplish:

1) I have a small set of large tables representing user data.

2) There are up to a dozen or so databases holding the data and they may
   be far apart (in terms of network latency and bandwidth).

3) Applications (near the databases but far apart from each other) will
   typically have a stable working set of user data they reference and
   update.

4) Applications need to be able to see any/all users' data at any time.

5) User data is independent of all other user data.

6) The method to determine a good location for a given row is static
   and simple (e.g. something as simple as
   pkey_id % number_of_nodes == slony_node).
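
To make point 6 concrete, here is a minimal sketch of that placement
rule.  The function names are made up for illustration, and it assumes
nodes are numbered 0 through number_of_nodes - 1 so the modulo result
can be compared to the node id directly; real node ids may be numbered
differently and would then need a mapping step.

```python
def origin_node(pkey_id: int, number_of_nodes: int) -> int:
    """Return the node id (0 .. number_of_nodes - 1) that owns the row.

    Assumption: node ids are zero-based and contiguous.
    """
    if number_of_nodes <= 0:
        raise ValueError("number_of_nodes must be positive")
    return pkey_id % number_of_nodes


def is_local_origin(pkey_id: int, number_of_nodes: int, slony_node: int) -> bool:
    """True when the local node (slony_node) is the origin for the row."""
    return origin_node(pkey_id, number_of_nodes) == slony_node
```

With a dozen nodes, the row with pkey_id 14 would belong to node 2, and
only node 2 would accept local writes for it.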

To solve the read part of 4 I plan to replicate the data to all
databases using SlonyI.  However, writing all data to a single master
would be a severe bottleneck.

I've thought of solutions where the actual tables are split into multiple
tables and the parts are in different slony replication sets with
different origins.  There would also be a view (or a postgres base
table) to allow the application to ignore this and still see a single
table.  The view (or base table) would have a trigger that redirects
update/delete/inserts to the local-origin sub-table or returns an error
depending on the row.  In case of error the application would use the
error information to directly connect to the correct remote database.
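
The routing decision such a trigger would make can be sketched roughly
as follows.  This is plain Python standing in for the trigger logic, not
actual trigger code; the sub-table naming scheme (users_part_N), the
WrongOrigin exception, and the zero-based node numbering are all
assumptions made for the example.

```python
class WrongOrigin(Exception):
    """Raised when a write targets a row owned by another node."""

    def __init__(self, pkey_id: int, origin: int):
        super().__init__(f"row {pkey_id} must be written on node {origin}")
        self.pkey_id = pkey_id
        self.origin = origin  # tells the application where to reconnect


def route_write(pkey_id: int, number_of_nodes: int, local_node: int) -> str:
    """Return the local-origin sub-table for the row, or raise WrongOrigin."""
    origin = pkey_id % number_of_nodes
    if origin != local_node:
        # Not ours: report the owning node instead of writing locally.
        raise WrongOrigin(pkey_id, origin)
    return f"users_part_{origin}"  # hypothetical sub-table name
```

An application catching WrongOrigin would read the origin attribute and
connect to that database to retry the write.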

This gets ugly to manage pretty quickly, however, with all the sub-tables
and slony sets.

I'd instead like to extend SlonyI to directly support the above.
Rather than rejecting all changes to slave tables, each node would
determine whether it is the origin for the row.  This could be as simple
as a hash function on the primary key.  Each node is the origin for some
subset of rows and logs and feeds these changes to all other nodes.
Since the user data doesn't have any references between data for more
than one user this should work.  The primary key (or whatever columns
are used to determine the origin node) would not be allowed to change
and only the origin node for that primary key (or whatever columns)
would allow local changes.
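
As a rough model of the behaviour I have in mind (a toy in-memory
simulation, not SlonyI code; the Node class, its method names, and the
zero-based modulo placement are all invented for illustration):

```python
class Node:
    """Toy model of one database node with per-row origin checks."""

    def __init__(self, node_id: int, number_of_nodes: int):
        self.node_id = node_id
        self.number_of_nodes = number_of_nodes
        self.rows = {}  # pkey -> data
        self.log = []   # local changes to feed to all other nodes

    def is_origin(self, pkey: int) -> bool:
        return pkey % self.number_of_nodes == self.node_id

    def local_update(self, old_pkey: int, new_pkey: int, data: str) -> None:
        """Apply a change made by a local application."""
        if old_pkey != new_pkey:
            # The origin-determining column may never change.
            raise ValueError("primary key is not allowed to change")
        if not self.is_origin(old_pkey):
            raise PermissionError(f"node {self.node_id} is not the origin")
        self.rows[old_pkey] = data
        self.log.append((old_pkey, data))

    def apply_replicated(self, pkey: int, data: str) -> None:
        """Apply a change received from the row's origin node."""
        self.rows[pkey] = data


# Two nodes: node 0 owns even keys, node 1 owns odd keys.
a, b = Node(0, 2), Node(1, 2)
a.local_update(4, 4, "alice")
for pkey, data in a.log:
    b.apply_replicated(pkey, data)
```

After this runs both nodes hold row 4, but a local_update for row 4 on
node b would be refused.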

What do you think, both about the proposed SlonyI changes (which I'd
make) and about other possible solutions to my problem?

Thanks,
-- 
Dave Chapeskie
OpenPGP Key ID: 0x3D2B6B34