Peter Geoghegan peter.geoghegan86 at gmail.com
Thu Jul 30 04:26:56 PDT 2009
Sorry, but it seems I was wrong when I said that the problem was
fixed. I assumed that because the event lag returned to zero when new
paths were stored per Christopher's direction, the problem was
corrected. However, the problem persists.

When I restart a slave database (in the following example node 2),
replication works fine (at least as far as can be immediately
observed), but sl_status shows:

1;3;38689;"2009-07-30 12:11:51.796";38688;"2009-07-30 12:12:02.428316";"2009-07-30 12:11:41.859";1;"00:00:14.015"
1;2;38689;"2009-07-30 12:11:51.796";38605;"2009-07-30 11:52:35.119048";"2009-07-30 11:58:05.734";84;"00:13:50.14"

Node 2's event lag grows and grows (until the slon service is
restarted, at which time it returns to zero, just as before).
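
For anyone who wants to watch for this automatically, here is a rough
sketch (not part of Slony-I) that parses semicolon-delimited sl_status
rows like the two above and reports receivers whose event lag exceeds a
threshold. The column names are my assumption of the sl_status view
layout, and the function name and the threshold of 10 events are
purely illustrative. It only handles HH:MM:SS intervals, so a lag of a
day or more would need extra parsing.

```python
import csv
from datetime import timedelta

# Assumed column order of the sl_status view (nine fields per row).
COLUMNS = [
    "st_origin", "st_received", "st_last_event", "st_last_event_ts",
    "st_last_received", "st_last_received_ts",
    "st_last_received_event_ts", "st_lag_num_events", "st_lag_time",
]


def parse_interval(text):
    """Convert an HH:MM:SS[.fraction] string into a timedelta."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes),
                     seconds=float(seconds))


def lagging_receivers(rows, max_events=10):
    """Return (receiver, lag_events, lag_time) for rows over the threshold.

    max_events is a hypothetical threshold chosen for illustration.
    """
    flagged = []
    for fields in csv.reader(rows, delimiter=";", quotechar='"'):
        record = dict(zip(COLUMNS, fields))
        lag_events = int(record["st_lag_num_events"])
        if lag_events > max_events:
            flagged.append((int(record["st_received"]), lag_events,
                            parse_interval(record["st_lag_time"])))
    return flagged


# The two rows quoted above, each joined onto a single line.
ROWS = [
    '1;3;38689;"2009-07-30 12:11:51.796";38688;'
    '"2009-07-30 12:12:02.428316";"2009-07-30 12:11:41.859";1;"00:00:14.015"',
    '1;2;38689;"2009-07-30 12:11:51.796";38605;'
    '"2009-07-30 11:52:35.119048";"2009-07-30 11:58:05.734";84;"00:13:50.14"',
]

if __name__ == "__main__":
    for receiver, events, lag in lagging_receivers(ROWS):
        print(f"node {receiver}: {events} events behind, lag {lag}")
```

With the two rows above and the default threshold, only receiver 2
(84 events behind) is flagged; node 3 is one event behind and passes.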

When I run test_slony_state-dbi.pl while the event lag continues to
grow, it outputs the following:

peter at peter-development-machine:~/slony1-2.0.2/tools>
./test_slony_state-dbi.pl --host=10.0.0.80 --database=lustre
--cluster=lustre_cluster --user=postgres --password=my_password
DSN: dbi:Pg:dbname=lustre;host=10.0.0.80;user=postgres;password=my_password;
===========================
Rummage for DSNs
=============================
Query:

   select p.pa_server, p.pa_conninfo
   from "_lustre_cluster".sl_path p
--   where exists (select * from "_lustre_cluster".sl_subscribe s where
--                          (s.sub_provider = p.pa_server or s.sub_receiver = p.pa_server) and
--                          sub_active = 't')
   group by pa_server, pa_conninfo;


Tests for node 1 - DSN = dbi:Pg:dbname=lustre host=10.0.0.80 user=postgres password=my_password
========================================
pg_listener info:
Pages: 0
Tuples: 0

Size Tests
================================================
       sl_log_1         0  0.000000
       sl_log_2         0  0.000000
      sl_seqlog         0  0.000000

Listen Path Analysis
===================================================
No problems found with sl_listen

--------------------------------------------------------------------------------
Summary of event info
 Origin  Min SYNC  Max SYNC Min SYNC Age Max SYNC Age
================================================================================
      1     38605     38699     00:00:00     00:15:00    0
      2        20        20     01:08:00     01:08:00    1
      3        30        30     01:02:00     01:02:00    1


---------------------------------------------------------------------------------
Summary of sl_confirm aging
   Origin   Receiver   Min SYNC   Max SYNC  Age of latest SYNC  Age of eldest SYNC
=================================================================================
        1          2      38605      38605      00:20:00      00:20:00    0
        1          3      38627      38698      00:00:00      00:11:00    0
        2          1         20         20      01:03:00      01:03:00    1
        2          3         20         20      01:02:00      01:02:00    1
        3          1         30         30      01:02:00      01:02:00    1
        3          2         30         30      01:08:00      01:08:00    1


------------------------------------------------------------------------------

Listing of old open connections on node 1
       Database             PID            User    Query Age     Query
================================================================================


Tests for node 3 - DSN = dbi:Pg:dbname=lustre_slave host=10.0.0.82 user=postgres password=my_password
========================================
pg_listener info:
Pages: 0
Tuples: 0

Size Tests
================================================
       sl_log_1         0  0.000000
       sl_log_2         0  0.000000
      sl_seqlog         0  0.000000

Listen Path Analysis
===================================================
No problems found with sl_listen

--------------------------------------------------------------------------------
Summary of event info
 Origin  Min SYNC  Max SYNC Min SYNC Age Max SYNC Age
================================================================================
      1     38605     38699     00:00:00     00:15:00    0
      2        20        20     01:08:00     01:08:00    1
      3        30        30     01:02:00     01:02:00    1


---------------------------------------------------------------------------------
Summary of sl_confirm aging
   Origin   Receiver   Min SYNC   Max SYNC  Age of latest SYNC  Age of eldest SYNC
=================================================================================
        1          2      38605      38605      00:21:00      00:21:00    0
        1          3      38629      38699      00:00:00      00:11:00    0
        2          1         20         20      01:03:00      01:03:00    1
        2          3         20         20      01:03:00      01:03:00    1
        3          1         30         30      01:03:00      01:03:00    1
        3          2         30         30      01:08:00      01:08:00    1


------------------------------------------------------------------------------

Listing of old open connections on node 3
       Database             PID            User    Query Age     Query
================================================================================


Tests for node 2 - DSN = dbi:Pg:dbname=lustre_slave host=10.0.0.81 user=postgres password=my_password
========================================
pg_listener info:
Pages: 0
Tuples: 0

Size Tests
================================================
       sl_log_1         0  0.000000
       sl_log_2         0  0.000000
      sl_seqlog         0  0.000000

Listen Path Analysis
===================================================
No problems found with sl_listen

--------------------------------------------------------------------------------
Summary of event info
 Origin  Min SYNC  Max SYNC Min SYNC Age Max SYNC Age
================================================================================
      1     38573     38699    -00:05:00     00:15:00    0
      2        20        21     00:15:00     01:03:00    0
      3        30        30     00:57:00     00:57:00    1


---------------------------------------------------------------------------------
Summary of sl_confirm aging
   Origin   Receiver   Min SYNC   Max SYNC  Age of latest SYNC  Age of eldest SYNC
=================================================================================
        1          2      38607      38699      00:00:00      00:15:00    0
        1          3      38573      38698     -00:05:00      00:15:00    0
        2          1         20         20      00:57:00      00:57:00    1
        2          3         20         20      00:57:00      00:57:00    1
        3          1         30         30      00:57:00      00:57:00    1
        3          2         30         30      01:02:00      01:02:00    1


------------------------------------------------------------------------------

Listing of old open connections on node 2
       Database             PID            User    Query Age     Query
================================================================================

peter at peter-development-machine:~/slony1-2.0.2/tools>

Any further help you could offer is greatly appreciated,

Regards,
Peter Geoghegan

