Christopher Browne cbbrowne
Tue Jan 11 21:11:16 PST 2005
James Black wrote:

> I have to admit to a feeling of helplessness; I'm not sure even where
> to begin investigating why we have 400,000 rows in sl_log_1, nor how
> to ascertain if some of those rows have been orphaned.

Grep for "clean" in the logs; that'll give you some idea of what the 
cleanup thread is doing, at least to the point of knowing generally how 
much work it's doing.

Here are some relevant queries from the cleanup thread:

slonydb=# select ev_origin, ev_seqno, ev_minxid from sl_event
          where (ev_origin, ev_seqno) in
            (select ev_origin, min(ev_seqno) from sl_event
             where ev_type = 'SYNC' group by ev_origin);
 ev_origin | ev_seqno | ev_minxid 
-----------+----------+------------
        78 |     7269 | 31038690
         4 |    19865 | 302790144
         1 |  4499099 | 1622648266
         3 |    21098 | 470999377
       501 |    25805 | 83296945
(5 rows)

What then gets purged from sl_log_1 is based on that ev_minxid value.

The specific query:
                     "delete from %s.sl_log_1 "
                     "where log_origin = '%s' "
                     "and log_xid < '%s'; "
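To see what that statement looks like once the placeholders are filled in, here is a small Python sketch that renders the same template with the origin = 1 values from the sl_event output above. The namespace "_mycluster" is a made-up placeholder (substitute your own cluster's schema), and render_cleanup_delete is a hypothetical helper, not part of Slony. It only builds a string for inspection; I would not run the result by hand against a live node.

```python
# Template text copied from the cleanup-thread fragment quoted above.
DELETE_TEMPLATE = (
    "delete from %s.sl_log_1 "
    "where log_origin = '%s' "
    "and log_xid < '%s'; "
)


def render_cleanup_delete(namespace, origin, ev_minxid):
    """Fill in the three %s slots: schema name, origin node, XID cutoff."""
    return DELETE_TEMPLATE % (namespace, origin, ev_minxid)


if __name__ == "__main__":
    # '"_mycluster"' is a hypothetical schema name used only for illustration.
    print(render_cleanup_delete('"_mycluster"', 1, 1622648266))
```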

What would be sweet would be to peek at the logs and see how that 
compares; alas, this doesn't work directly :-(.

slonydb=# select log_origin, min(log_xid), max(log_xid), count(*) from 
sl_log_1 group by log_origin;
ERROR:  function min(xxid) does not exist

You might try for limited information:
slonydb=#  select log_origin, log_xid from sl_log_1 order by log_origin, 
log_xid limit 5;

 log_origin |  log_xid  
------------+------------
          1 | 1622706537
          1 | 1622706537
          1 | 1622706537
          1 | 1622706537
          1 | 1622706537
(5 rows)

slonydb=#  select log_origin, log_xid from sl_log_1 order by log_origin 
desc, log_xid desc limit 5;
 log_origin |  log_xid  
------------+------------
          1 | 1622831823
          1 | 1622831823
          1 | 1622831823
          1 | 1622831823
          1 | 1622831823
(5 rows)

That output indicates that the earliest and latest XIDs are all indeed 
newer than the ev_minxid for origin = 1.
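The comparison can be spelled out as a rough Python sketch using the numbers pasted above. It treats the XIDs as plain integers, which is an assumption: it ignores XID wraparound and whatever ordering the xxid type really implements. purge_status is a hypothetical helper name.

```python
# ev_minxid per origin, copied from the sl_event query output above.
EV_MINXID = {
    78: 31038690,
    4: 302790144,
    1: 1622648266,
    3: 470999377,
    501: 83296945,
}

# Lowest and highest log_xid seen for origin 1 in the two "limit 5" queries.
LOG_XID_RANGE = {1: (1622706537, 1622831823)}


def purge_status(ev_minxid, lo, hi):
    """Classify log rows in [lo, hi] against the cutoff log_xid < ev_minxid.

    Plain integer comparison only; no wraparound handling.
    """
    if hi < ev_minxid:
        return "all purgeable"
    if lo >= ev_minxid:
        return "none purgeable"
    return "partly purgeable"


if __name__ == "__main__":
    for origin, (lo, hi) in LOG_XID_RANGE.items():
        print(origin, purge_status(EV_MINXID[origin], lo, hi))
```

With these values, origin 1 comes out as "none purgeable": every row sampled is at or above the cutoff, so the delete would not be expected to remove any of them until the oldest SYNC in sl_event moves forward.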

Another thought...  Is it possible that slons are getting killed off
often enough that they never get around to finishing a cleanup loop?

Something has got to be preventing the purge of old entries from taking 
place; these are all the things that determine what gets purged and how.

I can't pursue this infinitely; hopefully this points in some useful 
directions.  Maybe something's funny with sl_event; that's another place 
to peek at.

