Bug 300

Summary: Would like to have some activity captured
Product: Slony-I
Reporter: Christopher Browne <cbbrowne>
Component: slon
Assignee: Christopher Browne <cbbrowne>
Status: ASSIGNED
Severity: enhancement
CC: slony1-bugs
Priority: low
Version: devel
Hardware: All
OS: All

Description Christopher Browne 2013-07-10 12:44:27 UTC
This parallels the sl_components table, except that it captures interesting events rather than interesting state.

Add some statistics capture to a new log table:

- When the node started up (uptime)
- When log switches took place
- Last SYNC per remote node ID
- Timestamp of the last new event seen from a node
Comment 1 Christopher Browne 2013-07-11 13:47:40 UTC
A "first blush" idea is to add a log table that looks a lot like sl_components, but that keeps a log of entries rather than a single per-component state, and to change the function component_state() to add a log entry in addition to capturing the present state.
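
A minimal sketch of that idea, using Python's sqlite3 module as a stand-in for the database layer so it can be run on its own. The table layouts, column names, and the Python component_state() wrapper are assumptions made for illustration; they are not the actual Slony-I schema or the signature of its stored function.

```python
import sqlite3

def make_db():
    """Create an in-memory database with a state table and a log table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Stand-in for sl_components: one row per actor (current state only).
        CREATE TABLE components (
            actor     TEXT PRIMARY KEY,
            activity  TEXT,
            event_ts  TEXT
        );
        -- Hypothetical log table: one row per reported state change.
        CREATE TABLE eventlog (
            id        INTEGER PRIMARY KEY AUTOINCREMENT,
            actor     TEXT,
            activity  TEXT,
            event_ts  TEXT
        );
    """)
    return conn

def component_state(conn, actor, activity, event_ts):
    """Overwrite the actor's present state and append the same facts to the log."""
    with conn:  # both statements commit together
        conn.execute(
            "INSERT OR REPLACE INTO components (actor, activity, event_ts) "
            "VALUES (?, ?, ?)",
            (actor, activity, event_ts),
        )
        conn.execute(
            "INSERT INTO eventlog (actor, activity, event_ts) VALUES (?, ?, ?)",
            (actor, activity, event_ts),
        )

conn = make_db()
component_state(conn, "local_monitor", "thread main loop", "2013-07-11 13:00:00")
component_state(conn, "local_monitor", "thread main loop", "2013-07-11 13:05:00")
# The state table holds only the latest row; the log keeps both reports.
print(conn.execute("SELECT count(*) FROM components").fetchone()[0])
print(conn.execute("SELECT count(*) FROM eventlog").fetchone()[0])
```

The point of the design is that the state table stays small and answers "what is each thread doing now", while the log answers "what happened recently".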

The cleanup thread would be augmented to delete old entries from the log.  In effect, for each actor/event, delete all but the last N entries (say, all but the last 10 entries, 10 being a configurable value).
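
The "all but the last N per actor/event" rule could be sketched as follows, again with sqlite3 standing in for the real database. The eventlog layout, the event_type column, and the helper names are hypothetical, and "last" is taken to mean highest insertion id.

```python
import sqlite3

def make_eventlog():
    """Create an in-memory log table (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE eventlog (
            id          INTEGER PRIMARY KEY AUTOINCREMENT,
            actor       TEXT,
            event_type  TEXT,
            event_ts    TEXT
        )
    """)
    return conn

def log_event(conn, actor, event_type, event_ts):
    """Append one entry to the log."""
    with conn:
        conn.execute(
            "INSERT INTO eventlog (actor, event_type, event_ts) VALUES (?, ?, ?)",
            (actor, event_type, event_ts),
        )

def trim_old_logged_events(conn, keep=10):
    """Delete all but the newest `keep` entries for each (actor, event_type)."""
    with conn:
        conn.execute(
            """
            DELETE FROM eventlog
            WHERE id IN (
                -- An entry is old if at least `keep` newer entries exist
                -- for the same actor and event type.
                SELECT old.id
                FROM eventlog AS old
                WHERE (SELECT count(*)
                       FROM eventlog AS newer
                       WHERE newer.actor = old.actor
                         AND newer.event_type = old.event_type
                         AND newer.id > old.id) >= ?
            )
            """,
            (keep,),
        )

conn = make_eventlog()
for i in range(15):
    log_event(conn, "remote_listener_2", "SYNC", f"2013-07-11 13:{i:02d}:00")
trim_old_logged_events(conn, keep=10)
print(conn.execute("SELECT count(*) FROM eventlog").fetchone()[0])
```

Counting newer rows per entry is quadratic in the log size per group, which is acceptable here only because the trim itself keeps each group small.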

That would cover uptime, SYNC stats, and the last timestamp, as suggested.

It would NOT capture when log switches actually take place; at present, that logic lives inside the function CleanupEvent(), so it isn't directly captured by the monitoring thread.  It should be easy enough to add logic to CleanupEvent() that adds an extra log entry when a log switch actually does take place, in effect capturing the information inside the DB layer.
Comment 2 Christopher Browne 2013-07-11 15:06:20 UTC
An initial swing at this...


Things added:

a) A new table, sl_eventlog

b) A function that adds to it, logEvent()

c) Injected logEvent() into existing function component_state(), so that the existing monitoring of state pushes that data into sl_eventlog

d) Injected some calls into logswitch_finish() so that log switches get captured

e) Added a cleanup function, trimOldLoggedEvents()

For each actor/eventtype, it trims out all entries older than the last N, where N is currently set to 10.

f) Added a call to that function in cleanupEvents(), so that the log gets trimmed.