Steve Singer ssinger at
Mon Mar 14 06:43:54 PDT 2011
On 11-02-25 04:50 PM, bugzilla-daemon at wrote:
> Christopher Browne<cbbrowne at>  changed:
>             What    |Removed                     |Added
> ----------------------------------------------------------------------------
>           Depends on|176                         |
> --- Comment #7 from Christopher Browne<cbbrowne at>  2011-02-25 13:50:28 PST ---
> I have run pgindent against this, to apply typical PG indentation policy.
> Should be ready for review.

I've looked this over. See below

The paragraphs "It might seem somewhat desirable...." I don't think is
really needed in the documentation.  I think that is more approriate as a
comment in the code.  Users shouldn't be bothered with the reasons why
we didn't make it work some other way.

I would make the paragraphs "In order for timestamps..." and "If the 
host where"
the same paragraph.

- We also capture timestamps in sl_event.  I don't think we  use these
timestamps for much but the fact that we store them here is probably
worth a mention.
-We also log timestamps in the slon log, and syncing them up is annoying
without the timestamps matching between slon servers.

How do I disable the monitoring thread 0? -1? The documentation should
mention this.

co_starttime has a default of now().  We we using the time/clock on the
database server or the time/clock on the server slon runs on.  I don't think
we should be mixing them.


On the UPDATE statement in component_state  the where clause won't
update the row if co_starttime >= i_starttime.  I don't like this.
Let's say the clock stays at the same time for two messages because
NTP is adjusting things.  I would still want the second value to be
stored and overwrite the value from the first call.

A function that is supposed to store something but then doesn't because
one of my parameters (i_starttime) was too small and then returns
without an error return code/exception is also very unexpected.

I sort of hinted at this above.  I'd like a way to disable the monitor
thread ie allow 0 to mean disabled.

Line 91:  Here we delete any rows from sl_components for backends that are
no longer active.  This sort of confuses me.

slon 1 - unix pid 1234
    localListener - connects to backend pid 2345 on host db1
    remoteListenerThread_2  connnects to pid 3456 on host db2
    remoteListenerThread_3  connects to pid 4567 on host db3

3456 and 4567 won't be in pg_stat_activity on host db1.  They are backends
running on host db2.   I'm not sure we need/want this delete statement.

Line 113:  I don't like this #define BEGINQUERY.
I'd rather see
const char * begin_query="start transaction;"
I don't see what the macro buys you that a const char * doesn't and the
const char plays nicer with debuggers and is less suspicious to read.
Having said that on line 85 you create a slon string beginquery.  I'd 
rather see the const char * over macro or the slondstring

line 120: The comment here made me think that something
after stack_pop() was doing the locking and that an explicit
unlock is required.  I'd just skip that comment since stack_pop() always
releases the lock it obtains.

Line 173 AND line 177 :
Free monquery (dstring_terminate)

Line 184:
As I mentioned above I don't think the delete query does what we want. I 
think it probably needs to be eliminated.

Line 195:
I'm not sure I see the point in making COMMITQUERY be an alias for 
"commit;" but if there is one I'd rather see a const char * used.

Line 219:
I think monquery needs to be freed in the while loop (I mention it 
above) not outside of it since you do a dstring_init in the while loop 
(alternativly you could allocate it outside the while loop, free it here 
and just reset it in the loop)

Line 227:
I'd rather see this (#define INITIAL_STACK_SIZE) as a static int defined 
at the top of the file. Easier on debuggers.

Line 246:
actor ,activity and event_type can be declared 'const char *"

Line 318:
ns[len]=(char)0;  would normally be written as ns[len]=NULL; in C.

Line 416:
These functions are not used but my real concern is that they access the 
shared variables without aquiring the mutex and they don't specifiy in 
the comments that the caller is responsible for aquiring the mutex.  If 
we are going to keep the functions we should add a comment to that effect.

A general comment:
This monitor thread calls slon_retry() a bunch of times.  We've created 
a whole bunch of new failure points for slon.  I'm not sure issues 
recording the monitor state should result in a slon falling over. I feel 
that if the monitor thread hits an error it can reset the database 
connections for the monitor thread but it should leave the threads doing 
real work alone.   Having said that we might also want a maximum size of 
the queue.  If the queue exceeds a certain size (maybe because the 
monitor thread keeps hitting issues) then we stop putting new  elements 
on the queue or discard the oldest ones.

I would also like to know what the performance impact in terms of system 
load is of running the monitoring stuff.  Ie run some tests without the 
monitoring stuff measure average cpu load and run the same test with the 
monitoring code and measure average cpu load.  I don't think the impact 
is going to be high but we should still measure it.

More information about the Slony1-hackers mailing list