Tue Mar 15 10:12:16 PDT 2011
- Previous message: [Slony1-hackers] [Slony1-bugs] [Bug 175] Monitoring cluster better
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Mon, Mar 14, 2011 at 9:43 AM, Steve Singer <ssinger at ca.afilias.info> wrote: > On 11-02-25 04:50 PM, bugzilla-daemon at main.slony.info wrote: >> >> http://www.slony.info/bugzilla/show_bug.cgi?id=175 >> >> Christopher Browne<cbbrowne at ca.afilias.info> changed: >> >> What |Removed |Added >> >> ---------------------------------------------------------------------------- >> Depends on|176 | >> >> --- Comment #7 from Christopher Browne<cbbrowne at ca.afilias.info> >> 2011-02-25 13:50:28 PST --- >> I have run pgindent against this, to apply typical PG indentation policy. >> >> Should be ready for review. >> > > I've looked this over. See below > > > > > monitoring.sgml > --------------------- > The paragraphs "It might seem somewhat desirable...." I don't think is > really needed in the documentation. I think that is more approriate as a > comment in the code. Users shouldn't be bothered with the reasons why > we didn't make it work some other way. > > > I would make the paragraphs "In order for timestamps..." and "If the host > where" > the same paragraph. No objections https://github.com/cbbrowne/slony1-engine/commit/72ed0337641eefda095cfcb706562d71d9c03413 > prerequisites.sgml > ------------------ > - We also capture timestamps in sl_event. I don't think we use these > timestamps for much but the fact that we store them here is probably > worth a mention. > -We also log timestamps in the slon log, and syncing them up is annoying > without the timestamps matching between slon servers. No objections https://github.com/cbbrowne/slony1-engine/commit/72ed0337641eefda095cfcb706562d71d9c03413 > slonconf.sgml > ---------------- > How do I disable the monitoring thread 0? -1? The documentation should > mention this. > > slony1_base.sql > ------------------- > co_starttime has a default of now(). We we using the time/clock on the > database server or the time/clock on the server slon runs on. I don't think > we should be mixing them. No objections https://github.com/cbbrowne/slony1-engine/commit/c6082bc77f460e985c7a2891c45269c9648347c9 > slony1_funcs.sql > ----------------------- > > On the UPDATE statement in component_state the where clause won't > update the row if co_starttime >= i_starttime. I don't like this. > Let's say the clock stays at the same time for two messages because > NTP is adjusting things. I would still want the second value to be > stored and overwrite the value from the first call. > > A function that is supposed to store something but then doesn't because > one of my parameters (i_starttime) was too small and then returns > without an error return code/exception is also very unexpected. Unfortunately, we're working with a *stack*, which means we pull entries off in reverse order. It's definitely troublesome to change this. Per conversation, I'll not change this for now... > confoptions.c > -------------------- > I sort of hinted at this above. I'd like a way to disable the monitor > thread ie allow 0 to mean disabled. Yep, will do. > monitor_thread.c > ------------------------------ > Line 91: Here we delete any rows from sl_components for backends that are > no longer active. This sort of confuses me. > > slon 1 - unix pid 1234 > localListener - connects to backend pid 2345 on host db1 > remoteListenerThread_2 connnects to pid 3456 on host db2 > remoteListenerThread_3 connects to pid 4567 on host db3 > > 3456 and 4567 won't be in pg_stat_activity on host db1. They are backends > running on host db2. I'm not sure we need/want this delete statement. Removed. > Line 113: I don't like this #define BEGINQUERY. > I'd rather see > const char * begin_query="start transaction;" > I don't see what the macro buys you that a const char * doesn't and the > const char plays nicer with debuggers and is less suspicious to read. > Having said that on line 85 you create a slon string beginquery. I'd rather > see the const char * over macro or the slondstring Fair enough. > line 120: The comment here made me think that something > after stack_pop() was doing the locking and that an explicit > unlock is required. I'd just skip that comment since stack_pop() always > releases the lock it obtains. Fair enough. > Line 173 AND line 177 : > Free monquery (dstring_terminate) Added an else clause for this. > Line 184: > As I mentioned above I don't think the delete query does what we want. I > think it probably needs to be eliminated. Removed. > Line 195: > I'm not sure I see the point in making COMMITQUERY be an alias for "commit;" > but if there is one I'd rather see a const char * used. Changed to slon dstring, as with begin; > Line 219: > I think monquery needs to be freed in the while loop (I mention it above) > not outside of it since you do a dstring_init in the while loop > (alternativly you could allocate it outside the while loop, free it here and > just reset it in the loop) Yep, it's allocated and deallocated inside the while loop. I have added a free on commitquery, to parallel the one for beginquery. > Line 227: > I'd rather see this (#define INITIAL_STACK_SIZE) as a static int defined at > the top of the file. Easier on debuggers. Done. > Line 246: > actor ,activity and event_type can be declared 'const char *" Good call, done. > Line 318: > ns[len]=(char)0; would normally be written as ns[len]=NULL; in C. Hmm. Warning here.. monitor_thread.c: In function 'monitor_state': monitor_thread.c:307: warning: assignment makes integer from pointer without a cast This isn't really stating a pointer; we are just wanting to assign a string null value, which isn't quite the same as pointer NULL. Using... ns[len] = '\0'; > Line 416: > These functions are not used but my real concern is that they access the > shared variables without aquiring the mutex and they don't specifiy in the > comments that the caller is responsible for aquiring the mutex. If we are > going to keep the functions we should add a comment to that effect. I'll add the mutex to stack_dump(), and comment on the need for the mutex to guard entry_dump(). Stopping here, for the moment... The above issues are addressed via patches up to https://github.com/cbbrowne/slony1-engine/commit/d00574d2d1793a372467bc103a7772030fd939ab I'll attach the following two issues as two further comments to the bug. > A general comment: > This monitor thread calls slon_retry() a bunch of times. We've created a > whole bunch of new failure points for slon. I'm not sure issues recording > the monitor state should result in a slon falling over. I feel that if the > monitor thread hits an error it can reset the database connections for the > monitor thread but it should leave the threads doing real work alone. > Having said that we might also want a maximum size of the queue. If the > queue exceeds a certain size (maybe because the monitor thread keeps hitting > issues) then we stop putting new elements on the queue or discard the > oldest ones. > > I would also like to know what the performance impact in terms of system > load is of running the monitoring stuff. Ie run some tests without the > monitoring stuff measure average cpu load and run the same test with the > monitoring code and measure average cpu load. I don't think the impact is > going to be high but we should still measure it.
- Previous message: [Slony1-hackers] [Slony1-bugs] [Bug 175] Monitoring cluster better
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the Slony1-hackers mailing list