| Summary: | Monitoring cluster better | | |
|---|---|---|---|
| Product: | Slony-I | Reporter: | Christopher Browne <cbbrowne> |
| Component: | core scripts | Assignee: | Christopher Browne <cbbrowne> |
| Status: | RESOLVED FIXED | | |
| Severity: | enhancement | CC: | slony1-bugs |
| Priority: | low | | |
| Version: | devel | | |
| Hardware: | All | | |
| OS: | All | | |
| URL: | http://wiki.postgresql.org/wiki/SlonyBrainstorming#Slon_monitoring | | |
| Bug Depends on: | 167 | | |
| Bug Blocks: | | | |
Description
Christopher Browne, 2010-12-03 13:45:56 UTC
Note that the cluster analysis portion has been put into bug #176. Set up a branch for bug #175: https://github.com/cbbrowne/slony1-engine/tree/bug175

More diagrams that might be useful:
- For each set, which nodes are permissible failover targets?
- For each set, which nodes are vulnerable to loss on failover?

Sample output:

```
-> % psql -d slonyregress2 -c "select * from _slony_regress1.sl_components ;"
       co_actor       | co_pid | co_node | co_connection_pid | co_activity |      co_starttime      |  co_event  | co_eventtype
----------------------+--------+---------+-------------------+-------------+------------------------+------------+--------------
 local_sync           |   1399 |       0 |              1410 |             | 2011-02-14 17:37:32+00 |            |
 local_monitor        |   1399 |       0 |              1411 |             | 2011-02-14 17:37:32+00 |            |
 remote listener      |   1399 |       1 |              1412 |             | 2011-02-14 17:37:32+00 |            |
 local_cleanup        |   1399 |       0 |              1413 |             | 2011-02-14 17:37:32+00 |            |
 local_listen         |   1399 |       2 |              1403 | SYNC        | 2011-02-14 17:37:50+00 |            | SYNC
 remoteWorkerThread_1 |   1399 |       1 |              1409 | SYNC        | 2011-02-14 17:37:56+00 | 5000000053 | SYNC
(6 rows)
```

Suggestion from Jan: implement the collection of entries as a stack. When pushing, check whether there is already an entry for "my" thread; if so, update that entry in place rather than pushing a new one (sketched just below).
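For concreteness, here is a minimal sketch of that push-with-overwrite idea in C. Only `stack_lock` appears in the actual slon source (see the diff later in this thread); everything else here (`monitor_entry`, `monitor_push`, the single `activity` field standing in for the real actor/node/event columns) is hypothetical, and allocation-failure handling is omitted:

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct
{
	pthread_t	thread;			/* which slon thread this entry is for */
	char		activity[64];	/* what that thread reports doing */
} monitor_entry;

static monitor_entry *stack = NULL;
static size_t	stack_size = 0;
static size_t	stack_alloc = 0;
static pthread_mutex_t stack_lock = PTHREAD_MUTEX_INITIALIZER;

static void
monitor_push(const char *activity)
{
	pthread_t	self = pthread_self();
	size_t		i;

	pthread_mutex_lock(&stack_lock);

	/* If "my" thread already has an entry, update it in place
	 * instead of pushing a duplicate. */
	for (i = 0; i < stack_size; i++)
	{
		if (pthread_equal(stack[i].thread, self))
		{
			snprintf(stack[i].activity, sizeof(stack[i].activity),
					 "%s", activity);
			pthread_mutex_unlock(&stack_lock);
			return;
		}
	}

	/* No entry for this thread yet: push one, doubling the array
	 * whenever it fills up. */
	if (stack_size == stack_alloc)
	{
		stack_alloc = stack_alloc ? stack_alloc * 2 : 8;
		stack = realloc(stack, stack_alloc * sizeof(monitor_entry));
	}
	stack[stack_size].thread = self;
	snprintf(stack[stack_size].activity,
			 sizeof(stack[stack_size].activity), "%s", activity);
	stack_size++;

	pthread_mutex_unlock(&stack_lock);
}
```

The payoff of the overwrite branch is that each thread contributes at most one pending row per flush, which is what keeps the database write rate low (measured later in this thread).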
I started taking a poke at adding this monitoring to Slonik, and that looks liable to be troublesome and/or overly intrusive. The trouble is that for this to be useful, Slonik would need to open extra connections to databases and perform many COMMITs during the course of a script, which is going to lead to a flurry of extra DB connections. I'm not sure the would-be gain justifies that trouble. It also looks like it requires quite a lot of extra instrumentation code in slonik.c. So the "first shot" at this does not change slonik.

I have run pgindent against this, to apply typical PG indentation policy. Should be ready for review.

(In reply to comment #7)
> I have run pgindent against this, to apply typical PG indentation policy.
>
> Should be ready for review.

Review at http://lists.slony.info/pipermail/slony1-hackers/2011-March/000446.html

(In reply to comment #8)
> Review at
> http://lists.slony.info/pipermail/slony1-hackers/2011-March/000446.html

I have responded to this on the list; see:
http://lists.slony.info/pipermail/slony1-hackers/2011-March/000447.html

There are patches for virtually all of the items mentioned in the review:

- https://github.com/cbbrowne/slony1-engine/commit/c6082bc77f460e985c7a2891c45269c9648347c9
- https://github.com/cbbrowne/slony1-engine/commit/72ed0337641eefda095cfcb706562d71d9c03413
- https://github.com/cbbrowne/slony1-engine/commit/7a3c5f54dcf72bcf67a49f6ed5b80ed673fc0ed5
- https://github.com/cbbrowne/slony1-engine/commit/0543c96506eb55012c4366f78009f886b9464d59
- https://github.com/cbbrowne/slony1-engine/commit/03f52daf1fb7ae2d819ecd0c411342a594f6884d
- https://github.com/cbbrowne/slony1-engine/commit/d00574d2d1793a372467bc103a7772030fd939ab

There were two other issues mentioned, which I'll quote as items to specifically respond to.

Per Steve Singer <http://lists.slony.info/pipermail/slony1-hackers/2011-March/000446.html>:

> A general comment: This monitor thread calls slon_retry() a bunch of times.
> We've created a whole bunch of new failure points for slon. I'm not sure
> issues recording the monitor state should result in a slon falling over.
> I feel that if the monitor thread hits an error, it can reset the database
> connections for the monitor thread, but it should leave the threads doing
> real work alone. Having said that, we might also want a maximum size for
> the queue. If the queue exceeds a certain size (maybe because the monitor
> thread keeps hitting issues), then we stop putting new elements on the
> queue, or discard the oldest ones.

Per Steve Singer <http://lists.slony.info/pipermail/slony1-hackers/2011-March/000446.html>:

> I would also like to know what the performance impact in terms of system
> load is of running the monitoring stuff. I.e., run some tests without the
> monitoring stuff and measure average CPU load, then run the same test with
> the monitoring code and measure average CPU load. I don't think the impact
> is going to be high, but we should still measure it.

Per Steve's recommendation, monitoring thread problems aren't necessarily serious problems warranting restarting the slon. This patch implements that:
https://github.com/cbbrowne/slony1-engine/commit/d1a5be153e51cd10ac12a3dcb9cbb63856a18306

Note that there are still some slon_retry() calls remaining, in the scenario where the slon process runs out of memory. That seems pretty fatal to me...

I realized that there are cases where we have a horrid memory leak: if the monitoring thread isn't working, then we shouldn't be collecting monitoring entries. Thus, I made two changes (sketched at the end of this comment):

1. When we exit from the monitoring thread's main loop, turn monitoring off.
2. In the "push" function, if monitoring is shut off, return immediately; do not attempt to queue up monitoring data.

https://github.com/cbbrowne/slony1-engine/commit/a61f4642ed29bc727d2361a4000500b81573ea11

I realized that the one entry for the monitoring thread will indicate the time when the slon daemon started. Document that...
https://github.com/cbbrowne/slony1-engine/commit/d3f66cccb54168ca8d6a80261337a66228720cd8

I don't see a particularly good way to readily analyze CPU loading from this addition, and I'm not sure that's even the best measurement in any case. As an alternative, I instrumented the code in two places, with the following patch:

```diff
diff --git a/src/slon/monitor_thread.c b/src/slon/monitor_thread.c
index 108b338..d88abfd 100644
--- a/src/slon/monitor_thread.c
+++ b/src/slon/monitor_thread.c
@@ -180,6 +180,8 @@ monitorThread_main(void *dummy)
 			free(state.activity);
 		if (state.event_type != NULL)
 			free(state.event_type);
+		slon_log(SLON_INFO, "monitor_thread - record - %s\n",
+				 dstring_data(&monquery));
 		res = PQexec(dbconn, dstring_data(&monquery));
 		if (PQresultStatus(res) != PGRES_TUPLES_OK)
 		{
@@ -384,6 +386,9 @@ monitor_state(const char *actor, int node, pid_t conn_pid, /* @null@ */ const ch
 	{
 		tos->event_type = NULL;
 	}
+	slon_log(SLON_INFO, "monitor_state(pid:%d node:%d connpid:%d event:%lld actor[%s] activity[%s] event_type[%s]\n",
+			 tos->pid, tos->node, tos->conn_pid, tos->event,
+			 tos->actor, tos->activity, tos->event_type);
 	pthread_mutex_unlock(&stack_lock);
 }
```

a) Each time an entry is pushed onto the stack, it is recorded in the log.
b) Each time an entry is about to be submitted to the DB, the query is recorded in the log.

In running a typical regression test (e.g., test1), I saw the following results. The test ran from Mar 17 13:18:00 to Mar 17 13:19:06, i.e. 66 seconds. Counting the logged queries:

```
grep monitor_thread /var/log/messages | grep select | wc -l
```

- There were 214 calls to the component_state() function, across two nodes, so 214 / (66 * 2) = 1.6 calls per second per node.
- There were 471 calls to monitor_state() across those nodes; these induced the 214 calls.

That suggests to me that Jan's recommendation, to optimize by allowing successive calls to overwrite a repeated entry for the same thread, is doing a lot of good, reducing the DB load by more than half. I find it compelling to expect this to be a pretty nominal addition to system load for Slony.
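Returning to the memory-leak fix above: change 2 amounts to an early-return guard in the push function. A minimal sketch, reusing `monitor_push()` from the earlier sketch; the flag name is hypothetical:

```c
#include <stdbool.h>

/* Set to false when the monitor thread exits its main loop (change 1). */
static volatile bool monitoring_enabled = true;

static void
monitor_push_guarded(const char *activity)
{
	/*
	 * If monitoring has been shut off, return immediately: queueing
	 * entries that no thread will ever drain is just a memory leak.
	 */
	if (!monitoring_enabled)
		return;

	monitor_push(activity);		/* normal push-with-overwrite path */
}
```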
Poking at the table:

```
slonyregress1@localhost-> analyze verbose _slony_regress1.sl_components ;
INFO:  analyzing "_slony_regress1.sl_components"
INFO:  "sl_components": scanned 1 of 1 pages, containing 6 live rows and 17 dead rows; 6 rows in sample, 6 estimated total rows
ANALYZE
slonyregress1@localhost-> \c slonyregress2
You are now connected to database "slonyregress2" as user "postgres".
slonyregress2@localhost-> analyze verbose _slony_regress1.sl_components ;
INFO:  analyzing "_slony_regress1.sl_components"
INFO:  "sl_components": scanned 1 of 1 pages, containing 6 live rows and 11 dead rows; 6 rows in sample, 6 estimated total rows
ANALYZE
```

Thing remembered... sl_components should be added to the set of tables that the cleanup thread considers maintaining.
https://github.com/cbbrowne/slony1-engine/commit/cd3519bba5a03ce2f577747324f4a9ebc697d317

Some additional comments based on the latest changes (these are mostly issues that I just didn't catch in my first review):

1. We also need to include support in the upgradeSchema() function to take a 2.0.x Slony install and add in the new tables for monitoring.

2. In the monitor thread, dbconn is never closed. Especially in the cases where we 'stop' the monitoring thread, we should shut down the connection (if we are able). We also MIGHT want to think about emptying the queue at this point: we aren't going to empty it later, and if the database connection was stuck for a while before it timed out, leading to the error, the queue might have grown in size. I also still wonder if we shouldn't be putting a maximum size on the queue. I know we discussed this the other day and it started to sound complicated, but I think there must be a way of doing this, throwing away old entries, that isn't complicated. If I can find some time I'll see if I can come up with something; until then, I don't think that needs to hold this patch up. I also bet we will get complaints from people along the lines of "my slon has been running fine for 3 months, but the sl_components table stopped updating a few weeks ago", where some network blip disabled the monitor thread but the rest of slony continued to chug along. Though I think this situation is better than the slon restarting, as we had before.

3. At line 189, dstring_free(monquery) before breaking out of the loop on the error (a sketch of this fix follows below).

The discussion of how we should be searching for performance regressions in Slony is maybe outside the scope of this bug, other than that we should deal with it before we release 2.1 (looking to see whether any of the 2.1 changes are a performance issue; I have ideas on how to do this, but Jan might already have a setup for testing this).
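The fix item 3 asks for is small enough to sketch. The identifiers below (`res`, `dbconn`, `monquery`, `dstring_data`, `slon_log`) appear in the instrumentation diff earlier in this thread; the surrounding control flow is assumed rather than quoted from monitor_thread.c:

```c
/* Error path in the monitor thread's work loop, around line 189. */
res = PQexec(dbconn, dstring_data(&monquery));
if (PQresultStatus(res) != PGRES_TUPLES_OK)
{
	slon_log(SLON_ERROR, "monitorThread: query failed - %s\n",
			 PQresultErrorMessage(res));
	PQclear(res);
	dstring_free(&monquery);	/* free the query buffer before abandoning the loop */
	break;
}
```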
(In reply to comment #15)
> 1. We also need to include support in the upgradeSchema() function to take a
> 2.0.x slony install and add in the new tables for monitoring

I thought I had done that. I'll add in the table if it's missing.
https://github.com/cbbrowne/slony1-engine/commit/d51049b54f014ee4f00f974fc9629a7d64ca9727

> 2. In the monitor thread, dbconn is never closed. Especially in the cases
> where we 'stop' the monitoring thread, we should shut down the connection
> (if we are able). We also MIGHT want to think about emptying the queue at
> this point: we aren't going to empty it later, and if the database
> connection was stuck for a while before it timed out, leading to the error,
> the queue might have grown in size.

dbconn isn't closed directly - it's closed via slon_disconnect(conn).

As for emptying the queue, I don't see much value in it. The condition on the loop is on the waits returning SCHED_STATUS_OK. I don't think you get to ask it to terminate, or is there a signal you could throw that would kick just the one thread?

> I also still wonder if we shouldn't be putting a maximum size on the queue.
> I know we discussed this the other day and it started to sound complicated,
> but I think there must be a way of doing this, throwing away old entries,
> that isn't complicated. If I can find some time I'll see if I can come up
> with something; until then, I don't think that needs to hold this patch up.

Idle thought: pick two numbers, m and n, and, once we've reached size m, generate a warning every time the queue size mod n = 0. For instance, with m=100 and n=10: once the queue is of size >100, complain every time the size increases and (size % 10) = 0. Actually, we're doubling the size of the array each time, so it's perfectly reasonable to gripe in the logs any time we resize and the size > 100. I'll do that... (see the sketch at the end of this comment).

> I also bet we will get complaints from people along the lines of "my slon
> has been running fine for 3 months, but the sl_components table stopped
> updating a few weeks ago", where some network blip disabled the monitor
> thread but the rest of slony continued to chug along. Though I think this
> situation is better than the slon restarting, as we had before.

That's the risk any time we add anything :-).

> 3. At line 189, dstring_free(monquery) before breaking out of the loop on
> the error.

Fixed.
https://github.com/cbbrowne/slony1-engine/commit/24da0a137120f4032858230836b39ed14da755fb

> The discussion of how we should be searching for performance regressions in
> Slony is maybe outside the scope of this bug, other than that we should deal
> with it before we release 2.1 (looking to see whether any of the 2.1 changes
> are a performance issue; I have ideas on how to do this, but Jan might
> already have a setup for testing this).

I'm happy with that. I have done a little bit of analysis suggesting that this change should have minimal impact on performance, which is certainly what we expected, and it seems to make sense to look at performance more globally. It's acceptable for us to lose a little here and there on one feature or another if:

a) we gain material reliability improvements (i.e., a reasonable value exchange), or
b) there are other changes that provide offsetting performance increases. (Bug #167, I'm looking at you! :-))
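A minimal sketch of that gripe-on-resize idea, reusing the hypothetical stack globals from the first sketch in this thread; the threshold constant and the SLON_WARN log level are illustrative assumptions:

```c
#define MONITOR_GRIPE_THRESHOLD 100

static void
monitor_stack_grow(void)
{
	/* The entry array doubles whenever it fills up... */
	stack_alloc = stack_alloc ? stack_alloc * 2 : 8;
	stack = realloc(stack, stack_alloc * sizeof(monitor_entry));

	/*
	 * ...so warning only on resize is a natural rate limit: past the
	 * threshold, we complain once per doubling rather than once per
	 * pushed entry.
	 */
	if (stack_alloc > MONITOR_GRIPE_THRESHOLD)
		slon_log(SLON_WARN,
				 "monitor queue grew to %d slots - is the monitor thread draining it?\n",
				 (int) stack_alloc);
}
```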
I don't think you get to ask > it to terminate, or is there a signal you could throw that would kick just the > one thread? > Lines: 121, 191, and to a lesser extend 212 and 95 are all on code paths where we break out of the loop but don't free the space already used by the queue. > Actually, we're doubling the size of the array each time, so it's perfectly > reasonable to gripe in the logs any time we resize and the size > 100. I'll do > that... For the moment I guess this is fine to commit, At some point I will try to find time to implement a properly bounded queue based on the number of threads such that old entries get thrown away when the queue hits the bound size. |