[Slony1-general] Modifying SLON_DATA_FETCH_SIZE

Thu Jan 17 07:57:46 PST 2008

Note that I've changed the subject line to make it more relevant...

Cyril SCETBON <cyril.scetbon at free.fr> writes:
> Here are the plots of several metrics taken with the original slon
> binary (25 september 2007) and with the new slon binary (including
> modifications of static variables, 16 october 2007)

The graphs aren't labeled clearly enough for me to figure out what's
what...

Is the first one, where "delay" times seem to stay low but where
there are pretty regular spikes for everything else, the modified
slon?  And does the second one, where there are fewer but generally
rather higher spikes, represent the "stock" slon?

If what I *think* I am seeing is correct, then it looks as though
there may be some value in increasing the default value for
SLON_DATA_FETCH_SIZE, but I think I need to understand this better.

You might want to comment on what some of the lines represent, and on
how you interpret them.  Also, it's not clear what you set
SLON_DATA_FETCH_SIZE to.

What would seem ideal to me would be to have 3 or 4 graphs, where
there is an attempt to keep as many similarities as possible...  Thus,
my "ideal" would involve the following sorts of properties:

 - Preferably, each run would be at about the same time of day, or
   whatever is relevant such that we can expect to see similar
   performance characteristics;

 - It would be nice to have some information on each graph indicating
   amounts of replication traffic (e.g. - numbers of tuples getting
   replicated) so that we can see how well they compare.

 - Multiple runs, and hence multiple graphs, varying on SLON_DATA_FETCH_SIZE:
   - One with the "baseline" of 100
   - One with SLON_DATA_FETCH_SIZE = 400
   - One with SLON_DATA_FETCH_SIZE = 1600
   - One with SLON_DATA_FETCH_SIZE = 3200

  Those numbers scale up by a factor of 4 for the first two steps (and
  by 2 for the last), so that we can see a pretty wide range of increase.

 - If you can track the size of the slon processes, that may also be
   useful; it might turn out that increasing SLON_DATA_FETCH_SIZE to
   5000 has wonderful performance effects as long as you have 4GB of
   memory available, but that the slon will start swapping if you have
   more modest hardware :-(.

 That may be a lot of work, which I obviously can't make you do :-).
 It may be that further explanations of what is in the two existing
 graphs will tell us enough to come up with a better value than
 SLON_DATA_FETCH_SIZE = 100.
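
For the process-size tracking suggested above, here is a minimal
sketch of one way to sample the resident memory of a slon process.  It
assumes a Linux host, where /proc/<pid>/status carries a "VmRSS" line;
the function names are hypothetical and nothing here is part of
Slony-I itself.

```python
import re


def parse_vm_rss_kb(status_text):
    """Return the VmRSS value (in kB) found in the text of
    /proc/<pid>/status, or None if there is no such line."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid):
    """Read the resident set size of a running process.

    Assumption: Linux-style /proc filesystem; other platforms
    would need a different source (e.g. ps output).
    """
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vm_rss_kb(status_file.read())
```

Calling read_rss_kb() on each slon PID at a fixed interval during a
run, and logging the result next to the other metrics, would give a
memory curve to set against each SLON_DATA_FETCH_SIZE value.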

My expectation is that we'll discover that most of the benefits,
i.e. most of the reshaping of performance, come from the early
increases, and that this comes at only modest memory consumption
cost, while further improvements would be costlier.
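
One way to test that expectation once the runs exist is to ask which
is the smallest fetch size that already captures most of the total
improvement.  The sketch below is only an illustration: the function
name, the 90% threshold, and the choice of a "lower is better" metric
(say, average replication lag) are all assumptions, and the inputs
would have to come from real measurements.

```python
def smallest_sufficient_fetch_size(metric_by_fetch_size, fraction=0.9):
    """Pick the smallest fetch size that achieves `fraction` of the
    total improvement seen across all runs.

    metric_by_fetch_size maps a SLON_DATA_FETCH_SIZE value to a
    measured metric where lower is better (an assumption).
    The smallest fetch size is treated as the baseline.
    """
    sizes = sorted(metric_by_fetch_size)
    baseline = metric_by_fetch_size[sizes[0]]
    best = min(metric_by_fetch_size.values())
    total_gain = baseline - best
    if total_gain <= 0:
        # No larger fetch size improved on the baseline.
        return sizes[0]
    for size in sizes:
        gain = baseline - metric_by_fetch_size[size]
        if gain >= fraction * total_gain:
            return size
    # Only reachable if fraction > 1; fall back to the largest size.
    return sizes[-1]
```

If the answer comes out at the second step of the series, that would
support the "early increases" guess; if it comes out at the largest
value, the series probably needs extending.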

If that turns out to be the case, then the right answer may be to bump
up the default value from 100 to 400, for everyone, and to turn it
into a tunable parameter.  I'd like to have a way to draw that
conclusion!
-- 
let name="cbbrowne" and tld="linuxdatabases.info" in String.concat "@" [name;tld];;
http://cbbrowne.com/info/spiritual.html
'Typos in FINNEGANS WAKE? How could you tell?' -- Kim Stanley Robinson