Mon Jan 29 18:34:36 PST 2007
- Previous message: [Slony1-general] Quizzical merge set error
- Next message: [Slony1-general] getting postgresql server crashes with slony
Hi,
I'm having some serious problems ever since I installed slony
(earlier today). Let me give some background information first.
Today I installed slony (1.2.6) on an 8.1.4 server (node 1) and am
replicating to an 8.2.1 server (node 2). It's a simple master (node
1) -> slave (node 2) setup. Nothing fancy. I'm getting ready to
migrate to 8.2 and was planning on using slony to do it. I've
used slony with mixed success in the past, and I thought I'd give it
another go. Of course everything went fine in our test environment ;).
PostgreSQL has been running happily without a problem. It has never
crashed since I installed 8.1. This server handles about 20 million
queries a day, so it's not sitting around idle. Most of them are
selects, but there is a constant trickle of updates, inserts and
deletes. Within 2 hours of starting up slony, node 1 has crashed
twice. The postgres logs look like this:
LOG: server process (PID 29842) was terminated by signal 11
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server
process
DETAIL: The postmaster has commanded this server process to roll
back the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
.
.
.
HINT: In a moment you should be able to reconnect to the database
and repeat your command.
LOG: all server processes terminated; reinitializing
LOG: database system was interrupted at 2007-01-29 19:45:03 CST
LOG: checkpoint record is at 37C/C5F428E4
.
.
.
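To line the crashes up against the slon logs, something like the following could pull the crashed backend PIDs and signal numbers out of the postgres log. This is only a minimal sketch: the function name `find_crashes` is hypothetical, and it assumes the log lines have the same form as the "terminated by signal" line quoted above.

```python
import re

# Matches lines such as:
#   LOG: server process (PID 29842) was terminated by signal 11
# (assumption: the log uses exactly this wording, as in the excerpt above)
CRASH_RE = re.compile(
    r"server process \(PID (\d+)\) was terminated by signal (\d+)"
)


def find_crashes(lines):
    """Return a list of (pid, signal) tuples, one per backend crash found."""
    crashes = []
    for line in lines:
        match = CRASH_RE.search(line)
        if match:
            crashes.append((int(match.group(1)), int(match.group(2))))
    return crashes


if __name__ == "__main__":
    sample = [
        "LOG: server process (PID 29842) was terminated by signal 11",
        "LOG: terminating any other active server processes",
    ]
    for pid, signal in find_crashes(sample):
        print(f"backend {pid} died with signal {signal}")
```

In practice the function would be fed the lines of the real log file, for example via `open(path)`, rather than the sample list.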
The node 1 slon log at the time of the crash is as follows:
2007-01-29 19:42:25 CST DEBUG1 cleanupThread: 0.312 seconds for
delete logs
2007-01-29 19:45:09 CST FATAL syncThread: "start transaction;set
transaction isolation level serializable;select last_value from
"_mobycluster".sl_action_seq;" - server closed the connection
unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
2007-01-29 19:45:09 CST DEBUG1 slon: retry requested
2007-01-29 19:45:09 CST INFO remoteListenThread_2: disconnecting
from '***************'
The node 2 slon log at the time of the crash is as follows:
2007-01-29 19:42:16 CST DEBUG1 cleanupThread: 0.196 seconds for
delete logs
2007-01-29 19:45:09 CST ERROR remoteListenThread_1: "select
ev_origin, ev_seqno, ev_timestamp, ev_minxid, ev_maxxid,
ev_xip, ev_type, ev_data1, ev_data2, ev_data3,
ev_data4, ev_data5, ev_data6, ev_data7, ev_data8 from
"_mobycluster".sl_event e where (e.ev_origin = '1' and e.ev_seqno >
'8720') order by e.ev_origin, e.ev_seqno" - server closed the
connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
2007-01-29 19:45:19 CST DEBUG1 remoteListenThread_1: connected to
'****************'
2007-01-29 19:45:26 CST ERROR remoteWorkerThread_1: "start
transaction; set enable_seqscan = off; set enable_indexscan = on; "
PGRES_FATAL_ERROR server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
2007-01-29 19:45:26 CST ERROR remoteWorkerThread_1: "close LOG; "
PGRES_FATAL_ERROR
2007-01-29 19:45:26 CST ERROR remoteWorkerThread_1: "rollback
transaction; set enable_seqscan = default; set enable_indexscan =
default; " PGRES_FATAL_ERROR
2007-01-29 19:45:26 CST ERROR remoteWorkerThread_1: helper 1
finished with error
Is it possible that slony is causing these crashes? Since
$libdir/slony1_funcs.so is loaded into the postgres processes, I
think it's certainly possible. I also think that a coincidence
is a little too much of a reach. However, I would love to hear what
the experts think. What's the best way to track this down? Any
advice on what I should do? I'm very close to uninstalling slony,
but if there is something I can do to help identify the problem so
that it can be fixed, I'd like to help.
I realize that I'm not running 8.1.6. I've checked the release
notes for .5 and .6 and see only one reference to a crash fix:
"Disallow aggregate functions in UPDATE commands, except within
sub-SELECTs (Tom)". I'm certainly not doing this kind of update, and
I'm pretty sure slony isn't either.
On a much more benign note, I keep seeing a bunch of these in the
slon logs for both node 1 and node 2:
NOTICE: Slony-I: cleanup stale sl_nodelock entry for pid=8572
CONTEXT: SQL statement "SELECT "_mobycluster".cleanupNodelock()"
PL/pgSQL function "cleanupevent" line 77 at perform
Are they something to be worried about?
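One way to check whether these notices are just fallout from the crashes would be to count them per pid and compare the pids against the crashed backends. The following is a rough sketch only: the function name `count_stale_nodelock` is hypothetical, and it assumes the notices always have the form shown above.

```python
import re
from collections import Counter

# Matches lines such as:
#   NOTICE: Slony-I: cleanup stale sl_nodelock entry for pid=8572
# (assumption: the notice wording is exactly as quoted above)
STALE_RE = re.compile(r"cleanup stale sl_nodelock entry for pid=(\d+)")


def count_stale_nodelock(lines):
    """Count stale sl_nodelock cleanup notices, keyed by the pid they name."""
    counts = Counter()
    for line in lines:
        match = STALE_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "NOTICE: Slony-I: cleanup stale sl_nodelock entry for pid=8572",
        'CONTEXT: SQL statement "SELECT "_mobycluster".cleanupNodelock()"',
        "NOTICE: Slony-I: cleanup stale sl_nodelock entry for pid=8572",
    ]
    for pid, count in count_stale_nodelock(sample).items():
        print(f"pid {pid}: {count} stale-entry notice(s)")
```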
Thanks for any help in advance.
Brian Hirt