Sun Aug 1 10:55:09 PDT 2010
- Next message: [Slony1-general] Eek Part II - No, SSL was not the entire problem (Replication COPY fails repeatedly)
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
So last night, if you remember my previous missive, I thought I had
found the issue with a big table copy, which appeared to involve
Postgresql's SSL support, and went to bed with the copy running with
SSL off.
Well, I was wrong, as I found out this morning after the thing ran
for more than four hours - probably about the amount of time required to
actually complete the job. (It actually failed TWICE and restarted
overnight.)
Aug 1 06:32:25 dbms TICKER[77422]: [153-1] CONFIG remoteWorkerThread_3:
copy table "public"."images"
Aug 1 06:32:25 dbms TICKER[77422]: [154-1] CONFIG remoteWorkerThread_3:
Begin COPY of table "public"."images"
Aug 1 10:09:08 dbms TICKER[77422]: [155-1] ERROR remoteWorkerThread_3:
copy from stdin on local node - PGRES_FATAL_ERROR server closed the
connection unexpectedly
Aug 1 10:09:08 dbms TICKER[77422]: [155-2] This probably means the
server terminated abnormally
Aug 1 10:09:08 dbms TICKER[77422]: [155-3] before or while
processing the request.
Aug 1 10:09:08 dbms TICKER[77422]: [156-1] WARN remoteWorkerThread_3:
data copy for set 1 failed 1 times - sleep 15 seconds
Aug 1 10:09:08 dbms TICKER[77422]: [157] ERROR remoteWorkerThread_3:
"rollback transaction" PGRES_FATAL_ERROR
Aug 1 10:09:08 dbms TICKER[72097]: [5-1] INFO slon: retry requested
Aug 1 10:09:08 dbms TICKER[72097]: [6-1] INFO slon: notify worker
process to shutdown
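To put a number on how long this attempt ran before the connection dropped, the two relevant timestamps can be subtracted. The following is only a minimal sketch: the helper names (parse_stamp, copy_duration) are hypothetical, the year is assumed to be 2010 because syslog omits it, month names are assumed to be in English, and a run that crosses a year boundary is not handled.

```python
import re
from datetime import datetime

# Two lines from the slon log above, with the mail wrapping undone.
LOG = '''Aug 1 06:32:25 dbms TICKER[77422]: [154-1] CONFIG remoteWorkerThread_3: Begin COPY of table "public"."images"
Aug 1 10:09:08 dbms TICKER[77422]: [155-1] ERROR remoteWorkerThread_3: copy from stdin on local node - PGRES_FATAL_ERROR server closed the connection unexpectedly
'''

# Leading syslog timestamp, e.g. "Aug 1 06:32:25".
STAMP = re.compile(r"^(\w{3}\s+\d{1,2}\s+\d\d:\d\d:\d\d)\s")


def parse_stamp(line, year=2010):
    """Parse the leading syslog timestamp; the year is an assumption passed in."""
    match = STAMP.match(line)
    if match is None:
        return None
    return datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")


def copy_duration(log_text):
    """Return the time between 'Begin COPY' and the first COPY error after it."""
    start = None
    for line in log_text.splitlines():
        if "Begin COPY of table" in line:
            start = parse_stamp(line)
        elif start is not None and "ERROR" in line and "copy from stdin" in line:
            return parse_stamp(line) - start
    return None


print(copy_duration(LOG))  # prints 3:36:43
```

This covers only the attempt shown in the excerpt, not the earlier failure and restart overnight.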
The problem is that I really don't have anything untoward in the
postgres log file this time, except for:
Aug 1 11:09:11 tickerforum postgres[39981]: [6-1] LOG: unexpected EOF
on client connection
Aug 1 11:09:11 tickerforum postgres[38657]: [6-1] LOG: unexpected EOF
on client connection
Aug 1 11:09:11 tickerforum postgres[39585]: [6-1] LOG: unexpected EOF
on client connection
Aug 1 11:09:11 tickerforum postgres[39816]: [6-1] LOG: unexpected EOF
on client connection
Aug 1 11:09:11 tickerforum postgres[39191]: [6-1] LOG: unexpected EOF
on client connection
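Pulling the process IDs out of those lines makes it easier to check them against whatever was connected at the time. This is a rough sketch only: the function name eof_pids_by_time is hypothetical, and the pattern assumes the syslog layout shown above (timestamp, host, then postgres[PID]).

```python
import re
from collections import defaultdict

# The five postgres log lines above, with the mail wrapping undone.
PG_LOG = [
    "Aug 1 11:09:11 tickerforum postgres[39981]: [6-1] LOG: unexpected EOF on client connection",
    "Aug 1 11:09:11 tickerforum postgres[38657]: [6-1] LOG: unexpected EOF on client connection",
    "Aug 1 11:09:11 tickerforum postgres[39585]: [6-1] LOG: unexpected EOF on client connection",
    "Aug 1 11:09:11 tickerforum postgres[39816]: [6-1] LOG: unexpected EOF on client connection",
    "Aug 1 11:09:11 tickerforum postgres[39191]: [6-1] LOG: unexpected EOF on client connection",
]

# Capture the timestamp and the PID in "postgres[PID]" for EOF messages only.
EOF_RE = re.compile(
    r"^(\w{3}\s+\d{1,2}\s+\d\d:\d\d:\d\d)\s+\S+\s+postgres\[(\d+)\]:"
    r".*unexpected EOF on client connection"
)


def eof_pids_by_time(lines):
    """Group the PIDs that logged 'unexpected EOF' by their timestamp."""
    grouped = defaultdict(list)
    for line in lines:
        match = EOF_RE.match(line)
        if match:
            grouped[match.group(1)].append(int(match.group(2)))
    return dict(grouped)


for stamp, pids in eof_pids_by_time(PG_LOG).items():
    print(stamp, len(pids), sorted(pids))
```

All five PIDs land in the same second, which fits several connections being dropped at once rather than one at a time.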
Those APPEAR, from the process IDs, to be the IDs of the SLONs that were
running at the time from the other side, implying that the server didn't
barf, SLONY did and dropped the connection without first saying goodbye.
The problem is that there is literally nothing in the SLON log above or
in the Postgres log implying a problem until the dump.
I also had a slon core dump with a coincident time stamp, but it was
on the MASTER (not the receiver machine!) and the backtrace was invalid
and thus useless (stack smashed?). If I'm understanding the process
correctly from my perusal of the code, that doesn't make any sense,
since the client SLON is the one that "pulls" the data in this case, and
thus the server SLONs shouldn't be involved in the transfer itself.
Version 2.0.4, Postgres 8.4.4, OS is FreeBSD 8.0/amd64.
Note that this table is quite large (~30GB) and contains BYTEA fields,
with some instances of those fields being very large (megabyte-size
elements are not unusual.) There are no known problems with the data
integrity, and the application using the master copy is running fine. A
manual pg_dump of the table in question ("scp'd" over and then
loaded with "psql test-database <file") loads perfectly fine on the
replication target machine, so I'm quite comfortable that the data in
the table itself is fine.
-- Karl