Bug 40 - libpq thread test insufficient
Summary: libpq thread test insufficient
Status: RESOLVED FIXED
Alias: None
Product: Slony-I
Classification: Unclassified
Component: slon (show other bugs)
Version: 2.0
Hardware: All Linux
: high normal
Assignee: Christopher Browne
URL:
Depends on:
Blocks:
 
Reported: 2008-03-10 07:24 UTC by Peter Eisentraut
Modified: 2010-09-13 13:42 UTC (History)
1 user (show)

See Also:


Attachments
additional proposed patch (1.87 KB, patch)
2010-09-07 15:34 UTC, Steve Singer
Details
Add a runtime check for libpq threading (868 bytes, patch)
2010-09-08 11:38 UTC, Steve Singer
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Peter Eisentraut 2008-03-10 07:24:48 UTC
config/acx_libpq.m4 is apparently of the opinion that it only needs to verify that libpq is compiled thread-safely on aix, mingw, and solaris.  I think that is insufficient.  I have found a posting about this on the mailing list:

"""
Date: Donnerstag, Juni 14, 2007 11:26:04 -0400
From: Andrew Sullivan <ajs@crankycanuck.ca>
To: slony1-general@gborg.postgresql.org
Cc:
Subject: Re: [Slony1-general] idle in transaction on subscriber

On Thu, Jun 14, 2007 at 04:19:20PM +0200, Jacques Caron wrote:
> PostgreSQL library (libpq) was not compiled with
> --enable-thread-safety, and this causes random breakage in slon. The

You're not actually supposed to need to use that switch on platforms
where it's not checked, on the grounds that thread safety is _always_
on there.  What's your platform and compilier?  It may be that we
need more tests.
"""

It is definitely wrong to assume that platforms other than those listed have thread-safety always enabled.  Linux is a counterexample.

I think it would be most prudent to require thread-safety on all platforms except those where you have documented evidence that it is not necessary.  I don't know any among the more common Unix-like platforms that would fit that description, however.
Comment 1 Christopher Browne 2008-03-12 09:02:40 UTC
It may be insufficient, but there hasn't been a specific problem case reported yet.

If someone discovers that [Platform X] requires threading enabled, we can in the short term tell them they need that, and add that platform to the list that we know about.

I don't see particular value in pursuing this theoretical insufficiency when it is something that may be fairly easily addressed.
Comment 2 Jan Wieck 2010-06-02 12:23:18 UTC
We will enable this for 2.0 as soon as the 2.0.4 release is out.
Comment 3 Christopher Browne 2010-08-19 10:19:14 UTC
I have set up a branch (of REL_2_0_STABLE) for this:

http://github.com/cbbrowne/slony1-engine/commits/bug40

As per Peter's comments, I'm treating Linux as the only platform where we assume that libpq is compiled thread-safely.  If there are other platforms where that should be deemed to be the case, please let me know.
Comment 4 Steve Singer 2010-08-24 11:44:53 UTC
This seems  okay to me.
I'm taking it at someones word that Linux doesn't need --enable-thread-safety
Comment 5 Peter Eisentraut 2010-08-24 11:59:20 UTC
There is possibly some confusion here.  --enable-thread-safety always creates a functionally different libpq on any platform.  And a libpq compiled without that flag is not suitable for multithreaded applications on any platform.  At least a glance at the source code indicates as much.  It might be that you don't actually need the parts in question, but I'm not sure.

What you might be thinking of is that on some platforms (including Linux, excluding BSD, I think), most/all libc functions are reentrant by default, without the need for special compiler flags.  But that just means that the --enable-thread-safety flag has to do less work, it doesn't make the entire library thread-safe automatically.
Comment 6 Jan Wieck 2010-08-24 12:48:11 UTC
Correct, Peter.

Slony is multi threaded, but never do two threads use the same database connection or any object derived there of. Maybe that is why it does work on some platforms that don't actually produce a 100% thread safe libpq without --enable-thread-safety.

What I would suggest is that we collect a slightly longer list of OS's, where this is irrelevant for Slony and then make this a test with 3 possible outcomes. Error out if --enable-thread-safety wasn't used where we know it needs to be (aix, solaris), Don't care if it was (Linux, FreeBSD) and give a WARNING for everything else.

This has IMHO time until 2.0.6
Comment 7 Steve Singer 2010-08-24 13:01:04 UTC
While slony does not have multiple threads using the same libpq connection it does have multiple threads accessing libpq on different connections at the same time.

Does this mean that we do need to be running with --enable-thread-saftey ?

I haven't looked very closely at the parts of libpq that change based on ENABLE_THREAD_SAFETY and it looks like we do provide callbacks to openssl for locking.   I suspect slony needs this on ALL platforms to be safe.

There have been reports I've seen of slony replication breaking over ssl connections maybe this could have been related.

I think we should modify the patch to require slony to be built with a threadsafe libpq on all platforms.
Comment 8 Jan Wieck 2010-08-24 13:04:12 UTC
After looking at some of the libpq code I have to agree with that.

This is a heavy handed change inside of a stable branch, but it looks safer.


Jan
Comment 9 Christopher Browne 2010-08-24 13:21:39 UTC
(In reply to comment #8)
> After looking at some of the libpq code I have to agree with that.
> 
> This is a heavy handed change inside of a stable branch, but it looks safer.

I have updated the patch to be "more heavy handed."

http://github.com/cbbrowne/slony1-engine/commit/f9a90879df8ce4764cbdcc380c1dd8e7a5623dae

I'd be concerned that this may adversely affect folks on Linux, in the scenario where we're requiring "ETS" which then requires people to rebuild packages.

- I took a peek at Debian, and the pg_config there does indicate --enable-thread-safety, so it should be safe-ish.

- Ubuntu derives from Debian, and the Ubuntu instance I have handy does "ETS" too.

I don't have Fedora, RHAS/RHES, SuSE handy to me.
Comment 10 Christopher Browne 2010-08-24 15:28:25 UTC
(In reply to comment #9)
> - I took a peek at Debian, and the pg_config there does indicate
> --enable-thread-safety, so it should be safe-ish.
> 
> - Ubuntu derives from Debian, and the Ubuntu instance I have handy does "ETS"
> too.

Note: Jan took a look at FreeBSD, and found that Ports indicates ETS, so FreeBSD shouldn't be adversely affected.
Comment 11 Steve Singer 2010-08-25 11:19:10 UTC
Conclusion: Go!
Comment 12 Christopher Browne 2010-08-25 12:48:19 UTC
Here is the patch that I propose to put into 2.0/HEAD:

http://github.com/cbbrowne/slony1-engine/commit/7e74997c0f8427c08c2885de2a3218ca333da632

Please verify, and pass back to me.
Comment 13 Peter Eisentraut 2010-08-27 13:25:02 UTC
--enable-thread-safety is the default in PG 9.0, so it might be enabled even though you don't see the option in `pg_config --configure`.
Comment 14 Steve Singer 2010-08-30 07:35:49 UTC
Chris,

Per Peter's comment, the patch fails when we try to build against a 9.0 with default(ish) options to ./configure (for postgres) even though PG is built with threading.

What I think we need to do is have configure compile up a program that links against libpq and calls 
int PQisthreadsafe();
and make sure the return code is 1.
Comment 15 Christopher Browne 2010-08-30 09:36:10 UTC
(In reply to comment #14)
> Chris,
> 
> Per Peter's comment, the patch fails when we try to build against a 9.0 with
> default(ish) options to ./configure (for postgres) even though PG is built with
> threading.
> 
> What I think we need to do is have configure compile up a program that links
> against libpq and calls 
> int PQisthreadsafe();
> and make sure the return code is 1.

OK, I have added a test that seems to work.

http://github.com/cbbrowne/slony1-engine/commit/99507a92d54063d038ef572bcfe78916f589886d

I do a bit of trickery with LIBS, which I don't particularly like.

The test is certainly open to improvement.
Comment 16 Jan Wieck 2010-08-30 10:30:22 UTC
We need to make sure that this test is building the test program against the libpq-fe that is pointed out by pg_config. Other than that, asking the library itself is IMHO the right thing.
Comment 17 Christopher Browne 2010-08-30 11:52:38 UTC
(In reply to comment #16)
> We need to make sure that this test is building the test program against the
> libpq-fe that is pointed out by pg_config. Other than that, asking the library
> itself is IMHO the right thing.

Yes, we surely are.  The values of LIBS and such are used; extracting from config.log, for my install:

configure:5523: checking PostgreSQL for thread-safety
configure:5545: gcc -o conftest -g -O2  -I/var/lib/postgresql/dbs/postgresql-8.4/include/ -I/var/lib/postgresql/dbs/postgresql-8.4/include/server/  -L/var/lib/postgresql/dbs/postgresql-8.4/lib/ conftest.c  -lpq >&5
Comment 18 Steve Singer 2010-08-30 12:17:36 UTC
I've tested this on AIX and the patch seems to behave as expected.

On linux when i compile with --disable-thread-safety it still 'seems' to give me a libpq that has thread safety enabled.  Maybe this is where the 'ignore on linux' behaviour we previously in the check came from?
Comment 20 Steve Singer 2010-09-07 15:32:48 UTC
I am reopening this.

If libpq isn't installed in a system path (ie /usr/lib) then the configure time check fails.  Furthermore while it links against the correct libpq at runtime it might pick up the wrong libpq.

Patch proposed at http://github.com/ssinger/slony1-engine/commit/aeeaae2692930536961e3fe013a0ac7f14cd91a4
Comment 21 Steve Singer 2010-09-07 15:34:37 UTC
Created attachment 64 [details]
additional proposed patch

A proposed patch to get the test program to find the correct libpq at test runtime
Comment 22 Jan Wieck 2010-09-08 11:16:02 UTC
This all leads me to think that a test during the startup of slon itself to double check that it isn't using the "wrong" libpq at runtime would be a good idea too.
Comment 23 Steve Singer 2010-09-08 11:38:41 UTC
Created attachment 65 [details]
Add a runtime check for libpq threading
Comment 24 Steve Singer 2010-09-08 11:39:28 UTC
A patch to do the runtime check is attached and available at 
http://github.com/ssinger/slony1-engine/commit/e7959571d8975414e05a19acdcefdd20683c1014
Comment 25 Christopher Browne 2010-09-08 11:59:40 UTC
(In reply to comment #24)
> A patch to do the runtime check is attached and available at 
> http://github.com/ssinger/slony1-engine/commit/e7959571d8975414e05a19acdcefdd20683c1014

I like the idea...  I don't think we need to prescribe --enable-thread-safety, just to mention that is what frequently is lacking.

This patch changes your messages a bit.

http://github.com/cbbrowne/slony1-engine/commit/3d40cceaf6fd442fa3653c85616f542f4a1f3d96
Comment 26 Steve Singer 2010-09-13 13:42:56 UTC
Fix applied
2ee266a20be37964b6aa0df1554a0b5998fbba4a