59 – slony doesn't escape varchar fields properly, making it impossible to replicate varchar or bytea columns in some cases


Status:	ASSIGNED

Alias:	None

Product:	Slony-I
Classification:	Unclassified
Component:	slon
Version:	devel
Hardware:	PC Linux

Importance:	medium normal
Assignee:	Slony Bugs List

URL:

Depends on:
Blocks:

Reported:	2008-10-13 09:15 UTC by Grzegorz Jaskiewicz
Modified:	2010-08-13 19:41 UTC
CC List:	1 user

See Also:


Description Grzegorz Jaskiewicz 2008-10-13 09:15:44 UTC

I have a very simple table:

create table foo(z varchar not null, bar int not null, primary key(z));

Now I try to insert a bunch of data into it using:
insert into foo(bar, z) select generate_series(1,100) as bar, encode( gen_random_bytes(100), 'escape') as z;

When slon runs its insert, the database replies:
ERROR:  invalid byte sequence for encoding "UTF8": 0xdf15

That is quite right, since the data is not escaped. I am inserting random data on purpose, because we do have such data here, and it works fine with my C code, which has replicated the data so far.
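
To show why the server rejects the byte pair in the error message, here is a minimal Python sketch that checks the two reported bytes against the UTF-8 rules. It only looks at the bytes quoted in the error (0xdf 0x15); the helper name is_valid_utf8 is made up for this illustration and is not part of Slony-I or PostgreSQL.

```python
# The two bytes quoted in the server's error message: 0xdf15.
reported = bytes([0xDF, 0x15])


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


lead, follow = reported

# 0xDF is 0b11011111, which matches the 110xxxxx pattern of a lead byte
# that starts a two-byte UTF-8 sequence.
is_two_byte_lead = (lead & 0xE0) == 0xC0

# The byte after such a lead must match 10xxxxxx (0x80..0xBF).
# 0x15 is 0b00010101, so it is not a continuation byte.
is_continuation = (follow & 0xC0) == 0x80

print("lead byte starts a 2-byte sequence:", is_two_byte_lead)
print("second byte is a continuation byte:", is_continuation)
print("pair is valid UTF-8:", is_valid_utf8(reported))
```

Random bytes will hit this kind of broken sequence quickly, so any connection whose client encoding is UTF8 will refuse them when they arrive as text.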

Furthermore, I think that if slony used PQexecParams instead of PQexec to insert data, this problem wouldn't have arisen in the first place.
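
As a rough illustration of the difference between splicing a value into the statement text and passing it as a separate parameter, here is a small sketch. It uses Python's built-in sqlite3 module as a stand-in, not libpq or Slony-I, so it only demonstrates the general idea behind a PQexecParams-style call; the sample value is invented. It shows the quoting side only and makes no claim about whether parameters would also avoid the encoding error above.

```python
import sqlite3

# In-memory stand-in database; the table mirrors the one in the report.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table foo(z text not null, bar int not null, primary key(z))"
)

# Invented sample value containing a quote and a backslash.
value = "it's \\ tricky"

# Splicing the value into the SQL text: the embedded quote ends the
# string literal early and the statement fails to parse.
try:
    conn.execute("insert into foo(z, bar) values ('" + value + "', 1)")
    spliced_ok = True
except sqlite3.Error:
    spliced_ok = False

# Binding the value: it travels separately from the statement text,
# so no escaping of the value is needed.
conn.execute("insert into foo(z, bar) values (?, ?)", (value, 2))
stored = conn.execute("select z from foo where bar = 2").fetchone()[0]

print("spliced insert succeeded:", spliced_ok)
print("bound value round-trips:", stored == value)
```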

Comment 1 Christopher Browne 2008-10-15 08:07:04 UTC

Followup question:

What are the encodings of the two databases (i.e., origin and subscriber)?

My suspicion is that either:

a) Both use UTF-8, but the origin is on PostgreSQL 7.4 or 8.0, which had problems with its handling of Unicode encoding validation, or

b) The origin is using SQL-ASCII or similar, and the subscriber is using Unicode/UTF-8.

If either of those speculations is in fact the case, then this may NOT be a bug; Slony-I does not promise to do inter-encoding translations.

Comment 2 Christopher Browne 2010-06-02 12:26:35 UTC

Question to consider:

Should the slon verify that it uses a single common encoding on all DB connections?

Comment 3 Christopher Browne 2010-06-02 12:34:00 UTC

We intend to address this in a different way, by ensuring that slon processes are using the same client encoding on all databases.

Comment 4 Jan Wieck 2010-08-13 19:41:55 UTC

Changing importance to "normal". This can be avoided by using

alter user slony set client_encoding to <common-encoding>;

on all servers in the Slony cluster and NOT having any of the databases use server_encoding SQL_ASCII (which is discouraged anyway).

We still want to fix it, but since this is a completely new feature, we won't implement it before 2.1.