Fri Jun 13 07:44:49 PDT 2008
- Previous message: [Slony1-general] CREATE SET did not reach every node
- Next message: [Slony1-general] How do I drop rule from replicated table?
Andrew Sullivan wrote:
> On Fri, Jun 13, 2008 at 12:09:01PM +0200, "Stéphane A. Schildknecht" wrote:
>> Unfortunately, one of the nodes (72, see below) did not know about that set, and
>> the subscription failed, leaving the whole replication in an unstable state.
>>
>> I can't do anything, as node 72 is trying to subscribe to an unknown set.
>
> Did you ensure that every node had the subscription before you performed the
> merge? If not, then yes, this is hopelessly broken.
The problem arose *before* I could attempt the merge. Node 72 did not subscribe
to that set because it reported "unknown set". It seems that every node except
72 knew there was a new set with tables in it.
>
>> Do I have any option other than rebuilding the whole replication? That may mean
>> a half-day production break at least...
>
> You have to grovel through the events that 72 is attempting to process, yank
> all the latest events (72 will be really broken now), then drop the tables
> that made up the added set from replication. Probably you'll have to drop
> the 72 node. Then rebuild the set, rebuild 72, and only _then_ once
> everyone has the subscription, perform the merge.
Well, the problem is that I may have made things worse. In fact, I can't drop
node 72 from node 1: the process hangs.
It seems that every node is trying to access a node that no longer exists, so
I'm afraid I can't execute any slonik configuration command...
I now see log lines like these (the French PostgreSQL error reads: schema
"_slonturf" does not exist):
2008-06-13 16:34:43 CEST ERROR remoteListenThread_72: "select
"_slonturf".registerNodeConnection(1); listen "_slonturf_Event"; " - ERREUR:
le schéma « _slonturf » n'existe pas
2008-06-13 16:37:31 CEST ERROR slon_connectdb: PQconnectdb("dbname=turf
host=code port=5432 user=slony password=poiklmnb") failed - could not connect
to server: Connection timed out
Is the server running on host "code" and accepting
TCP/IP connections on port 5432?
2008-06-13 16:38:02 CEST ERROR remoteListenThread_72: "select
"_slonturf".registerNodeConnection(11); unlisten "_slonturf_Event"; " - ERREUR:
le schéma « _slonturf » n'existe pas
...
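
To see at a glance which slon threads are failing, the wrapped log lines above can be regrouped into entries and counted per source. This is only a minimal sketch: the entry layout (timestamp, time zone, level, source, message) is an assumption inferred from the excerpt, and `parse_entries` and `errors_by_source` are hypothetical helper names, not part of Slony-I.

```python
import re
from collections import Counter

# Assumed layout, inferred from the excerpt: an entry starts with
# "YYYY-MM-DD HH:MM:SS TZ LEVEL source: message" and may wrap onto later lines.
ENTRY_START = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\S+) (\w+) ([^:\s]+): ?(.*)$"
)

# Two of the entries quoted above, copied as they appear in the message.
SAMPLE = '''
2008-06-13 16:34:43 CEST ERROR remoteListenThread_72: "select
"_slonturf".registerNodeConnection(1); listen "_slonturf_Event"; " - ERREUR:
le schéma « _slonturf » n'existe pas
2008-06-13 16:38:02 CEST ERROR remoteListenThread_72: "select
"_slonturf".registerNodeConnection(11); unlisten "_slonturf_Event"; " - ERREUR:
le schéma « _slonturf » n'existe pas
'''


def parse_entries(lines):
    """Group wrapped log lines into (timestamp, level, source, message) tuples."""
    entries = []
    for line in lines:
        match = ENTRY_START.match(line.rstrip("\n"))
        if match:
            timestamp, _tz, level, source, message = match.groups()
            entries.append([timestamp, level, source, message])
        elif entries:
            # Continuation of the previous entry: append it to that message.
            entries[-1][3] += " " + line.strip()
    return [tuple(entry) for entry in entries]


def errors_by_source(entries):
    """Count ERROR entries per reporting thread or function."""
    return Counter(src for _ts, level, src, _msg in entries if level == "ERROR")


if __name__ == "__main__":
    for source, count in errors_by_source(parse_entries(SAMPLE.splitlines())).items():
        print(f"{source}: {count}")
```

Run against each node's slon log, a tally like this would show which nodes still try to listen on node 72.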
Node 71 does not complain about not knowing 72, but it doesn't propagate data
to node 11.
So, is there a way to remove all knowledge of node 72 from every node?
Regards,
SAS