Niels Breet postgres
Wed Nov 29 13:28:45 PST 2006
I had this problem:
2006-11-29 20:51:41 CET FATAL  slon: sched_wakeuppipe create failed -(24)
Too many open files

When slon is started and the local database is down, slon now tries
to reconnect. Before 1.2 we would bail out, but now we restart the
thread. slon_terminate_worker() doesn't close the currently open
sched_wakeuppipe, so every restart leaks its two file descriptors and
after enough loops creating the pipe fails with error 24 (too many
open files).

I just added this to the end of slon_terminate_worker():
        close(sched_wakeuppipe[0]);
        close(sched_wakeuppipe[1]);

I'm sure there is a better solution, but that solves the problem.
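
A slightly more defensive variant of the same idea, as a rough sketch
only: close each end only if it is open and mark it closed afterwards,
so calling the cleanup twice is harmless. Apart from sched_wakeuppipe,
every name here (wakeup_pipe_open, wakeup_pipe_close, simulate_restarts)
is made up for illustration and is not taken from the slon source.

```c
#include <stdio.h>
#include <unistd.h>

/* Stand-in for slon's scheduler wakeup pipe; -1 means "not open". */
static int sched_wakeuppipe[2] = {-1, -1};

/* Hypothetical helper: create the pipe. Returns 0 on success, -1 on error. */
int wakeup_pipe_open(void)
{
    return pipe(sched_wakeuppipe);
}

/* Hypothetical helper: close both ends if open, then mark them closed. */
void wakeup_pipe_close(void)
{
    for (int i = 0; i < 2; i++) {
        if (sched_wakeuppipe[i] >= 0) {
            close(sched_wakeuppipe[i]);
            sched_wakeuppipe[i] = -1;
        }
    }
}

/*
 * Simulate `restarts` worker restarts. With close_on_terminate == 0 the
 * pipe is never closed, so every restart leaks two descriptors, which is
 * the situation described above. Returns how many restarts succeeded
 * before creating the pipe failed.
 */
int simulate_restarts(int restarts, int close_on_terminate)
{
    int done = 0;

    for (int i = 0; i < restarts; i++) {
        if (wakeup_pipe_open() != 0) {
            perror("sched_wakeuppipe create failed");
            break;
        }
        done++;
        if (close_on_terminate)
            wakeup_pipe_close();
    }
    return done;
}
```

With close_on_terminate set, the loop can run for far more iterations
than the per-process descriptor limit allows without closing.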

Maybe we also need to sleep a little as slon is retrying *very* rapidly.
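
One possible way to do that, again only a sketch and not what slon
actually does: back off a little longer after each failed attempt, up
to a cap. The function names and the 1 second / 30 second values below
are assumptions picked for the example.

```c
#include <unistd.h>

/*
 * Hypothetical helper: delay in seconds before retry number `attempt`
 * (counting from 0). Starts at `base`, doubles on each attempt, and
 * never exceeds `cap`.
 */
unsigned int retry_delay_seconds(unsigned int attempt,
                                 unsigned int base,
                                 unsigned int cap)
{
    unsigned int delay = base;

    if (delay >= cap)
        return cap;
    while (attempt-- > 0) {
        if (delay > cap / 2)    /* doubling would pass the cap */
            return cap;
        delay *= 2;
    }
    return delay;
}

/* Hypothetical call site: sleep before restarting the worker. */
void wait_before_retry(unsigned int attempt)
{
    sleep(retry_delay_seconds(attempt, 1, 30));
}
```

A fixed short sleep would also stop the tight loop; the doubling just
keeps the first few retries quick while the database is coming back up.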


- Niels


Here is the log.

2006-11-29 20:51:41 CET DEBUG2 slon_retry() from pid=10220
2006-11-29 20:51:41 CET DEBUG1 slon: retry requested
2006-11-29 20:51:41 CET DEBUG2 slon: notify worker process to shutdown
2006-11-29 20:51:41 CET DEBUG2 slon: worker process created - pid = 10220
2006-11-29 20:51:41 CET DEBUG2 slon: child terminated status: 0; pid:
10220, current worker pid: 10220
2006-11-29 20:51:41 CET DEBUG1 slon: restart of worker
2006-11-29 20:51:41 CET CONFIG main: slon version 1.2.1 starting up
2006-11-29 20:51:41 CET DEBUG2 slon: watchdog process started
2006-11-29 20:51:41 CET DEBUG2 slon: watchdog ready - pid = 9712
2006-11-29 20:51:41 CET FATAL  main: Cannot connect to local database -
could not connect to server: Connection refused
        Is the server running on host "localhost" and accepting
        TCP/IP connections on port 3000?

2006-11-29 20:51:41 CET DEBUG2 slon_retry() from pid=10221
2006-11-29 20:51:41 CET DEBUG1 slon: retry requested
2006-11-29 20:51:41 CET DEBUG2 slon: notify worker process to shutdown
2006-11-29 20:51:41 CET DEBUG2 slon: worker process created - pid = 10221
2006-11-29 20:51:41 CET DEBUG2 slon: child terminated status: 0; pid:
10221, current worker pid: 10221

2006-11-29 20:51:41 CET DEBUG1 slon: restart of worker
2006-11-29 20:51:41 CET CONFIG main: slon version 1.2.1 starting up
2006-11-29 20:51:41 CET DEBUG2 slon: watchdog process started
2006-11-29 20:51:41 CET DEBUG2 slon: watchdog ready - pid = 9712
2006-11-29 20:51:41 CET FATAL  main: Cannot connect to local database -
could not connect to server: Connection refused
        Is the server running on host "localhost" and accepting
        TCP/IP connections on port 3000?

2006-11-29 20:51:41 CET DEBUG2 slon_retry() from pid=10222
2006-11-29 20:51:41 CET DEBUG1 slon: retry requested
2006-11-29 20:51:41 CET DEBUG2 slon: notify worker process to shutdown
2006-11-29 20:51:41 CET DEBUG2 slon: worker process created - pid = 10222
2006-11-29 20:51:41 CET DEBUG2 slon: child terminated status: 0; pid:
10222, current worker pid: 10222
2006-11-29 20:51:41 CET DEBUG1 slon: restart of worker
2006-11-29 20:51:41 CET CONFIG main: slon version 1.2.1 starting up
2006-11-29 20:51:41 CET FATAL  slon: sched_wakeuppipe create failed -(24)
Too many open files
2006-11-29 20:51:41 CET DEBUG2 slon: exit(-1)





