This bug was introduced in bfa8e601fe7ba1bd91a053901426d4f7195c53a0 (2.1.0) and 60566590d683b85733404ef290e6c1823c4c014c (2.0.5).

If a failover command is executed while the slon for the backup node (say node 2) is not running, the most-ahead node (say node 3) will have a FAILOVER_SET event generated with ev_origin=1 (the failed node). For the failover to finish, that event needs to be processed on node 2.

When the slon for node 2 is later started, it sees that no_active=false for node 1 in sl_node (this change was made in the commits referenced above). Since the node is inactive, no remoteWorkerThread_1 is started, so the slon for node 2 will never process the FAILOVER_SET event, because that event has ev_origin=1.

As a workaround, if you get into this situation you can manually (with psql) set no_active=true for the failed node on node 2, then start the slon for node 2. It will then have a remoteWorkerThread_1 and will process the FAILOVER_SET event.

Longer term, we probably need to split a node's inactive status as used for rebuilding listen paths and waiting from its use in deciding whether to start slon worker threads?
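A sketch of the workaround described above, run with psql against node 2's database. The cluster name "replication" (and hence the `_replication` schema) and the node id 1 for the failed node are placeholders; substitute your own cluster name and node ids. This assumes the usual sl_node layout with `no_id` and `no_active` columns.

```sql
-- On node 2's database, before starting its slon:
-- mark the failed node (here node 1) active again so that
-- a remoteWorkerThread_1 gets started and can consume the
-- pending FAILOVER_SET event with ev_origin=1.
UPDATE _replication.sl_node
   SET no_active = true
 WHERE no_id = 1;
```

After this, start the slon for node 2 and let the failover complete.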