Currently, if a server is killed, for instance, the client receives a COMM_FAILURE; the transport is then closed down but the GIOP layer is kept 'open'. The problem with this scenario is that a ClientMessageReceptor thread is left stuck in the receive-messages loop. Of course, if the client object is _release'd, this goes away. If the GIOP connection is instead closed in ClientGIOPConnection::closeAllowReopen, i.e. close() is called (after being called from streamClosed), then the thread is released back into the pool and both the transport and the GIOP layer are closed down. This does not rule out calling the client object again: the next call should simply attempt to contact the server again, reopening the connection/transport if required. There appears to be a trade-off here between object creation/GC and threads. If the client does not call _release, many threads end up locked waiting, which can exhaust the thread pool.
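To make the thread pool exhaustion concrete, here is a minimal sketch of the effect described above. It does not use JacORB at all: FakeConnection, its methods, and the pool size of 2 are assumptions made up for illustration. Each receptor parks on a pool thread until its GIOP layer is closed, so a third connection cannot get a receptor thread until one of the first two is closed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ReceptorPoolSketch {

    /** Hypothetical stand-in for a client GIOP connection (not a JacORB class). */
    static final class FakeConnection {
        final CountDownLatch receptorStarted = new CountDownLatch(1);
        private final CountDownLatch giopClosed = new CountDownLatch(1);

        /** Receptor loop: occupies a pool thread until the GIOP layer is closed. */
        void receiveMessages() {
            receptorStarted.countDown();
            try {
                giopClosed.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        /** Closing the GIOP layer lets the receptor hand its thread back to the pool. */
        void close() {
            giopClosed.countDown();
        }
    }

    /** Returns { third receptor started before a close, third receptor started after a close }. */
    static boolean[] run() {
        ExecutorService pool = Executors.newFixedThreadPool(2); // assumed pool size
        FakeConnection a = new FakeConnection();
        FakeConnection b = new FakeConnection();
        FakeConnection c = new FakeConnection();
        try {
            // Two connections whose servers died but whose GIOP layer stays open.
            pool.execute(a::receiveMessages);
            pool.execute(b::receiveMessages);
            a.receptorStarted.await();
            b.receptorStarted.await();

            // A third connection is queued behind them and cannot start.
            pool.execute(c::receiveMessages);
            boolean before = c.receptorStarted.await(200, TimeUnit.MILLISECONDS);

            // Closing one connection frees a thread for the queued receptor.
            a.close();
            boolean after = c.receptorStarted.await(5, TimeUnit.SECONDS);
            return new boolean[] { before, after };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            a.close();
            b.close();
            c.close();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("third receptor ran while pool was exhausted: " + r[0]);
        System.out.println("third receptor ran after one connection closed: " + r[1]);
    }
}
```

In the real ORB the receptor blocks on the transport rather than on a latch, but the consequence for the pool is the same: a thread is held for as long as the connection object stays open.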
Fixed on master by 0b9f406a288dcba2fe3cd4a6bcadcf8c412a9559 and on branch by 8ce56ede8b404bf6c37e21fa31c4ac55c9976853.
Created attachment 456: example of failure raised by this bug fix

The attached patch modifies the hello demo to illustrate a negative side effect of the current fix for this bug. Given a server listening on a specific endpoint, a client will hang indefinitely if the server dies and is restarted while the client is still running.

It seems that when "disconnect_after_systemexception" is true, the thread in GIOPConnection.receiveMessages exits, but later, after a new connection is established, no thread re-enters receiveMessages. The server's reply message is therefore never received and the client hangs.

My test has to be run by hand, using separate command lines for client and server:

    cd demo/hello
    jaco -cp target/classes org.jacorb.demo.hello.Server hello.ior

and

    cd demo/hello
    jaco -cp target/classes org.jacorb.demo.hello.Client hello.ior

While the client is sleeping, kill and restart the server. Observe that the client hangs after sending the request.

Note that uncommenting Client.java line 43:

    // p.setProperty ("jacorb.connection.client.disconnect_after_systemexception", "false");

allows the client to recover.
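The suspected mechanism can be sketched without JacORB. Everything in the block below (Conn, serverDied, reopen, invoke, the restartReceptorOnReopen flag) is hypothetical and only models the behaviour described in this comment; it is not the actual JacORB code or the eventual fix. A queue stands in for the transport: the receptor thread exits when the stream closes, and unless a new receptor is started on reopen, the reply from the restarted server is never read.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class ReconnectSketch {

    /** Hypothetical client connection; all names are illustrative, not JacORB's. */
    static final class Conn {
        private static final String STREAM_CLOSED = "<stream closed>";
        private final BlockingQueue<String> replies = new LinkedBlockingQueue<>();
        private final boolean restartReceptorOnReopen;
        private BlockingQueue<String> wire = new LinkedBlockingQueue<>();
        private Thread receptor;

        Conn(boolean restartReceptorOnReopen) {
            this.restartReceptorOnReopen = restartReceptorOnReopen;
            startReceptor();
        }

        /** Starts a thread that forwards messages from the current transport. */
        private void startReceptor() {
            final BlockingQueue<String> in = wire; // bound to this transport only
            receptor = new Thread(() -> {
                try {
                    String msg;
                    while (!(msg = in.take()).equals(STREAM_CLOSED)) {
                        replies.put(msg);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            receptor.setDaemon(true);
            receptor.start();
        }

        /** Server killed: the stream closes and the receptor thread exits. */
        void serverDied() throws InterruptedException {
            wire.put(STREAM_CLOSED);
            receptor.join();
        }

        /** Server restarted: a fresh transport is opened. */
        void reopen() {
            wire = new LinkedBlockingQueue<>();
            if (restartReceptorOnReopen) {
                startReceptor();
            }
        }

        /** Simulates the restarted server replying; null means the client timed out. */
        String invoke(String request, long timeoutMillis) throws InterruptedException {
            wire.put("reply to " + request);
            return replies.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        }
    }

    /** Runs the kill/restart scenario and returns the reply, or null on timeout. */
    static String demo(boolean restartReceptorOnReopen) {
        try {
            Conn conn = new Conn(restartReceptorOnReopen);
            conn.serverDied();
            conn.reopen();
            return conn.invoke("hello", restartReceptorOnReopen ? 5000 : 300);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Calling demo(false) models the hang (the sketch uses a timeout only so that it terminates), while demo(true) shows the reply arriving once something re-enters the receive loop after the reconnect.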
Reopening, as there is still an issue here.
Discussing with Phil: this might need a patch in ClientConnectionManager.
Fix 932cf57c941d6794e180fd69272fbfc46b38d5d9 applied.