| Summary: | RTT is not working as expected when connection isn't established | ||
|---|---|---|---|
| Product: | JacORB | Reporter: | Peter.Nikol |
| Component: | ORB | Assignee: | Gerald Brose <gerald.brose> |
| Status: | NEW --- | ||
| Severity: | critical | CC: | conny.krappatsch, jacorb, robert.bienias, rolfgall, rudolf.visagie, ze_corsaire |
| Priority: | P1 | ||
| Version: | 2.2.2 | ||
| Hardware: | PC | ||
| OS: | All | ||
| Attachments: |
Example Application to show the multiplication effect of the timeouts
patch for correct client timeouts (RTT)
short description of the patch
Improved patch (don't use the former one - take this instead).
Improved patch for client timeouts (RTT)
Improved patch for client timeouts (RTT) [Java source files]
Improved patch for client timeouts (RTT)
Recent corbaping test software |
||
|
Description
Peter.Nikol
2007-07-04 13:52:34 UTC
Created attachment 315 [details]
Example Application to show the multiplication effect of the timeouts
Using the example application:
console 1: java server ping.ior
console 2: java PingClient ping.ior
console 1: Control-C
At the end of each console 2 output line there is a number giving the latency in ms of one invocation of the operation doSimplePing. PingClient runs 4 clients in parallel, each calling the Ping interface. The latencies are as expected while the server hosting the Ping implementation is running. After the server application is stopped, the output changes to something like "retries exceeded ..." and the number becomes something like N * CONNECTION_TIMEOUT * RETRIES. With the given jacorb_properties file and the 4 threads used in the current PingClient application, you can see effective latencies of up to 4 * 500 * 5 = 10 000 ms, and sometimes more.
Created attachment 325 [details]
patch for correct client timeouts (RTT)
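The multiplication effect reported in the description can be checked with a few lines of Java. This is only an illustrative sketch: the class and method names are hypothetical and not part of JacORB, and it assumes, as the formula N * CONNECTION_TIMEOUT * RETRIES in the report suggests, that the parallel callers end up waiting one after another while each exhausts its connection retries.

```java
// Hypothetical helper (not JacORB code): worst-case latency when callers
// queue behind one another and each exhausts its connection retries.
final class TimeoutMultiplication {
    static long worstCaseLatencyMillis(int threads, long connectTimeoutMillis, int retries) {
        return (long) threads * connectTimeoutMillis * retries;
    }

    public static void main(String[] args) {
        // Values from the report: 4 client threads, 500 ms timeout, 5 retries.
        System.out.println(worstCaseLatencyMillis(4, 500, 5) + " ms"); // prints "10000 ms"
    }
}
```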
Created attachment 326 [details]
short description of the patch
*** Bug 549 has been marked as a duplicate of this bug. ***
Created attachment 340 [details]
Improved patch (don't use the former one - take this instead).
Bug: #787
Subject: Improved patch for client timeouts (RTT)
This patch is an enhancement of Peter Nikol's last patch and includes his latest changes.
Test case:
--------------------------
Description:
We want to send a blocking CORBA call with a timeout.
If the call cannot be completed within the specified time, an org.omg.CORBA.TIMEOUT exception is thrown.
Used CORBA interface:
interface Ping
{
void doSimplePing(in long timeoutMillis);
};
The client calls the method 'doSimplePing(500)'.
The server receives the request and delays the response for 500ms.
Example:
1. doSimplePing(500), request timeout 250ms
--> org.omg.CORBA.TIMEOUT
2. doSimplePing(500), request timeout 700ms
--> no timeout, call returned within 700ms
3. doSimplePing(5000), request timeout 7000ms
--> no timeout, call returned within 7000ms
With this test application it is possible to trigger timeouts explicitly.
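The expected behaviour in the examples above can be imitated without an ORB. The following is a minimal sketch using only java.util.concurrent: the servant is replaced by a local method that sleeps, and a boolean result stands in for org.omg.CORBA.TIMEOUT. The class name and helper methods are assumptions made for illustration; the real test application is the one in the attachments.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Local simulation only: no CORBA classes are involved.
final class PingTimeoutSketch {
    /** Stand-in for the remote servant: delays the reply by delayMillis. */
    static void doSimplePing(long delayMillis) throws InterruptedException {
        Thread.sleep(delayMillis);
    }

    /** Returns true if the call completed in time, false if it timed out. */
    static boolean callWithTimeout(long delayMillis, long timeoutMillis) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Future<?> reply = worker.submit(() -> { doSimplePing(delayMillis); return null; });
            reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // a real client would see org.omg.CORBA.TIMEOUT here
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            worker.shutdownNow(); // interrupts the simulated servant if still sleeping
        }
    }

    public static void main(String[] args) {
        System.out.println("delay 500, timeout 250, completed: " + callWithTimeout(500, 250));
        System.out.println("delay 500, timeout 700, completed: " + callWithTimeout(500, 700));
    }
}
```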
Test procedure 1:
- Start the server application on machine 1
- Copy the ping.ior from the server to the client
- Start the client application on machine 2
--> Test behaviour is as described in the example above
- Unplug the network cable
--> All CORBA requests lead to an org.omg.CORBA.TIMEOUT exception or
to an org.omg.CORBA.COMM_FAILURE exception.
- Plug in the network cable
--> After catching an org.omg.CORBA.COMM_FAILURE exception the application
reinitializes the connection by rereading the IOR file.
Test procedure 2:
- Start the server application on machine 1
- Copy the ping.ior from the server to the client
- Start the client application on machine 2
--> Test behaviour is as described in the example above
- Unplug the network cable
--> All CORBA requests lead to an org.omg.CORBA.TIMEOUT exception or
to an org.omg.CORBA.COMM_FAILURE exception.
- Restart the server application (a new port is used)
- Copy the ping.ior from the server to the client
- Plug in the network cable
--> After catching an org.omg.CORBA.COMM_FAILURE exception the application
reinitializes the connection by rereading the IOR file.
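The recovery step in both test procedures (reinitialize only after COMM_FAILURE, not after TIMEOUT) could look roughly like the sketch below. It is an assumption-laden illustration, not the test client itself: CommFailure and Timeout are local stand-ins for org.omg.CORBA.COMM_FAILURE and org.omg.CORBA.TIMEOUT, and the Supplier stands in for "reread ping.ior and obtain a fresh object reference".

```java
import java.util.function.Supplier;

// Hypothetical client-side recovery logic; all names are illustrative.
final class ReconnectSketch {
    static final class CommFailure extends RuntimeException {}
    static final class Timeout extends RuntimeException {}

    interface Ping { void doSimplePing(int delayMillis); }

    private final Supplier<Ping> resolver; // stands in for rereading the IOR file
    private Ping target;

    ReconnectSketch(Supplier<Ping> resolver) {
        this.resolver = resolver;
        this.target = resolver.get();
    }

    /** One call attempt; returns "ok", "timeout" or "reconnected". */
    String pingOnce(int delayMillis) {
        try {
            target.doSimplePing(delayMillis);
            return "ok";
        } catch (Timeout e) {
            return "timeout";        // slow reply: keep the current reference
        } catch (CommFailure e) {
            target = resolver.get(); // broken connection: resolve the reference again
            return "reconnected";
        }
    }

    public static void main(String[] args) {
        Ping dead = d -> { throw new CommFailure(); };
        Ping alive = d -> { };
        int[] resolved = {0};
        ReconnectSketch client = new ReconnectSketch(() -> resolved[0]++ == 0 ? dead : alive);
        System.out.println(client.pingOnce(500)); // prints "reconnected"
        System.out.println(client.pingOnce(500)); // prints "ok"
    }
}
```

The point of the distinction is that a TIMEOUT alone gives the application no reason to reread the IOR, which is why the patch makes sure a broken connection surfaces as COMM_FAILURE.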
Solved problems:
----------------
- Timeout problem as Peter Nikol described in bug #787
- If the request timeout was smaller than the connection timeout,
an org.omg.CORBA.TIMEOUT exception was thrown.
If the server was restarted in the meantime and its port number had changed,
the application never received an org.omg.CORBA.COMM_FAILURE exception
when the timeout value was small.
While the Executor detected the org.omg.CORBA.COMM_FAILURE, the application
received only an org.omg.CORBA.TIMEOUT, which does not indicate a
defective connection.
So there was no trigger to reread the IOR file and create a new connection.
---> Therefore a connection that has raised an org.omg.CORBA.COMM_FAILURE exception
is now marked as invalid.
If the client tries to send a CORBA call through the same connection,
an org.omg.CORBA.COMM_FAILURE exception
(instead of org.omg.CORBA.TIMEOUT) is thrown immediately.
- Sporadic deadlocks after unplugging the network cable
(Patch Marc Heide, bug #708)
- While a connection is invalid,
it is not handed out to client threads on request.
In this case the ClientConnectionManager creates a new connection with
the same profile and returns it.
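The idea of marking a connection invalid, failing fast on it, and never handing it out again can be sketched as follows. This is not the patched ClientConnectionManager: the classes, the String profile key and the IllegalStateException (standing in for org.omg.CORBA.COMM_FAILURE) are simplifications assumed for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical pool; the real patch works on JacORB's own classes.
final class ConnectionPoolSketch {
    static final class Connection {
        final String profile;
        private int clients;
        private boolean invalid;

        Connection(String profile) { this.profile = profile; }

        void markInvalid() { invalid = true; }
        boolean isInvalid() { return invalid; }

        /** Fails immediately on an invalid connection instead of waiting for a timeout. */
        void send() {
            if (invalid) throw new IllegalStateException("COMM_FAILURE: connection is invalid");
        }
    }

    private final Map<String, Connection> pool = new HashMap<>();

    /** Returns a pooled connection only if it is still valid; otherwise creates a new one. */
    synchronized Connection getConnection(String profile) {
        Connection c = pool.get(profile);
        if (c == null || c.isInvalid()) {
            c = new Connection(profile);
            pool.put(profile, c);
        }
        c.clients++;
        return c;
    }

    /** Drops the connection from the pool when it is unused or invalid. */
    synchronized void releaseConnection(Connection c) {
        c.clients--;
        if (c.clients == 0 || c.isInvalid()) {
            pool.remove(c.profile, c); // removes only if c is still the pooled entry
        }
    }

    public static void main(String[] args) {
        ConnectionPoolSketch pool = new ConnectionPoolSketch();
        Connection first = pool.getConnection("host:1234");
        first.markInvalid();
        Connection second = pool.getConnection("host:1234");
        System.out.println(first != second); // prints "true"
    }
}
```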
Modified classes:
- ClientConnection
- ClientConnectionManager
- ClientGIOPConnection
- GIOPConnection
- ClientIIOPConnection
- Delegate
The changes are based on JacORB release 2.3.0
Description of the patch:
-------------------------
- ClientConnection:
Method 'isConnectionInvalid()' has been added to check
whether the connection is still valid.
- ClientConnectionManager:
1. The method 'getConnection(org.omg.ETF.Profile profile)' checks
whether a VALID connection with the specified profile already exists
in the connection pool.
Otherwise a new connection is created and returned.
2. The method 'releaseConnection(ClientConnection connection)' checks
whether the connection can be closed and removed from the connection pool.
It can be closed if no client is using this connection any more
[connection.decClients()]
or if the connection is invalid [connection.isConnectionInvalid()].
An invalid connection is removed from
the connection pool.
- ClientGIOPConnection:
In method 'closeAllowReopen()', acquiring the write lock has been moved before
the 'synchronized(connect_sync)' statement to avoid deadlocks.
(Bug #759, Richard Ridgeway)
- GIOPConnection:
1. The write lock implementation [getWriteLock() and releaseWriteLock()]
has been changed, because we assume that holding a write lock and
requesting it a second time in the same thread can lead to deadlocks.
2. If an 'org.omg.CORBA.COMM_FAILURE' exception has occurred,
the connection is marked as invalid.
The next time this connection is used, an 'org.omg.CORBA.COMM_FAILURE'
exception is thrown immediately.
3. Closing the connection in the methods 'getMessage()' and
'sendMessage(MessageOutputStream out)' after an error has occurred
has been removed, following the patch from Marc Heide (Bug #708).
- ClientIIOPConnection:
A 'java.net.ConnectException' in method
'connect(org.omg.ETF.Profile server_profile, long time_out)' is caught as
an 'IOException' and is rethrown as an 'org.omg.CORBA.COMM_FAILURE' exception.
- Delegate:
1. The 'bind_sync' object has been replaced by a 'ReentrantLock' object from
the concurrent package.
With this object it is possible to set a timeout for blocking on the lock.
(Enhanced patch from Peter Nikol)
2. Sending CORBA messages [method 'invoke_internal(...)'] has been decoupled
from the application thread.
For this purpose an 'Executor' object is used,
which sends the messages in its own thread.
For each Delegate object one sending thread (Executor) is used.
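The two Delegate changes described above (a lock that can be waited on with a timeout, and sending from a separate thread) can be sketched with standard java.util.concurrent classes. This is a simplified illustration under stated assumptions, not the patched Delegate: the class, the CallTimedOut exception (standing in for org.omg.CORBA.TIMEOUT) and the single overall deadline are choices made for the example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; names do not match JacORB's Delegate.
final class DelegateSketch {
    static final class CallTimedOut extends RuntimeException {}

    private final ReentrantLock bindLock = new ReentrantLock();
    // One daemon sending thread per instance, as described for the patch.
    private final ExecutorService sender = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "sender");
        t.setDaemon(true);
        return t;
    });

    <T> T invoke(Callable<T> request, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            // Step 1: wait for the bind lock, but never longer than the timeout.
            if (!bindLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                throw new CallTimedOut();
            }
            try {
                // Binding to the target would happen here.
            } finally {
                bindLock.unlock();
            }
            // Step 2: send from the sender thread and wait only for the remaining time.
            Future<T> reply = sender.submit(request);
            try {
                return reply.get(deadline - System.nanoTime(), TimeUnit.NANOSECONDS);
            } catch (TimeoutException e) {
                reply.cancel(true); // interrupt the sender so it can take the next request
                throw new CallTimedOut();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        DelegateSketch delegate = new DelegateSketch();
        System.out.println(delegate.invoke(() -> "pong", 1000));
        try {
            delegate.invoke(() -> { Thread.sleep(500); return "late"; }, 50);
        } catch (CallTimedOut e) {
            System.out.println("timed out");
        }
    }
}
```

Because the application thread only waits on the lock and on the Future, it regains control once the timeout expires even if the sending thread is still blocked on the network.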
We hope that our bugfixes are helpful for the JacORB community and will find their way into the next major release.
We tested the bugfixes carefully. Please understand that the fixes come without any guarantee.
Created attachment 359 [details]
Improved patch for client timeouts (RTT)
This is the attachment (diff files) for changes described in comment 8
Created attachment 360 [details]
Improved patch for client timeouts (RTT) [Java source files]
This is the attachment (Java source files) for changes described in comment 8
Created attachment 382 [details]
Improved patch for client timeouts (RTT)
Improved patch for client timeouts (RTT).
A thread pool is used for CORBA timeout calls. [Text diff files]
[Emailed submitter on 11/11/11 requesting they test CVS head and whether their changes are still necessary (given OCI's NIO addition) or whether only a subset of the changes is required]
(In reply to comment #12)
> [Emailed submitter on 11/11/11 requesting they test CVS head and whether their
> changes are still necessary (given OCI's NIO addition) or whether only a subset of the
> changes is required]
Hello Nick, sorry for the delayed answer. We are interested in the fixes for this bug and will test the current head. Due to heavy project load, we will not be able to start immediately. Our intention is to have first test results by the end of November. Thanks, Peter
Have you made any progress testing the new version?
Created attachment 394 [details]
Recent corbaping test software
Inside this zip file there is a Readme.txt that you should follow.
We are currently experiencing timeout problems using JacORB 3.0. Can you confirm whether this bug has been addressed or not? This is just so I don't waste a lot of time trying to determine if our issue exactly matches this one.
I have not rerun the tests on the current 3.6.1 version. It would be useful to verify if it is still an issue.
@Rudolf Visagie : if you have a reproducible test case that would be helpful
(In reply to Nick Cross from comment #18)
> I have not rerun the tests on the current 3.6.1 version. It would be useful
> to verify if it is still an issue.
> @Rudolf Visagie : if you have a reproducible test case that would be helpful
Unfortunately we are having the problems in our production environment under special conditions and therefore do not have a reproducible test case. I'm not even sure that it matches this bug. Does this bug affect the request reply timeout after a connection has already been established (jacorb.connection.client.pending_reply_timeout), or only timeouts when establishing a connection? If it's only when establishing connections, it's definitely not related to our problem.
@Rudolf : I believe Peter has put a description of the problem in the ticket. If you feel it doesn't meet your criteria please enter a separate ticket. |