[QFJ-444] Logout logic problem Created: 16/Jun/09 Updated: 22/Nov/12 Resolved: 12/Nov/12 |
|
Status: | Resolved |
Project: | QuickFIX/J |
Component/s: | Engine |
Affects Version/s: | 1.4.0 |
Fix Version/s: | 1.5.3 |
Type: | Bug | Priority: | Default |
Reporter: | Stephan | Assignee: | Christoph John |
Resolution: | Fixed | Votes: | 6 |
Labels: | None |
Issue Links: |
|
Description |
The problem happens when using QuickFIX as an Acceptor. Symptoms: QuickFIX initiates a logout on its own. I've traced the problem to a Boolean in SessionState.logoutReceived that doesn't get reset to false. Let me explain exactly what happened: When the user initiates the logout (the first time), at some point the method Session.nextLogout(Message logout) is executing. In my case, inside Session.nextLogout(Message logout), after we have generated the logout message and before we call state.setLogoutReceived(true), the client closes the connection. Because of this, the "SocketAcceptorIoProcessor-1.0" thread calls Session.disconnect() from AbstractIoHandler.sessionClosed(), executes the disconnect() method fully and returns. Then the "QF/J Session dispatcher" thread continues to execute and calls state.setLogoutReceived(true). When the "QF/J Session dispatcher" thread then calls disconnect(), the responder has already been set to null and the logic ends there. Therefore the thread never reaches the code that calls state.setLogoutSent(false). The conclusion is that the Boolean logoutSent remains true in the SessionState and, because of this, at the next login the session will log out automatically after 2 seconds. To recap, the main problem is that the state machine that is the Session is not well controlled, and because of the interaction of several threads, some unforeseen things can happen. This is not a synchronization problem but rather a logical problem. As I'm in a hurry, I've just synchronized all the methods of the Session so that only one thread at a time can access it, and my problems went away, but there must be a better solution. A last word: there was another circumstance in which things went wrong at the end, so the example above is just a symptom of a larger problem. |
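The interleaving described above can be replayed step by step on a single thread. The following is a minimal sketch, not the QuickFIX/J code: ToySession and LogoutRaceReplay are hypothetical names, and the flag handling is a simplified assumption that only mirrors the description. In this toy model the stale flag is logoutReceived; in the real engine, which flags stay set depends on where exactly the two threads interleave.

```java
// Simplified, hypothetical model of the flags discussed in this issue.
// It is NOT the QuickFIX/J Session class; names only mirror the description.
class ToySession {
    boolean logoutSent;
    boolean logoutReceived;
    Object responder = new Object(); // stands in for the network connection

    // Mirrors the early return described above: no responder, no cleanup.
    void disconnect() {
        if (responder == null) {
            return;
        }
        responder = null;
        logoutSent = false;
        logoutReceived = false;
    }
}

class LogoutRaceReplay {
    // Replays the reported interleaving in a fixed order on one thread.
    static ToySession replay() {
        ToySession s = new ToySession();
        // Dispatcher thread: the logout reply has been generated.
        s.logoutSent = true;
        // I/O thread: the client closed the socket, disconnect() runs fully.
        s.disconnect();
        // Dispatcher thread resumes and records the received logout...
        s.logoutReceived = true;
        // ...then calls disconnect(), which returns early without cleanup.
        s.disconnect();
        return s;
    }
}
```

After the replay, the session is disconnected but still carries a flag from the previous connection, which is the kind of leftover state the report blames for the unexpected logout.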
Comments |
Comment by Horia Muntean [ 09/Feb/10 ] |
Yes, I have seen this issue as well. QuickFIX initiates a logout after 2 seconds because the default value of LogoutTimeout is 2 seconds (there is also a log entry, 'Timed out waiting for logout response', in the acceptor session log) and of course because the session state is broken. I agree that as long as there are multiple threads involved, synchronizing Session will not really solve the issue. Maybe this issue deserves a higher priority as well? |
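Why a stale flag leads to a logout after exactly the timeout can be illustrated with a small check. This is a sketch under assumptions: ToyLogoutTimer and its parameters are hypothetical, and the condition is a plausible reading of the behavior described in this comment, not a copy of the engine's code.

```java
// Hypothetical helper; it only illustrates the timeout condition discussed here.
class ToyLogoutTimer {
    // True when a Logout counts as sent and nothing has been received
    // for at least timeoutMs milliseconds.
    static boolean logoutTimedOut(boolean logoutSent, long nowMs,
                                  long lastReceivedMs, long timeoutMs) {
        return logoutSent && (nowMs - lastReceivedMs) >= timeoutMs;
    }
}
```

With a leftover logoutSent of true, such a check would fire as soon as the timeout elapses on the new connection, even though no Logout was sent on it.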
Comment by Eric Deshayes [ 29/Apr/11 ] |
The session state is based on multiple boolean fields:
If we want to fix the issue properly, we would have to introduce a state machine so that transitions between states can be done atomically. Eric |
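One way to picture the state machine idea is a single state value that changes by compare-and-set. This is a minimal sketch, not a proposed patch: ToyState, ToyStateMachine, and the chosen set of states are assumptions made for illustration only.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical states; the issue does not say which states a real design would use.
enum ToyState { DISCONNECTED, LOGGED_ON, LOGOUT_SENT, LOGOUT_RECEIVED }

class ToyStateMachine {
    private final AtomicReference<ToyState> state =
            new AtomicReference<>(ToyState.DISCONNECTED);

    // Moves from 'expected' to 'next' only if no other thread changed the state first.
    boolean transition(ToyState expected, ToyState next) {
        return state.compareAndSet(expected, next);
    }

    // Disconnect is allowed from any state and always lands in DISCONNECTED.
    void disconnect() {
        state.set(ToyState.DISCONNECTED);
    }

    ToyState current() {
        return state.get();
    }
}
```

In this sketch, a dispatcher thread that tries to move from LOGGED_ON to LOGOUT_RECEIVED after another thread has already disconnected simply fails the transition, so no flag from the old connection can survive.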
Comment by James Olsen [ 05/Jun/11 ] |
I believe the same symptoms can occur for an Initiator when both ends are scheduled to disconnect at the same time. The QFJ Initiator will send a Logout, which the other party may not respond to before disconnecting if it has already commenced its disconnect process. After the QFJ Initiator reconnects, it is still expecting the Logout response, does not receive it, and so disconnects again. |
Comment by Mate Varga [ 24/Oct/11 ] |
Same here - left the engine running for the weekend and on Monday morning the initiator cannot log in. |
Comment by Mate Varga [ 18/Nov/11 ] |
Guys, this issue practically renders quickfix unusable for production environments. If the client logs out, it simply cannot log in again without restarting the process. If you could at least provide a workaround for that... |
Comment by Christoph John [ 12/Oct/12 ] |
I think we all agree that the session state handling should be overhauled for 1.6.0. In the meantime, could state.setLogoutSent(false); get set in a finally block so that the mentioned problem (the logoutSent flag does not get reset to false because disconnect() returns early) goes away? |
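The finally-block idea can be sketched as follows. ToyFinallySession is a hypothetical stand-in, not the actual Session class, and the fields are simplified assumptions; the point is only that a finally block runs even when the method returns early.

```java
// Hypothetical stand-in for the session; not the QuickFIX/J Session class.
class ToyFinallySession {
    boolean logoutSent;
    boolean logoutReceived;
    Object responder = new Object(); // stands in for the network connection

    void disconnect() {
        try {
            if (responder == null) {
                return; // the early return still passes through finally
            }
            responder = null;
        } finally {
            // Reset the flags on every path out of disconnect().
            logoutSent = false;
            logoutReceived = false;
        }
    }
}
```

With this shape, a second disconnect() call on an already disconnected session still clears the logout flags.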
Comment by Grant Birchmeier [ 16/Oct/12 ] |
That might be a good stopgap solution. I have no problem with it. |
Comment by Christoph John [ 12/Nov/12 ] |
After looking at the code, it turned out that there was a correction to this problem in 1.5.1 already. The flags logoutReceived and logoutSent get reset in the nextLogon() method. However, it probably does not hurt to set the correct state in the disconnect() method as well. That way the state is correct directly after the logout/disconnection. |
Comment by Christoph John [ 12/Nov/12 ] |
Committed as rev #1099. |