[QFJ-444] Logout logic problem Created: 16/Jun/09  Updated: 22/Nov/12  Resolved: 12/Nov/12

Status: Resolved
Project: QuickFIX/J
Component/s: Engine
Affects Version/s: 1.4.0
Fix Version/s: 1.5.3

Type: Bug Priority: Default
Reporter: Stephan Assignee: Christoph John
Resolution: Fixed Votes: 6
Labels: None

Issue Links:
Duplicate
is duplicated by QFJ-617 disconnect() might not clear logoutSe... Closed
Relates
is related to QFJ-718 Logout message sent after event "rese... Closed

 Description   

The problem happens when using QuickFIX as an Acceptor.

Symptoms:
A user logs in, waits 30 seconds and logout out.
One minute after, the same user logs in and 2 seconds after the login, QuickFix initiates a logout.

Why does QuickFIX initiate a log out? I've traced the problem to a Boolean in SessionState.logoutReceived that doesn't get reset to false.

Let me explain exactly what happened:

When the user initiates the logout (the fisrt time), at one time the method Session.nextLogout(Message logout) is executing.
There you do state.setLogoutReceived(true); and call disconnect() that will set state.setLogoutSent(false);
All this happens in the "QF/J Session dispatcher" thread.

In my case, in the Session.nextLogout(Message logout) method, after we have generated the logout message and before we set state.setLogoutReceived(true) , the client closes the connection. Because of this, the "SocketAcceptorIoProcessor-1.0" thread calls Session.disconnect() from AbstractIoHandler.sessionClosed(), executes the disconnect() method fully and returns. Then the "QF/J Session dispatcher" thread continues to execute and sets state.setLogoutReceived(true). When the "QF/J Session dispatcher" thread calls disconnect, the responder has been set to null and the logic ends there. Therefore the thread never reached the code that sets state.setLogoutSent(false)

The conclusion is that the Boolean logoutSent remains true in the SessionState and because of this, at the next login, the session will log out automatically after 2 seconds.

To recap, the main problem is that the state machine that is the Session is not well controlled and because of the interaction of several threads, some unforeseen things can happen. This is not a synchronization problem but rather a logical problem.

As I'm in a hurry, I've just synchronized all the methods of the Session so that only one thread at a time can access it and my problems went away, but there must be a better solution.

A last word, there was another circumstance when things went wrong at the end, so the example above is just a symptom of a larger problem.



 Comments   
Comment by Horia Muntean [ 09/Feb/10 ]

Yes, I have seen this issue as well.

QuickFIX initiates a logout after 2 seconds because the default value of LogoutTimeout is 2 seconds ( there is also a log entry: 'Timed out waiting for logout response' in the acceptor session log ) and of course because the session state is broken.

Agree that as long as there are multiple threads involved, synchronizing Session will not really solve the issue.

Maybe this issue deserves a higher priority as well?

Comment by Eric Deshayes [ 29/Apr/11 ]

The session state is based on multiple boolean fields:

  • Session.enabled
  • SessionState.logonSent
  • SessionState.logonReceived
  • SessionState.logoutSent
  • SessionState.logoutReceived

If we want to fix the issue properly, we would have to introduce a state machine and the transition between state could be done atomically.
The introduction of a State machine had already been discussed by QFJ developers some time ago (at least Steve, David and Laurent).
I do not see any easy fix without any small code change so this issue is deferred to a future release.

Eric

Comment by James Olsen [ 05/Jun/11 ]

I believe the same symptoms can occur for an Initiator when both ends are scheduled for disconnect at the same time. The QFJ Initiator will send a Logout which the other party may not respond to before disconnecting if it has already commenced it's disconnect process. After the QFJ Initiator reconnects it is still expecting the Logout response, does not receive it so disconnects again.

Comment by Mate Varga [ 24/Oct/11 ]

Same here - left the engine running for the weekend and on Monday morning the initiator cannot log in.

Comment by Mate Varga [ 18/Nov/11 ]

Guys, this issue practically renders quickfix unusable for production environments. If the client logs out, it simply cannot log in again without restarting the process. If you could at least provide a workaround for that...

Comment by Christoph John [ 12/Oct/12 ]

I think we all agree that the session state handling should be overhauled for 1.6.0.
For the time being, should we just make sure that the method Session.disconnect() assures that the flags

state.setLogoutSent(false);
state.setLogoutReceived(false);
state.setResetReceived(false);
state.setResetSent(false);

get set in a finally block so that the mentioned problem (flag logoutSent does not get set to false since disconnect() returns earlier) goes away?

Comment by Grant Birchmeier [ 16/Oct/12 ]

That might be a good stopgap solution. I have no problem with it.

Comment by Christoph John [ 12/Nov/12 ]

After looking at the code it turned out that there was a correction to this problem in 1.5.1 already. The flags logoutReceived and logoutSent get reset in the nextLogon() method.

However, it probably does not hurt the set the correct state in method disconnect(). That way the state is correct directly after the logout/disconnection.

Comment by Christoph John [ 12/Nov/12 ]

Committed as rev #1099.

Generated at Thu Oct 31 23:48:59 UTC 2024 using JIRA 7.5.2#75007-sha1:9f5725bb824792b3230a5d8716f0c13e296a3cae.