Details
Description
This is a continuation of QFJ-215.
I'm now seeing a new hang in IoSessionResponder.disconnect(), on the waitForScheduleMessagesToBeWritten() line.
It seems that one of the connecting sockets gets the underlying MINA framework into a "bad state", and IoSessionResponder.waitForScheduleMessagesToBeWritten() never exits the loop that waits for the messages to be written.
I have 2 acceptor sessions running with MINA 1.1.0, and both end up in this state after a few days of being up.
See the attached logs for the full stack dump and the output of the netstat command (on Ubuntu Linux).
The relevant portion of the stack trace is:
"SocketAcceptorIoProcessor-0.0" prio=1 tid=0x082427c0 nid=0x546e sleeping[0xb0c71000..0xb0c71f60]
at java.lang.Thread.sleep(Native Method)
at quickfix.mina.IoSessionResponder.waitForScheduleMessagesToBeWritten(IoSessionResponder.java:55)
at quickfix.mina.IoSessionResponder.disconnect(IoSessionResponder.java:43)
at quickfix.Session.disconnect(Session.java:1370)
at quickfix.mina.AbstractIoHandler.exceptionCaught(AbstractIoHandler.java:82)
It seems that the waitForScheduleMessagesToBeWritten() call added in rev 698 (http://quickfixj.svn.sourceforge.net/viewvc/quickfixj/trunk/core/src/main/java/quickfix/mina/IoSessionResponder.java?view=diff&r1=697&r2=698) is the cause.
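To illustrate the failure mode: the trace shows the sleep happening on the SocketAcceptorIoProcessor thread itself, so if that thread is also the one that flushes pending writes, the count being polled may never reach zero. One possible mitigation is to bound the wait. The sketch below is not the QuickFIX/J code; BoundedWait, waitUntilDrained and the IntSupplier standing in for the pending-write count are hypothetical names used only for illustration.

```java
import java.util.function.IntSupplier;

/** Hypothetical helper for illustration; not part of QuickFIX/J or MINA. */
final class BoundedWait {
    private BoundedWait() {}

    /**
     * Polls pendingWrites until it reports zero or timeoutMillis has elapsed.
     * Returns true if the count drained, false if the wait timed out.
     */
    static boolean waitUntilDrained(IntSupplier pendingWrites, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (pendingWrites.getAsInt() > 0) {
            if (System.nanoTime() - deadline >= 0) {
                // Give up so the caller can close the connection anyway.
                return false;
            }
            Thread.sleep(pollMillis);
        }
        return true;
    }
}
```

With something like this, disconnect() could log a warning and close the session when the wait returns false, rather than sleeping indefinitely.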
Attachments
Issue Links
| This issue relates to: | |
| QFJ-215 | QFJ deadlocks in Session.disconnect() code when a Windows client disconnects from Linux or Mac server |
Output of the 'netstat -an | grep 7001' command (7001 is our acceptor socket) on Linux.
The main 7001 socket is established, but all the incoming connections are hung.