[QFJ-34] Thread leak in socket connector Created: 09/Jul/06  Updated: 15/Feb/07  Resolved: 15/Jul/06

Status: Closed
Project: QuickFIX/J
Component/s: Networking
Affects Version/s: 1.0.0 Final
Fix Version/s: 1.0.2

Type: Bug Priority: Critical
Reporter: Brad Harvey Assignee: Steve Bate
Resolution: Fixed Votes: 0
Labels: None
Environment:

Originally observed in quickfix/j 1.0.0, jdk 1.4.2, mina 0.9.3.
Also Reproduced with quickfix/j trunk, jdk 1.5, mina 0.9.5 trunk.


Attachments: JPEG File Banzai thread leak.jpg    

 Description   

A few new threads are created by MINA each time an initiator connects to a remote host. Two of these (AnonymousIoService-x-y) are not cleaned up which can event lead to OutOfMemoryError or other strange behaviour.

This can easily be reproduced using Banzai and Executor. Change the SenderCompId in Banzai to BANZAI2 which will cause the executor to disconnect it after the logon. Start the executor and start Banzai in the debugger. As it reconnects you'll see three threads hanging around each time:
SocketConnector-0
AnonymousIoService-1-1
AnonymousIoService-1-2

The SocketConnector threads do eventually get cleaned up after an idle period, but the AnonymousIoService threads hang around forever.

Thread [SocketConnector-1] (Suspended)
WindowsSelectorImpl$SubSelector.poll0(long, int, int[], int[], int[], long) line: not available [native method]
WindowsSelectorImpl$SubSelector.poll() line: 275
WindowsSelectorImpl$SubSelector.access$400(WindowsSelectorImpl$SubSelector) line: 257
WindowsSelectorImpl.doSelect(long) line: 138
WindowsSelectorImpl(SelectorImpl).lockAndDoSelect(long) line: 69
WindowsSelectorImpl(SelectorImpl).select(long) line: 80
SocketConnector$Worker.run() line: 394
Thread [AnonymousIoService-2-3] (Suspended)
Object.wait(long) line: not available [native method]
LeaderFollowersThreadPool$Worker.waitForPromotion() line: 402
LeaderFollowersThreadPool$Worker.run() line: 309
Thread [AnonymousIoService-2-4] (Suspended)
Object.wait(long) line: not available [native method]
BlockingQueue(Object).wait() line: 474
BlockingQueue.waitForNewItem() line: 55
LeaderFollowersThreadPool$Worker.fetchRunnable() line: 339
LeaderFollowersThreadPool$Worker.run() line: 312



 Comments   
Comment by Brad Harvey [ 09/Jul/06 ]

I actually think this is a bug in MINA. The PooledThreadModel which is used by the default configuration creates a new ThreadPoolFilter which is what is creating these wayward threads. I don't think the destroy method of this filter ever gets called. If the PooledThreadModel used a ReferenceCountingIoFilter to wrap the ThreadPoolFilter before adding it to the chain then destroy would be called

I'm also not sure if quickfix should even be using the PooledThreadModel - I think the event handling strategies do this part instead. So you might be able to do something like this in IoSessionInitiator to use no threading:

public synchronized void connect() {
lastReconnectAttemptTime = SystemTime.currentTimeMillis();
try

{ IoConnector ioConnector = ProtocolFactory.createIoConnector(getNextSocketAddress()); BaseIoServiceConfig connectorConfig = (BaseIoServiceConfig) ioConnector.getDefaultConfig().clone(); connectorConfig.setThreadModel(ThreadModel.MANUAL); ConnectFuture connectFuture = ioConnector.connect(getNextSocketAddress(), ioHandler, connectorConfig); connectFuture.join(); ioSession = connectFuture.getSession(); }

catch (Throwable e)

{ quickfixSession.getLog().onEvent("Connection failed: " + e.getMessage()); }

}

It seems to work ok for Banzai stops all the extra thread creation, but I haven't tried it beyond that.

Comment by Brad Harvey [ 09/Jul/06 ]

Slightly off topic:

I think the PooledThreadModel/ThreadPoolFilter would take care of quickfix/j's thread pool needs as described in http://sourceforge.net/mailarchive/forum.php?thread_id=10923994&forum_id=103.

ThreadPoolFilter uses an internal queue (ThreadPoolFilter$SessionBuffer) per MINA session. Each queue can be processed by at most one worker thread from the thread pool at once, so each message would be processed in order. The EventHandlingStrategies would no longer be used as onMessage would be called by one of the worker threads.

The worker thread will process events on the queue until it is empty, and more events can be added while it is running. I think this means a busy Session could starve other sessions if the thread pool is too small so it may not be directly suitable as a SingleThreadedEventHandlingStrategy replacement - an alternate ThreadPoolFilter that only processes one event from the queue may be required. ThreadPerSession could be implemented by supplying an unbounded thread pool.

This all assumes that the ThreadPoolFilter gets cleaned up properly when the session is closed, of course.

Comment by Steve Bate [ 09/Jul/06 ]

Brad, thanks for investigating this issue. I'm traveling internationally right now but I'll review all your comments when I return. I definitely appreciate the help.

Comment by Steve Bate [ 15/Jul/06 ]

I added the code for your original suggestion. The other suggestion about using MINA thread pooling is interesting, but I'd like to wait until the 0.9.x API stabilizes (MINA 1.0 release) before I tie to the QFJ code into it very deeply. The changes are on the branch: QFJ_RELEASE_1_0_x (and merged to the trunk).

Comment by Brad Harvey [ 18/Jul/06 ]

Thanks Steve. Note that my code was a bit naughty in that it used the base class instead of the interface for IoServiceConfig.

Comment by Steve Bate [ 15/Feb/07 ]

I noticed some additional discussion about MINA thread models. It appears that Trustin and Peter want to remove the interface from MINA and recommend using the "manual" thread pool management (which we are doing). Brad (or anyone else), have you been following the discussion and are you familiar with the latest thoughts on th issue?

http://www.nabble.com/Is-ThreadModel-really-useful--t2778892.html

Comment by Brad Harvey [ 15/Feb/07 ]

Thanks for the link, no I hadn't been following it. Sounds like you'll just have to remove the setThreadModel calls and come back a full circle

To implement thread handling I think they're still recommending using a ThreadPoolFilter but adding it manually to the filter chain, as opposed to using the ThreadModel abstraction. So most of the "slightly off topic" stuff about ThreadPoolFilter above should still apply if anyone is after a thread pool in quickfix.

Cheers,
Brad.

Generated at Thu May 02 01:01:38 UTC 2024 using JIRA 7.5.2#75007-sha1:9f5725bb824792b3230a5d8716f0c13e296a3cae.