[QFJ-788] StackOverflowError still an issue when processing larger queues Created: 06/Jun/14  Updated: 02/Apr/15  Resolved: 06/Feb/15

Status: Closed
Project: QuickFIX/J
Component/s: Engine
Affects Version/s: 1.5.3
Fix Version/s: 1.6.0

Type: Bug
Priority: Major
Reporter: Andrzej Hajderek
Assignee: Christoph John
Resolution: Fixed
Votes: 0
Labels: None

Attachments: SessionTest.java.patch

 Description   

Hi,

QuickFIX/J regularly fails with a StackOverflowError when processing large queues (around 1000 messages or more, depending on the Java stack size).

A typical scenario:

  • Client sends a ResendRequest to server
  • Server is slow in delivering old messages, but keeps sending real-time messages quickly
  • The queue builds up quickly
  • After the server resends the last old message, the client starts processing the queue recursively and fails with a StackOverflowError.
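
The failure mode in the scenario above can be illustrated outside QuickFIX/J. The following is only a minimal sketch: the class QueueDrainSketch and all of its methods are hypothetical and are not part of the library. It assumes a default-sized thread stack and shows that a drain which recurses once per queued message runs out of stack on a large queue, while a loop over the same queue does not.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical illustration only; this is not QuickFIX/J code. */
public class QueueDrainSketch {

    /** Builds a queue holding the given number of placeholder messages. */
    public static Deque<String> filledQueue(int size) {
        Deque<String> queue = new ArrayDeque<>(size);
        for (int i = 0; i < size; i++) {
            queue.add("msg");
        }
        return queue;
    }

    /** Recursive drain: every queued message costs one more stack frame. */
    public static int drainRecursively(Deque<String> queue) {
        if (queue.poll() == null) {
            return 0;
        }
        return 1 + drainRecursively(queue);
    }

    /** Iterative drain: stack usage is constant regardless of queue size. */
    public static int drainIteratively(Deque<String> queue) {
        int processed = 0;
        while (queue.poll() != null) {
            processed++;
        }
        return processed;
    }

    /** Returns true if the recursive drain runs out of stack for this queue size. */
    public static boolean overflows(int size) {
        try {
            drainRecursively(filledQueue(size));
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int size = 2_000_000; // far deeper than a default thread stack is assumed to allow
        System.out.println("iterative drain processed: " + drainIteratively(filledQueue(size)));
        System.out.println("recursive drain overflowed: " + overflows(size));
    }
}
```

The queue size at which the recursive variant fails depends on the frame size and the configured stack size (-Xss), which matches the "around 1000 messages or more" observation: real message handling uses far larger frames than this toy method.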

This is reproducible with trunk revision #1187. Test code attached.

Regards,
Andrzej Hajderek



 Comments   
Comment by Christoph John [ 06/Jun/14 ]

Hi Andrzej, thanks for the good test, really appreciated. Saves a lot of time.

Comment by Andrzej Hajderek [ 06/Jun/14 ]

Please note that for smaller queues the test passes correctly. Only larger queue sizes cause problems.

The same issue can be reproduced in a real-world scenario, when a large backlog of messages (say 100k messages) is present on the server. The client is in the process of retrieving the entire backlog with a relatively small chunk size (say 500) and it is not very quick at processing the responses. It takes a certain amount of time to process the backlog (say 15 minutes). During that time the server keeps sending new real-time trades to the client. These new real-time trades get queued. Once the backlog has been processed, QuickFIX/J blows up with the StackOverflowError.

The test code I attached represents the minimal scenario for reproducing the issue effectively.

Comment by Christoph John [ 25/Dec/14 ]

Hi Andrzej, I have added a flag to the next(Message) method (via a new private overload, next(Message, boolean)) that indicates whether the queue is currently being processed. If it is, the nextQueued() method is not entered. That way, the nextQueued(Message, String) method is no longer called recursively.

# This patch file was generated by NetBeans IDE
# It uses platform neutral UTF-8 encoding and \n newlines.
--- HEAD
+++ Modified In Working Tree
@@ -875,10 +875,7 @@
         return state.getMessageStore();
     }
 
-    /**
-     * (Internal use only)
-     */
-    public void next(Message message) throws FieldNotFound, RejectLogon, IncorrectDataFormat,
+    private void next(Message message, boolean isProcessingQueuedMessages) throws FieldNotFound, RejectLogon, IncorrectDataFormat,
             IncorrectTagValue, UnsupportedMessageType, IOException, InvalidMessage {
 
         if (message == EventHandlingStrategy.END_OF_STREAM) {
@@ -1094,12 +1091,24 @@
             }
         }
 
+        // QFJ-788: prevent StackOverflow on large queue
+        if (!isProcessingQueuedMessages) {
         nextQueued();
         if (isLoggedOn()) {
             next();
         }
     }
+    }
 
+    /**
+     * (Internal use only)
+     */
+    public void next(Message message) throws FieldNotFound, RejectLogon, IncorrectDataFormat,
+            IncorrectTagValue, UnsupportedMessageType, IOException, InvalidMessage {
+
+        next(message, false);
+    }
+
     private boolean resetOrDisconnectIfRequired(Message msg) {
         if (!resetOnError && !disconnectOnError) {
             return false;
@@ -2256,7 +2265,7 @@
     private void nextQueued(Message msg, String msgType) throws InvalidMessage, FieldNotFound, RejectLogon,
             IncorrectDataFormat, IncorrectTagValue, UnsupportedMessageType, IOException {
         try {
-            next(msg);
+            next(msg, true);
         } catch (final InvalidMessage e) {
             final String message = "Invalid message: " + e;
             if (MsgType.LOGON.equals(msgType)) {

Do you think that this solution is sufficient?
Thanks,
Chris.
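
For readers who want to see the guard-flag idea from the patch in isolation, here is a small self-contained sketch. It is not the QuickFIX/J Session class: GuardedQueueSketch, its fields, and its methods are hypothetical, messages are reduced to a sequence number plus a payload, and it assumes the top-level call drains the queue in a loop, which the patch excerpt above does not show.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a session; not the QuickFIX/J Session class. */
public class GuardedQueueSketch {
    private final Map<Integer, String> queued = new HashMap<>();
    private int expectedSeqNum = 1;
    private int processed = 0;
    private int depth = 0;     // current nesting of next(...) calls
    private int maxDepth = 0;  // deepest nesting observed so far

    /** Public entry point, used for messages arriving from the counterparty. */
    public void next(int seqNum, String payload) {
        next(seqNum, payload, false);
    }

    private void next(int seqNum, String payload, boolean isProcessingQueuedMessages) {
        depth++;
        maxDepth = Math.max(maxDepth, depth);
        try {
            if (seqNum > expectedSeqNum) {
                queued.put(seqNum, payload); // arrived too early: park it
                return;
            }
            if (seqNum < expectedSeqNum) {
                return; // duplicate: ignored in this sketch
            }
            processed++;
            expectedSeqNum++;
            // Guard: only the outermost call drains the queue, so a queued
            // message can never trigger another nested drain.
            if (!isProcessingQueuedMessages) {
                drainQueue();
            }
        } finally {
            depth--;
        }
    }

    /** Assumption of this sketch: the drain is a loop, one nested call at a time. */
    private void drainQueue() {
        String payload;
        while ((payload = queued.remove(expectedSeqNum)) != null) {
            next(expectedSeqNum, payload, true);
        }
    }

    public int getProcessed() {
        return processed;
    }

    public int getMaxDepth() {
        return maxDepth;
    }

    public static void main(String[] args) {
        GuardedQueueSketch session = new GuardedQueueSketch();
        // Real-time messages 2..200001 arrive while message 1 is still missing.
        for (int seq = 2; seq <= 200_001; seq++) {
            session.next(seq, "real-time");
        }
        session.next(1, "resent"); // the gap is filled; the queue is drained
        System.out.println("processed=" + session.getProcessed()
                + ", maxDepth=" + session.getMaxDepth());
    }
}
```

In this sketch the nesting of next(...) calls stays constant however many messages are queued, because the flag stops queued messages from starting their own drain.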
