[QFJ-459] java.lang.StackOverflowError when disconnecting file system Created: 20/Jul/09  Updated: 22/Jun/12  Resolved: 20/May/11

Status: Closed
Project: QuickFIX/J
Component/s: Engine
Affects Version/s: 1.3.1, 1.4.0
Fix Version/s: 1.5.1

Type: Bug
Priority: Default
Reporter: Sylvestre COZIC
Assignee: Unassigned
Resolution: Fixed
Votes: 1
Labels: None
Environment:

Mandriva Linux 2008 EDT, kernel 2.6.24.7



 Description   

The following error occurs when the file system on which the log files are written is disconnected:

[java] java.lang.StackOverflowError
[java] at java.io.FileOutputStream.writeBytes(Native Method)
[java] at java.io.FileOutputStream.write(FileOutputStream.java:247)
[java] at quickfix.FileLog.writeTimeStamp(FileLog.java:116)
[java] at quickfix.FileLog.writeMessage(FileLog.java:97)
[java] at quickfix.FileLog.onEvent(FileLog.java:111)
[java] at quickfix.LogUtil.logThrowable(LogUtil.java:47)
[java] at quickfix.LogUtil.logThrowable(LogUtil.java:60)
[java] at quickfix.FileLog.writeMessage(FileLog.java:106)
[java] at quickfix.FileLog.onEvent(FileLog.java:111)
[java] at quickfix.LogUtil.logThrowable(LogUtil.java:47)
[java] at quickfix.LogUtil.logThrowable(LogUtil.java:60)
[java] at quickfix.FileLog.writeMessage(FileLog.java:106)
[java] at quickfix.FileLog.onEvent(FileLog.java:111)

Repro steps:

  • Start the QuickFIX/J engine with its log directory on a removable drive (for instance, a USB key)
  • Disconnect the removable drive while the QuickFIX/J engine is running


 Comments   
Comment by Sylvestre COZIC [ 22/Jul/09 ]

I suggest the following patch:

  • In the quickfix.FileLog class, in the writeMessage(FileOutputStream stream, String message, boolean forceTimestamp) method:

    - LogUtil.logThrowable(sessionID, "error writing message to log", e);

    + e.printStackTrace();

This should prevent the StackOverflowError when an exception is raised while logging an exception: as the stack trace shows, LogUtil.logThrowable calls back into FileLog.onEvent, which fails again on the same stream.
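The recursion described above can be reproduced outside QuickFIX/J. The following is a minimal sketch, not the library's code: BrokenStream, SelfReportingLog and StderrReportingLog are hypothetical stand-ins for the log file on the removed drive, for the original FileLog behaviour and for the proposed patch, respectively.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class RecursiveLogDemo {

    /** Always fails, standing in for a log file on a removed drive. */
    static final class BrokenStream extends OutputStream {
        @Override
        public void write(int b) throws IOException {
            throw new IOException("device not available");
        }
    }

    /** Hypothetical log that reports write failures to itself. */
    static final class SelfReportingLog {
        private final OutputStream stream = new BrokenStream();

        void onEvent(String text) {
            try {
                stream.write(text.getBytes(StandardCharsets.UTF_8));
            } catch (IOException e) {
                // The error report goes to the same broken stream, so it fails again.
                onEvent("error writing message to log");
            }
        }
    }

    /** Hypothetical log that reports write failures to standard error. */
    static final class StderrReportingLog {
        private final OutputStream stream = new BrokenStream();
        int attempts = 0;

        void onEvent(String text) {
            attempts++;
            try {
                stream.write(text.getBytes(StandardCharsets.UTF_8));
            } catch (IOException e) {
                // No re-entry into the log: one failed write, one report.
                e.printStackTrace();
            }
        }
    }

    /** Returns true if the self-reporting log ends in a StackOverflowError. */
    static boolean selfReportingOverflows() {
        try {
            new SelfReportingLog().onEvent("hello");
            return false;
        } catch (StackOverflowError expected) {
            return true;
        }
    }

    /** Returns how many write attempts one event causes in the stderr variant. */
    static int stderrReportingAttempts() {
        StderrReportingLog log = new StderrReportingLog();
        log.onEvent("hello");
        return log.attempts;
    }

    public static void main(String[] args) {
        System.out.println("self-reporting log overflows: " + selfReportingOverflows());
        System.out.println("stderr-reporting log attempts: " + stderrReportingAttempts());
    }
}
```

The only difference between the two variants is where the failure report goes; keeping it off the failing stream is what breaks the cycle.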

I would also suggest:

  • In the quickfix.Session class, in the isStateRefreshNeeded(String msgType) method:

    - return refreshMessageStoreAtLogon && !state.isInitiator() && msgType.equals(MsgType.LOGON);

    + return refreshMessageStoreAtLogon && msgType.equals(MsgType.LOGON);

This makes sure MessageStore#refresh() is called on logon, so that after a file system failure the file storing the sequence numbers is reopened when the engine tries to reconnect. By the way, I did not clearly understand why this method returns false when the ConnectionType property is set to Initiator.
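To make the effect of the second change concrete, here is a small sketch that compares the two predicates in isolation. The class and method names are illustrative only and the session state is reduced to plain booleans; the one fact taken from FIX itself is that the Logon MsgType is "A".

```java
public class RefreshPredicateDemo {

    /** FIX MsgType value for Logon. */
    static final String LOGON = "A";

    /** Mirrors the existing rule: initiators never refresh at logon. */
    static boolean currentRule(boolean refreshAtLogon, boolean initiator, String msgType) {
        return refreshAtLogon && !initiator && LOGON.equals(msgType);
    }

    /** Mirrors the proposed rule: the connection type is no longer considered. */
    static boolean proposedRule(boolean refreshAtLogon, String msgType) {
        return refreshAtLogon && LOGON.equals(msgType);
    }

    public static void main(String[] args) {
        // An initiator receiving a Logon is the only case where the rules differ.
        System.out.println("current, initiator:  " + currentRule(true, true, LOGON));
        System.out.println("proposed, initiator: " + proposedRule(true, LOGON));
    }
}
```

For acceptors, and for any message other than Logon, both rules give the same answer.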

Comment by Sylvestre COZIC [ 28/Jan/10 ]

Hello,

Do you need further information to fix this issue?

Best regards,

Comment by Eric Deshayes [ 26/Apr/11 ]

Committed on integration branch, rev 1018.

Comment by Steve Bate [ 02/May/11 ]

It seems to me that the safest and simplest approach to avoiding the StackOverflowError is to write the exception message, stack trace and session ID to the standard error stream. The modified code delegates the logging to the SLF4J logger, but that logger may also be writing to the same removed file system. This will cause another exception, which will be caught in the MINA-related code (when logging incoming messages). It's not clear to me whether this will be another source of a StackOverflowError or result in some other undesired behavior.

Comment by Eric Deshayes [ 03/May/11 ]

As discussed with Steve, it is safest to log the error to the standard error stream.
Committed on integration branch rev#1028.

Comment by Mate Varga [ 22/Jun/12 ]

A question related to this 'fix':

FileLog.java:

    private void writeMessage(FileOutputStream stream, String message, boolean forceTimestamp) {
        try {
            if (forceTimestamp || includeTimestampForMessages) {
                writeTimeStamp(stream);
            }
            stream.write(message.getBytes(CharsetSupport.getCharset()));
            stream.write('\n');
            stream.flush();
            if (syncAfterWrite) {
                stream.getFD().sync();
            }
        } catch (IOException e) {
            // QFJ-459: no point trying to log the error in the file if we had an IOException
            // we will end up with a java.lang.StackOverflowError
            System.err.println("error writing message to log : " + message);
            e.printStackTrace(System.err);
        }
    }

Why do you need to log this error? Wouldn't it be much better to throw an exception, or at least call a callback registered by the application? As written, the application may never learn about a possibly critical problem, i.e. QuickFIX/J being unable to write to the log file stream.
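As a rough illustration of the callback idea raised in this comment, the sketch below notifies an application-supplied listener when a write fails. Everything in it is hypothetical: the LogFailureListener interface, the NotifyingLog class and the failuresFor helper are made up for this example and are not part of QuickFIX/J.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class CallbackLogDemo {

    /** Hypothetical callback; QuickFIX/J does not define this interface. */
    interface LogFailureListener {
        void onLogWriteFailure(String message, IOException cause);
    }

    /** Hypothetical log that tells the application when a write fails. */
    static final class NotifyingLog {
        private final OutputStream stream;
        private final LogFailureListener listener;

        NotifyingLog(OutputStream stream, LogFailureListener listener) {
            this.stream = stream;
            this.listener = listener;
        }

        void writeMessage(String message) {
            try {
                stream.write(message.getBytes(StandardCharsets.UTF_8));
                stream.write('\n');
                stream.flush();
            } catch (IOException e) {
                // Still avoid writing to the failing log; report elsewhere instead.
                System.err.println("error writing message to log : " + message);
                listener.onLogWriteFailure(message, e);
            }
        }
    }

    /** Writes each message to an always-failing stream and collects the callbacks. */
    static List<String> failuresFor(String... messages) {
        List<String> failures = new ArrayList<>();
        OutputStream broken = new OutputStream() {
            @Override
            public void write(int b) throws IOException {
                throw new IOException("device not available");
            }
        };
        NotifyingLog log = new NotifyingLog(broken, (message, cause) -> failures.add(message));
        for (String m : messages) {
            log.writeMessage(m);
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("messages reported to the application: " + failuresFor("8=FIX.4.2", "heartbeat"));
    }
}
```

A listener like this keeps the logging path free of recursion while still letting the application decide how to react, for example by disconnecting the session or raising an alert.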
