6

Some weird situation. During a performance test of a web application (war), the client gets 500 on about 2% of the requests. When searching the application log, there are some exceptions there:

 java.io.IOException: Stream closed
    at org.apache.catalina.connector.InputBuffer.read(InputBuffer.java:394)
    at org.apache.catalina.connector.CoyoteReader.read(CoyoteReader.java:110)
    at org.codehaus.jackson.impl.ReaderBasedParserBase.loadMore(ReaderBasedParserBase.java:95)
    at org.codehaus.jackson.impl.ReaderBasedParser._skipWSOrEnd(ReaderBasedParser.java:935)
    at org.codehaus.jackson.impl.ReaderBasedParser.nextToken(ReaderBasedParser.java:245)
    at org.codehaus.jackson.map.ObjectMapper._initForReading(ObjectMapper.java:2432)
    at org.codehaus.jackson.map.ObjectMapper._readMapAndClose(ObjectMapper.java:2389)
    at org.codehaus.jackson.map.ObjectMapper.readValue(ObjectMapper.java:1618)
    at MyServlet.doPost(MyServlet.java:25)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:641)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:304)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at MyFilter.doFilter(MyFilter.java:67)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at MyFilter.doFilter(MyFilter.java:28)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:929)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:405)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:964)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:515)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:304)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

However, the localhost_access is empty and all requests there have 200 status code.

I'll appreciate any insights on this problem:

  • How can happen that localhost_access log is missing 500 requests, while client is getting them. When at all does this happen?
  • Why java.io.IOException is thrown only occasionally (less then 2% of requests)?

Update 1: The problem occurs with Tomcat 7.0.22 and Java 6

Update 2: Here's some sample request to a static html file (not even to a servlet, so there is no exception in log and nothing in localhost_access):

Response headers:
HTTP/1.1 500 Internal Server Error
Server: Apache-Coyote/1.1
Content-Type: text/html;charset=utf-8
Transfer-Encoding: chunked
Date: Mon, 02 Apr 2012 07:35:06 GMT
Connection: close

Update 3: to eliminate the network I ran the same tests on the localhost. It didn't affect the test results. So it isn't the network.

Update 4: Restarting Tomcat helped. And still I don't understand what caused the original problem.

Update 5: Tomcat upgraded to 7.0.26, but it didn't solve the problem.

Update 6: Moved to Jetty. The errors are gone. Should I say "Good Bye" to Tomcat after all these years?

Tarlog
  • 10,024
  • 2
  • 43
  • 67
  • Is the stack trace quoted above the full stack trace for one of these errors? If not, could you provide the full stack trace please. – Mark Thomas Apr 02 '12 at 19:39
  • Added full stacktrace. I replaced our custom classes with MyServlet and MyFilter. – Tarlog Apr 03 '12 at 07:05
  • Any chance your filter is retaining a reference to the Request and using a request object associated with a previous request rather than the current one? – Mark Thomas Apr 05 '12 at 18:41
  • No. All this flow occurs on less then 2% of requests during the performance tests. Usually everything works fine... But once started it continues to occur even without the load. Like something is suddenly stuck in Tomcat. – Tarlog Apr 07 '12 at 13:22
  • Given the behaviour you described and the recycling I know takes place inside Tomcat between requests then this still looks more like an application issue than a Tomcat one. Given that you appear to be able to repeat this, I'd suggest getting Tomcat plus the app into this state and then examining a request processing cycle with a debugger to see what is happening. In particular, check that the request object is the same throughout the processing chain. – Mark Thomas Apr 09 '12 at 12:31
  • Are you streaming huge amount of JSON data to client for rendering? – Santosh Gokak Apr 09 '12 at 19:21
  • @Mark Thomas: I already said it several times: it happens even with a pure html file. No application logic involved. Not at all. Not even the filters you see in the stacktrace. – Tarlog Apr 10 '12 at 08:36
  • @SantoshGokak no. it's very small. and it also happens with a small html. Tomcat is under a heavy load though... – Tarlog Apr 10 '12 at 08:37
  • 7.0.27 is out - it wouldn't hurt to test with that although I don't recall any significant changes to the HTTP BIO connector code. Attaching a debugger will still provide more information than you have at the moment. As a minimum you should be able to determine what is triggering the 500 status to be set. – Mark Thomas Apr 10 '12 at 19:17
  • Debugging during the performance test is not easy and can change the results... – Tarlog Apr 11 '12 at 06:48
  • Also, I wonder if this thread is related somehow: http://tomcat.10.n6.nabble.com/tomcat-asynchronous-invocation-problem-td4715825.html – Tarlog Apr 11 '12 at 06:48

3 Answers3

6

Error message suggests that IOException comes right when Jackson tries to initialize parsing, meaning that underlying HTTP connection is already closed when starting to parse request. So something has closed the connection; and while it could be client as well as server (HTTP allows both ends to close connections at any point), it is often something in the middle (firewall, proxy, loadbalancer); often due to a timeout of some kind.

StaxMan
  • 113,358
  • 34
  • 211
  • 239
  • Yep, my first guess that it was a loadbalancer. But connecting to the Tomcat directly produced the same error. My feeling still that it's "network related", but I have no idea how to address it. – Tarlog Apr 01 '12 at 18:31
  • Tested on localhost. Same error was reproduced. So it is not network related. It's Tomcat. – Tarlog Apr 03 '12 at 07:09
  • One thing you can do is to just read bit of input first (or all of it), to handle case of not getting any content; and if you get content buffered, only pass that to Jackson. This way you can at least provide better error messages. Of course you also want to figure out why this happens, but it makes sense to also provide better failure messages? – StaxMan Apr 03 '12 at 18:35
  • See "Update 2": same thing happens for a small html file. Not even Jackson. – Tarlog Apr 04 '12 at 06:29
  • Ah. Yeah, was assuming it came from network, but if even file exhibits this... yeah, silly tomcat? – StaxMan Apr 04 '12 at 19:05
5

You don't say what version of Tomcat you are running which would be a big help. My guess - based on what you see - is that it is relatively old.
The access log is implemented as a Valve. If a request is rejected before it gets as far as the Valves it will never appear in the access log. This usually happens when the request is malformed but there are other circumstances that could trigger it.
As of Tomcat 6.0.30 (and all 7.0.x versions) the access logging mechanism was refactored to ensure that messages are written to the access log even if the request is rejected before it reaches the AccessLogValve.

The IOException is thrown when the client closes the connection. Whether that is because of the 500 or is causing the 500 is unclear from the information provided.

Mark Thomas
  • 16,339
  • 1
  • 39
  • 60
  • It's Tomcat 7.0.22 (btw, I put the Tomcat7 tag, but you probably right that I should have write it explicitly) – Tarlog Apr 01 '12 at 17:37
  • Regarding "IOException is thrown when the client closes the connection". How in this case will the client receive 500 from Tomcat? Quite unclear. – Tarlog Apr 01 '12 at 18:13
1

AJAX Post on IE have intermittent failure issues. We have seen sockettimeoutexceptions. You can disable KeepAlive and try, or disable KeepAlive for AJAX POST by using a 'connection:close' header. Tune apache requestreadtimeout for no body request.

Vibhor
  • 11
  • 1