
I am hitting an endpoint that returns roughly 900k of XML. Every now and again (less than 1 in 5000 on today's tests) I get a MalformedChunkCodingException.

This is happening in a fairly old webapp (~10 years old), built on Spring 3. I switched to using RestTemplate instead of httpclient directly, but that hasn't fixed it. After running for some hours today with wire-level logging enabled on httpclient, I've managed to capture one failure.
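For what it's worth, the call is wired up roughly like this (a simplified sketch, not the exact production code; the URL and class name are placeholders):

    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;
    import org.springframework.http.ResponseEntity;
    import org.springframework.http.client.HttpComponentsClientHttpRequestFactory;
    import org.springframework.web.client.RestTemplate;

    public class XmlClient {

        public String fetch() {
            // Back RestTemplate with Apache HttpClient, so org.apache.http.wire
            // logging shows exactly the bytes the message converter reads.
            CloseableHttpClient httpClient = HttpClients.createDefault();
            RestTemplate restTemplate =
                    new RestTemplate(new HttpComponentsClientHttpRequestFactory(httpClient));

            // Placeholder URL; the real endpoint returns roughly 900k of XML.
            ResponseEntity<String> response =
                    restTemplate.getForEntity("http://example.internal/feed.xml", String.class);
            return response.getBody();
        }
    }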

Caused by: org.apache.http.MalformedChunkCodingException: Unexpected content at the end of chunk
    at org.apache.http.impl.io.ChunkedInputStream.getChunkSize(ChunkedInputStream.java:259)
    at org.apache.http.impl.io.ChunkedInputStream.nextChunk(ChunkedInputStream.java:227)
    at org.apache.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:186)
    at org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:137)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.Reader.read(Reader.java:140)
    at org.springframework.util.StreamUtils.copyToString(StreamUtils.java:74)
    at org.springframework.http.converter.StringHttpMessageConverter.readInternal(StringHttpMessageConverter.java:85)
    at org.springframework.http.converter.StringHttpMessageConverter.readInternal(StringHttpMessageConverter.java:40)
    at org.springframework.http.converter.AbstractHttpMessageConverter.read(AbstractHttpMessageConverter.java:153)
    at org.springframework.web.client.HttpMessageConverterExtractor.extractData(HttpMessageConverterExtractor.java:103)
    at org.springframework.web.client.RestTemplate$ResponseEntityResponseExtractor.extractData(RestTemplate.java:724)
    at org.springframework.web.client.RestTemplate$ResponseEntityResponseExtractor.extractData(RestTemplate.java:709)

Normally the log seems to go like this:

DEBUG org.apache.http.wire - << "words words words"
DEBUG org.apache.http.wire - << "[\r][\n]"
DEBUG org.apache.http.wire - << "FAF[\r][\n]"
DEBUG org.apache.http.wire - << "words words words up to FAF bytes" 
DEBUG org.apache.http.wire - << "[\r][\n]"
DEBUG org.apache.http.wire - << "BAA[\r][\n]"
DEBUG org.apache.http.wire - << "words words words up to BAA bytes"

but in the one that went wrong I have this:

DEBUG org.apache.http.wire - << "words words words"
DEBUG org.apache.http.wire - << "[\r][\n]"
DEBUG org.apache.http.wire - << "B50[\r][\n]"
DEBUG org.apache.http.wire - << "words words words up to B50 bytes"
DEBUG org.apache.http.wire - << "3FC0[\r][\n]"

It's missing the [\r][\n] at the end of the B50 chunk.
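For reference, each chunk is framed as "chunk-size in hex, CRLF, chunk-data, CRLF". The snippet below is an illustrative, stripped-down equivalent of the check ChunkedInputStream does (not its actual source), showing where that missing CRLF blows up:

    import java.io.IOException;
    import java.io.InputStream;

    // Illustrative only: a stripped-down equivalent of the check that
    // ChunkedInputStream performs. Each chunk is framed as
    //     chunk-size-in-hex CRLF  chunk-data  CRLF
    // and the exception fires when the CRLF after chunk-data is missing.
    public class ChunkCheck {

        static void readChunks(InputStream in) throws IOException {
            while (true) {
                int size = Integer.parseInt(readLine(in).trim(), 16); // e.g. "B50"
                if (size == 0) {
                    return; // last-chunk, end of body
                }
                for (int i = 0; i < size; i++) {
                    in.read();                 // chunk-data
                }
                String trailer = readLine(in); // must be an empty line (bare CRLF)
                if (!trailer.isEmpty()) {
                    // In the failing wire log, the next chunk size ("3FC0")
                    // turns up here instead of the expected CRLF.
                    throw new IOException("Unexpected content at the end of chunk");
                }
            }
        }

        private static String readLine(InputStream in) throws IOException {
            StringBuilder sb = new StringBuilder();
            int b;
            while ((b = in.read()) != -1 && b != '\n') {
                if (b != '\r') {
                    sb.append((char) b);
                }
            }
            return sb.toString();
        }
    }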

So, assuming I can trust org.apache.http.wire debugging, then my application is receiving the stream like that, and it's malformed. Is it possible that the haproxy between them is corrupting the stream?

Fortunately, or not, the other end of the conversation is also one of my applications, running in tomcat with Spring 4.2.4.

Where do I begin looking for who is building that invalid response? Spring? Tomcat?

It looks like I can disable chunking, but only by calculating my content-length as I build the response, which I'm not super keen to do, because then I'll have to serialize my responses manually, rather than letting Spring do it.
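If it came to that, this is roughly what I mean by serializing manually (a sketch only, with placeholder mapping and method names): pre-serialize the body so the length is known and Tomcat can send Content-Length instead of chunking.

    import java.nio.charset.StandardCharsets;
    import org.springframework.http.MediaType;
    import org.springframework.http.ResponseEntity;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    public class FeedController {

        // buildXml() stands in for however the response body is produced today.
        @RequestMapping(value = "/feed.xml", method = RequestMethod.GET)
        public ResponseEntity<byte[]> feed() {
            byte[] body = buildXml().getBytes(StandardCharsets.UTF_8);
            // With the length known up front, Tomcat sends Content-Length
            // rather than Transfer-Encoding: chunked.
            return ResponseEntity.ok()
                    .contentType(MediaType.APPLICATION_XML)
                    .contentLength(body.length)
                    .body(body);
        }

        private String buildXml() {
            return "<feed/>"; // placeholder
        }
    }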

Dave Thorn

2 Answers


As you said, the application is pretty old. You may need to update all library versions in the hope that the issue has already been detected and fixed somewhere.

But finding the bad actor would help, as you would only have to fix that one.

From my own experience testing bad-syntax support in HTTP tools, I'm pretty sure HAProxy is the most robust of the elements you listed. But that does not rule out an issue there.

Every HTTP actor between your sending and receiving endpoints can alter the HTTP body (rework the chunk sizes), so you need to capture the input and output of every actor (Spring, Tomcat, haproxy, any other proxy and/or reverse proxy, load balancer, SSL terminator) to detect the bad chunk. I would start at the message emitter, the XML endpoint. And I would use wireshark/pcap/httpdump, something that really captures the TCP and HTTP traffic. But you will probably have to find a way of discarding captures quite quickly until you reach the failing point, because at 1 in 5000 you risk capturing huge amounts of data.

regilero

I have spent the best part of a month running tests against this. While I lack a 100% certain answer, I have found that:

When running with haproxy version 1.7.x I am unable to trigger the error.

When running with haproxy version 2.0.x I can occasionally trigger the error.

It may or may not relate to this:

https://github.com/haproxy/haproxy/issues/171

Dave Thorn