I am using HttpClient to send a request to a server that is supposed to return XML data. The data is returned as chunked data. I am then trying to write the received XML to a file. The code I use is shown below:

    HttpEntity entity = response.getEntity();
    InputStream instream = entity.getContent();

    // try-with-resources flushes and closes the writer, then closes the reader
    // (and with it instream), even if an exception is thrown.
    // Assumption: the response body is UTF-8 text.
    try (BufferedReader rd = new BufferedReader(new InputStreamReader(instream, "UTF-8"));
         Writer outWriter = new OutputStreamWriter(new FileOutputStream(filename, append), "UTF-8")) {
        String line;
        while ((line = rd.readLine()) != null) {
            outWriter.write(line);
            // readLine() strips the line terminator, so write one back
            outWriter.write(System.lineSeparator());
        }
    }

This writes data that looks like the following to the file: (image: resulting file)

This code works for non-chunked data. How do I properly handle chunked responses using HttpClient? Any help is greatly appreciated.

kushaldsouza

1 Answer

I don't think your problem is the chunking of the data.

XML data is plain text. Chunking it means that it is split into several parts that are transferred one after another. Each chunk should therefore contain readable plain-text XML, which is obviously not the case in the picture of the data.

Maybe the content is compressed with gzip, or it is not plain-text XML but binary-encoded XML (e.g. WBXML).
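One quick way to test the gzip guess is to look at the first two bytes of the saved file: every gzip stream starts with the magic bytes 0x1f 0x8b. The following is only a minimal sketch; the class and method names (`GzipSniff`, `looksLikeGzip`) are made up for illustration and the sample bytes are hard-coded rather than read from a real response.

```java
import java.nio.charset.StandardCharsets;

final class GzipSniff {
    // A gzip stream always begins with the two magic bytes 0x1f 0x8b.
    // Java bytes are signed, so mask with 0xff before comparing.
    static boolean looksLikeGzip(byte[] head) {
        return head != null
                && head.length >= 2
                && (head[0] & 0xff) == 0x1f
                && (head[1] & 0xff) == 0x8b;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: plain XML versus the start of a gzip header.
        byte[] plainXml = "<?xml version=\"1.0\"?><root/>".getBytes(StandardCharsets.UTF_8);
        byte[] gzipHeader = {(byte) 0x1f, (byte) 0x8b, 0x08, 0x00};

        System.out.println(looksLikeGzip(plainXml));   // prints false
        System.out.println(looksLikeGzip(gzipHeader)); // prints true
    }
}
```

If the check returns true for the first bytes of the written file, the "garbage" is most likely compressed XML rather than a chunking problem.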

You can tell which of these you have from the response headers sent by the server, especially the MIME type they contain.

Robert
  • Receiving response: HTTP/1.1 200 OK HTTP/1.1 200 OK Date: Wed, 22 Aug 2012 08:38:16 GMT Server: Apache/2.2.22 (Fedora) Content-Encoding: gzip Keep-Alive: timeout=15, max=100 Connection: Keep-Alive Transfer-Encoding: chunked Content-Type: text/xml – kushaldsouza Aug 22 '12 at 08:51
  • I've been playing around with the code and noticed that the issue was caused by the following line: httpget.setHeader("accept-encoding", "gzip,deflate,sdch"); Removing it produces a file with proper responses. – kushaldsouza Aug 22 '12 at 09:57
  • I am not entirely sure why removing the above-mentioned header makes the code work. Any insight would be very helpful. @Robert Thank you for your swift reply yesterday, btw! – kushaldsouza Aug 22 '12 at 14:18
  • If a client sends the accept-encoding header shown above, it tells the server that it accepts content compressed with gzip, deflate, or sdch. Content compression is optional, so if you remove the header from your request you get uncompressed content. – Robert Aug 23 '12 at 09:44
  • Thank you once again for your reply. I am still unsure why this does not work. In the response headers that I pasted above, the content encoding is gzip, so if I as a client set accept-encoding to gzip and the content encoding of the response is also gzip, shouldn't there be no issue? Maybe there is something I don't understand. – kushaldsouza Aug 23 '12 at 14:52
  • It is not an issue as long as you decode the retrieved content appropriately. Web browsers do so, but the HttpClient you use does not. Therefore you are responsible for decompressing the content. – Robert Aug 23 '12 at 14:56
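To illustrate the last comment, here is a hedged, dependency-free sketch of decompressing a gzip-encoded body yourself with the JDK's `GZIPInputStream`. The class `ContentDecoding` and its helpers are hypothetical names. The `gzip` method only stands in for the server, and in the question's code the encoding string would presumably come from the Content-Encoding response header instead of being hard-coded.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class ContentDecoding {
    // Wraps the raw body stream according to the Content-Encoding value.
    // Only gzip is handled in this sketch; anything else passes through unchanged.
    static InputStream decode(InputStream raw, String contentEncoding) throws IOException {
        if (contentEncoding != null && contentEncoding.trim().equalsIgnoreCase("gzip")) {
            return new GZIPInputStream(raw);
        }
        return raw;
    }

    // Reads a stream to the end as raw bytes (no charset guessing, no line handling).
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    // Stand-in for the server: gzip-compresses a payload in memory.
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(data);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] xml = "<root>hello</root>".getBytes(StandardCharsets.UTF_8);
        byte[] wire = gzip(xml); // what would arrive with Content-Encoding: gzip

        try (InputStream body = decode(new ByteArrayInputStream(wire), "gzip")) {
            // prints <root>hello</root>
            System.out.println(new String(readAll(body), StandardCharsets.UTF_8));
        }
    }
}
```

In the question's setting, the stream returned by `entity.getContent()` would take the place of the `ByteArrayInputStream`, and the decoded bytes would be written to the file instead of printed. Chunked transfer needs no extra handling at this level, since the content stream already presents the reassembled body.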