I am trying to write a program that fetches a file from a web server in chunked format, using the ChunkedInputStream class from the HttpClient 3.0 API. When I run the code, it fails with a "chunked input stream ended unexpectedly" error. What am I doing wrong? Here is my code:

    HttpClient client = new DefaultHttpClient();
    HttpGet getRequest = new HttpGet(location);
    HttpResponse response = client.execute(getRequest);
    InputStream in = response.getEntity().getContent();

    ChunkedInputStream cis = new ChunkedInputStream(in);
    FileOutputStream fos = new FileOutputStream(new File("session_"+sessionID));
    while(cis.read() != -1 )
    {
        fos.write(cis.read());
    }
    in.close();
    cis.close();
    fos.close();
user1480813
  • I was having what may be a similar problem, with fetched pages not completely downloading. I wonder if the PHP CURL library might work better than this? http://www.php.net/manual/en/intro.curl.php – NoBugs Jun 25 '12 at 18:55

2 Answers

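As axtavt suggests, don't use ChunkedInputStream. But there is another problem: you call read() twice per iteration, so every odd-numbered byte is consumed by the loop test and thrown away. If the data has an odd number of bytes, the second read() of the last iteration returns -1, the end-of-stream marker, and you write that to the file before doing another read. The correct way to copy a stream: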

    byte[] buffer = new byte[8192];
    int count;
    while ((count = in.read(buffer)) > 0)
    {
        out.write(buffer, 0, count);
    }
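To make the difference concrete, here is a small self-contained sketch that runs both loops over an in-memory stream. The class and method names (`StreamCopyDemo`, `copyTwoReads`, `copy`) are made up for illustration, and `ByteArrayInputStream` stands in for the HTTP response stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

class StreamCopyDemo {
    // The loop shape from the question: one read() for the test, a second for the write.
    static void copyTwoReads(InputStream in, OutputStream out) throws IOException {
        while (in.read() != -1) {   // this byte is read and discarded
            out.write(in.read());   // may be -1, which write(int) stores as the byte 0xFF
        }
    }

    // Buffered copy: every byte that is read is also written.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int count;
        while ((count = in.read(buffer)) > 0) {
            out.write(buffer, 0, count);
            total += count;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3, 4, 5};

        ByteArrayOutputStream broken = new ByteArrayOutputStream();
        copyTwoReads(new ByteArrayInputStream(data), broken);
        System.out.println(Arrays.toString(broken.toByteArray())); // [2, 4, -1]

        ByteArrayOutputStream fixed = new ByteArrayOutputStream();
        copy(new ByteArrayInputStream(data), fixed);
        System.out.println(Arrays.toString(fixed.toByteArray()));  // [1, 2, 3, 4, 5]
    }
}
```

With the five-byte input, the two-read loop keeps only the second and fourth bytes and then appends a stray 0xFF, while the buffered loop reproduces the input exactly.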
user207421
  • Thanks a lot! I now get the correct sized file! Another question I have is how do I preserve the tags that are sent with each chunk of data? – user1480813 Jun 25 '12 at 22:44
Are you sure that you need to use ChunkedInputStream in this case?

I think HttpClient should handle chunked encoding internally, so response.getEntity().getContent() returns an already-decoded stream.
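The same behaviour can be observed without Apache HttpClient. The sketch below swaps in the JDK's own `HttpURLConnection` as the client and a throwaway `com.sun.net.httpserver.HttpServer` on the loopback interface as the server, so it illustrates the principle rather than HttpClient itself. The class name, the `/data` path and the payload strings are all made up for the example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class ChunkedFetchDemo {
    // Starts a local server that answers with a chunked body, fetches it,
    // and returns what the client-side InputStream delivered.
    static String fetchFromLocalChunkedServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/data", exchange -> {
            // A response length of 0 tells the JDK server to use chunked transfer encoding.
            exchange.sendResponseHeaders(200, 0);
            try (OutputStream body = exchange.getResponseBody()) {
                body.write("first part, ".getBytes(StandardCharsets.UTF_8));
                body.flush();
                body.write("second part".getBytes(StandardCharsets.UTF_8));
            }
        });
        server.start();
        try {
            int port = server.getAddress().getPort();
            URL url = URI.create("http://127.0.0.1:" + port + "/data").toURL();
            HttpURLConnection conn = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            // The stream handed back by the client is already decoded:
            // no chunk sizes or CRLF framing appear in it.
            try (InputStream in = conn.getInputStream()) {
                byte[] buffer = new byte[8192];
                int count;
                while ((count = in.read(buffer)) > 0) {
                    out.write(buffer, 0, count);
                }
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(fetchFromLocalChunkedServer());
    }
}
```

The returned string contains only the payload, which is why wrapping such a stream in a second chunk decoder fails: there is no chunk framing left for it to parse.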

axtavt
  • By using just the input stream, I now do get the content of the file. The most important part of the file is the tags that are spread throughout the content. Is there any way to preserve the tags between the content lines of the file? If I capture the traffic with Wireshark, I see the tags, but my downloaded file does not have them. – user1480813 Jun 25 '12 at 20:51
  • @user1480813: You mean you need to get chunk boundaries explicitly? It's a strange requirement, and I don't think it can be done at this level of abstraction. To do it you'll need to somehow intervene in the response processing logic of HttpClient, or just use sockets directly. – axtavt Jun 26 '12 at 10:30
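
For the raw-socket route mentioned in the last comment, the chunk framing has to be parsed by hand. The following is a minimal sketch, assuming the "tags" are the chunk-size lines (with any chunk extensions) of HTTP/1.1 chunked transfer coding and that the input stream is positioned just after the response headers. `RawChunkReader`, `Chunk` and the `tag=a` extension are hypothetical names for illustration, and trailers after the last chunk are not parsed.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class RawChunkReader {
    // One chunk as it appeared on the wire: its size line (including any extension) and its data.
    static final class Chunk {
        final String sizeLine;
        final byte[] data;

        Chunk(String sizeLine, byte[] data) {
            this.sizeLine = sizeLine;
            this.data = data;
        }
    }

    // Reads one CRLF-terminated line and returns it without the terminator.
    static String readLine(InputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                return line.toString();
            }
            if (b != '\r') {
                line.append((char) b);
            }
        }
        throw new EOFException("stream ended inside a line");
    }

    // Parses chunks until the zero-length last chunk, keeping each size line.
    static List<Chunk> readChunks(InputStream in) throws IOException {
        List<Chunk> chunks = new ArrayList<>();
        while (true) {
            String sizeLine = readLine(in);
            // The chunk size is hexadecimal; anything after ';' is a chunk extension.
            int semi = sizeLine.indexOf(';');
            String hex = (semi < 0 ? sizeLine : sizeLine.substring(0, semi)).trim();
            int size = Integer.parseInt(hex, 16);
            if (size == 0) {
                return chunks; // last chunk reached
            }
            byte[] data = new byte[size];
            int off = 0;
            while (off < size) {
                int n = in.read(data, off, size - off);
                if (n < 0) {
                    throw new EOFException("stream ended inside a chunk");
                }
                off += n;
            }
            readLine(in); // consume the CRLF that follows the chunk data
            chunks.add(new Chunk(sizeLine, data));
        }
    }

    public static void main(String[] args) throws IOException {
        String wire = "5;tag=a\r\nhello\r\n6\r\n world\r\n0\r\n\r\n";
        InputStream in = new ByteArrayInputStream(wire.getBytes(StandardCharsets.US_ASCII));
        for (Chunk c : readChunks(in)) {
            System.out.println(c.sizeLine + " -> " + new String(c.data, StandardCharsets.US_ASCII));
        }
    }
}
```

In a real program the `InputStream` would come from a `Socket` after the request has been written and the response headers have been read, which is considerably more work than letting the HTTP client decode the body.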