I'm writing a program that uses HttpClient to download all of my monthly statements from my ISP. I can log in to the site, access pages, and download pages, but I can't download my PDF statements: the request just returns some HTML. I used the answer to this question as a starting point. Here is the method where I try to download the PDF:

public void downloadPdf() throws ClientProtocolException, IOException  {
    HttpGet httpget = new HttpGet("https://www.cox.com/ibill/PdfBillingStatement.stmt?account13=123&stmtCode=001&cycleDate=7/21/2014&redirectURL=error.cox");
    HttpResponse response = client.execute(httpget);

    System.out.println("Download response: " + response.getStatusLine());

    HttpEntity entity = response.getEntity();

    InputStream inputStream = null;
    OutputStream outputStream = null;

    if (entity != null) {
        long len = entity.getContentLength();
        inputStream = entity.getContent();

        outputStream = new FileOutputStream(new File("/home/bkurczynski/Desktop/statement.pdf"));

        int read = 0;
        byte[] bytes = new byte[1024];

        while ((read = inputStream.read(bytes)) != -1) {
            outputStream.write(bytes, 0, read);
        }

        outputStream.close();
    }
}

Any help would be greatly appreciated. Thank you!

kurczynski
  • It seems that your application was not authorized properly, and the server returned HTML with a `not authorized` error or another diagnostic message. Also, where do you close the output stream? –  Aug 17 '14 at 01:02
  • Did you try that link in a browser when you're not logged in? You get an HTML page as well. This is why your Java code is seeing HTML. – Martin Konecny Aug 17 '14 at 01:03
  • Did you check whether you are accessing the page you really need? "User is not logged in." is what I get. – Yehia Awad Aug 17 '14 at 01:04
  • @RafaelOsipov that is what it seems like, but I already logged in with another method successfully. How do I use the same session that I've already logged in with? Looks like I haven't, been rushing through this. – kurczynski Aug 17 '14 at 01:58
  • @kurczynski session? Do you acquire a reference to the session, and where do you pass the created session to this method? Is that even possible in your environment? I think your server treats the new connection attempt as a new connection, which requires authorization, and somehow your client does not pass this stage. What HTML did you get? What is inside? –  Aug 17 '14 at 07:59
  • @rafaelosipov I'm assuming it's a session, I'm just learning web stuff so that could be the wrong term. What I mean is, if I've logged into the site shouldn't it know that I'm logged in when I make a request for the page that requires authentication? The HTML I get back does indeed say that I'm not logged in. – kurczynski Aug 18 '14 at 00:42
  • @kurczynski if you are using a browser, then that is correct. But you are logging in with your script, and you do not get any session-related information from the server. On the second request, how can the server know that it is you and not another person working from the same IP (proxy)? –  Aug 18 '14 at 01:03
  • @rafaelosipov okay, gotcha. I don't know, that's what I'm trying to figure out lol. My variable `client` is a class variable that I login to the site with, then I try to download the PDF using the same `client` variable which has already been authenticated. That's how I see it, but from testing it, seems like it doesn't. – kurczynski Aug 18 '14 at 01:19
  • @kurczynski I think you have the login and password stored in your application settings. Since you do not get the session reference, just authenticate (log in) with your script for every authorized operation on your server. –  Aug 18 '14 at 02:04
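
The "session" discussed in the comments above is usually carried by a cookie: the login response sets it, and every later request has to send it back. The sketch below illustrates that mechanism only. It swaps in the JDK's `java.net.CookieManager` rather than Apache HttpClient so that it runs without extra libraries, and the host `www.example.invalid`, the cookie `JSESSIONID=abc123`, and the helper names `jarAfterLogin` and `cookiesSentTo` are made up for illustration; the real site may use different cookies or additional checks.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.URI;
import java.util.List;
import java.util.Map;

class SessionCookieSketch {

    // Simulates a successful login: stores the Set-Cookie value that the
    // (hypothetical) login response carried in a fresh cookie jar.
    static CookieManager jarAfterLogin(URI loginUri, String setCookieValue) {
        CookieManager jar = new CookieManager();
        try {
            jar.put(loginUri, Map.of("Set-Cookie", List.of(setCookieValue)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return jar;
    }

    // Returns the Cookie header values a client using this jar would attach
    // to a request for the given target URI.
    static List<String> cookiesSentTo(CookieManager jar, URI target) {
        try {
            Map<String, List<String>> headers = jar.get(target, Map.of());
            return headers.getOrDefault("Cookie", List.of());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical URLs; they are never contacted.
        URI login = URI.create("https://www.example.invalid/login");
        URI statement = URI.create("https://www.example.invalid/ibill/statement.pdf");

        // Jar that saw the login response: the session cookie goes out again.
        CookieManager loggedIn = jarAfterLogin(login, "JSESSIONID=abc123; Path=/");
        System.out.println("with login jar: " + cookiesSentTo(loggedIn, statement));

        // Fresh jar: nothing to send, so the server sees an anonymous request.
        System.out.println("with fresh jar: " + cookiesSentTo(new CookieManager(), statement));
    }
}
```

In Apache HttpClient 4.3+, the comparable object is a `CookieStore` such as `BasicCookieStore`, which can be attached with `HttpClientBuilder.setDefaultCookieStore(...)`. If the login and the download go through different client instances that do not share a store, the download request would go out without the session cookie.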

1 Answer

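// Requires Apache HttpClient 4.3+ (org.apache.http.*) and java.io.* imports.
// Note: a newly built client holds no login cookies, so for a page that
// requires login, use the client instance that performed the login instead.
HttpClient httpClient = HttpClientBuilder.create().build();
try {
    HttpGet request = new HttpGet("https://www.cox.com/ibill/PdfBillingStatement.stmt?account13=123&stmtCode=001&cycleDate=7/21/2014&redirectURL=error.cox");
    HttpResponse response = httpClient.execute(request);
    HttpEntity entity = response.getEntity();

    String filePath = "hellow.txt";
    // try-with-resources closes both streams even if the copy fails.
    try (InputStream is = entity.getContent();
         FileOutputStream fos = new FileOutputStream(new File(filePath))) {
        // Copy in chunks rather than one byte at a time.
        byte[] buffer = new byte[8192];
        int n;
        while ((n = is.read(buffer)) != -1) {
            fos.write(buffer, 0, n);
        }
    }
} catch (Exception ex) {
    // Report the failure instead of silently swallowing it.
    ex.printStackTrace();
}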
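
One way to catch the HTML-instead-of-PDF case before saving anything is to look at the first bytes of the response body, since a well-formed PDF file begins with the ASCII marker `%PDF-`. This is a minimal sketch of such a check; the class name `PdfSignatureCheck` and the method `looksLikePdf` are hypothetical, and the sample bodies are invented for the demonstration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class PdfSignatureCheck {

    // Header marker that a well-formed PDF file starts with.
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the body starts with the PDF header marker.
    static boolean looksLikePdf(byte[] body) {
        if (body == null || body.length < PDF_MAGIC.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(body, PDF_MAGIC.length), PDF_MAGIC);
    }

    public static void main(String[] args) {
        // Invented sample bodies: a PDF-like header and an HTML error page.
        byte[] pdf = "%PDF-1.4\n(rest of file)".getBytes(StandardCharsets.US_ASCII);
        byte[] html = "<html><body>User is not logged in.</body></html>"
                .getBytes(StandardCharsets.US_ASCII);

        System.out.println("pdf sample:  " + looksLikePdf(pdf));
        System.out.println("html sample: " + looksLikePdf(html));
    }
}
```

A download routine could read the first few bytes of the stream, run this check, and report an error (for example, printing the HTML it received) rather than writing a misleading `.pdf` file.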
Imal Hasaranga Perera