
I am trying to fetch a large number of very similar web pages that contain only text and are all formatted the same way.

The first 5-30 times I call this method it works, but after that it returns null, with no error. Is there any reason it would not be able to get the string?

After adding some log statements, I found that, seemingly at random, line = in.readLine() is null on the very first read, so the loop body that grabs the string is skipped. I don't use BufferedReader very much, so it could well be the problem. If you have any tips, or ways I could troubleshoot this, it would be greatly appreciated.

public static String pullString(int id) throws IOException {

    String price;
    logError("Checkpoint 1");
    URL url = new URL("example.com");
    URLConnection con = url.openConnection();
    logError("Checkpoint 2");
    BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()));
    logError("Checkpoint 3");
    String line;

    while ((line = in.readLine()) != null) {
        // ^ for some reason readLine() returns null here, but on identical pages it works fine.

                // Removed unneeded info

                return ---;
            } catch (NumberFormatException e) {
                logError("NumberFormatException");

                return null;
            }
        }
    }
    logError("left out the back");
    return null;
}
Mysticial
Echocage

2 Answers


I doubt it has anything to do with BufferedReader.

It is more likely that the server is returning an error message which you are ignoring. I suggest you print everything that is returned, headers included, to make sure there is no error in the header.
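
A minimal sketch of that check, assuming the pages are served over HTTP(S) so the connection can be cast to HttpURLConnection. The class name ResponseInspector, the helpers readAll and inspect, and the UTF-8 charset are illustrative assumptions, not part of the original code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class ResponseInspector {

    /** Reads every line of the stream into one string; a null stream yields "". */
    static String readAll(InputStream stream) throws IOException {
        if (stream == null) {
            return "";
        }
        StringBuilder body = new StringBuilder();
        // try-with-resources closes the reader (and the stream) even on failure
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(stream, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line).append('\n');
            }
        }
        return body.toString();
    }

    /** Logs the status, headers and body length for one request and returns the body. */
    static String inspect(String address) throws IOException {
        // Assumption: address is an http:// or https:// URL, so this cast is safe
        HttpURLConnection con = (HttpURLConnection) new URL(address).openConnection();
        try {
            int status = con.getResponseCode();
            System.err.println("Status: " + status + " " + con.getResponseMessage());
            System.err.println("Headers: " + con.getHeaderFields());
            // For 4xx/5xx responses the body, if any, is on the error stream
            InputStream stream = status >= 400 ? con.getErrorStream() : con.getInputStream();
            String body = readAll(stream);
            System.err.println("Body length: " + body.length());
            return body;
        } finally {
            con.disconnect();
        }
    }
}
```

Calling inspect(...) in place of the original read loop for a few requests should show whether the failing calls carry an error status or an empty body.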

Peter Lawrey
  • Unfortunately I have tried that, but I tried it once more, no luck. Thanks for trying though. – Echocage Sep 05 '12 at 08:27
  • What did you get? No lines at all? If you got something, can you post what you get when it stops prematurely? Perhaps the first 5-30 lines contain something you missed. – Peter Lawrey Sep 05 '12 at 08:29
  • That appears to be your output. Are you sure you are logging everything you get from the server as well? If you are, it appears you are getting nothing, the server is dropping the connection without sending you anything. It could be doing this because a) your URL is malformed or b) the server is overloaded. – Peter Lawrey Sep 05 '12 at 08:36
  • Anytime my URL is malformed, I do get a very clear error, and I do not believe that the server is overloaded, but who knows, you might be right. So those are the only two reasons that in.readLine() would return null? Is it possible that the page hasn't loaded yet, so it returns null? Hmmm... I really appreciate the help, Peter. – Echocage Sep 05 '12 at 08:40
  • You get a `null` because there is no more output. If this happens without any lines, the server is failing to send you any data. The BufferedReader waits for the data to be sent, no matter how long that takes. It doesn't time out and will wait forever if the server doesn't disconnect. – Peter Lawrey Sep 05 '12 at 08:42
  • I'm not sure if this is related, but when this particular kind of null appears (not the malformed-URL kind), it breaks out of my foreach loop, skipping the 400-some items left. I'm not sure if this is the sign of an unhandled error or just user error, but I do have all my null checks in order... – Echocage Sep 05 '12 at 08:54
  • If it's jumping out of code unexpectedly, it suggests you are getting an Exception or Error which you might not be recording properly. – Peter Lawrey Sep 05 '12 at 08:56
  • Not sure I have mentioned this, but once this happens, it seems to refuse to make any other connections until I restart the program. It just throws the same error every time I try to connect from then on. – Echocage Sep 05 '12 at 08:57
  • I'm thinking you're probably right, I'll go over my code with a fine-toothed comb once more. – Echocage Sep 05 '12 at 08:58
  • Would it be throwing any errors other than the ones that I am forced to catch? Because I am catching and printing all the errors that I see, and the results haven't changed... – Echocage Sep 05 '12 at 09:06
  • I've added catch(Throwable t){t.printStackTrace();} Unfortunately it's still not printing anything more... :/ – Echocage Sep 05 '12 at 09:36
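
On the point above that a read waits forever by default: a small sketch of setting explicit timeouts on the connection, so a stalled server surfaces as a java.net.SocketTimeoutException instead of a hang. The class name TimeoutFetcher, the helper openWithTimeouts and both timeout values are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

class TimeoutFetcher {

    // Assumed values; tune them for the server being queried
    static final int CONNECT_TIMEOUT_MS = 5_000;
    static final int READ_TIMEOUT_MS = 10_000;

    /** Opens (but does not yet connect) a connection with both timeouts set. */
    static URLConnection openWithTimeouts(URL url) throws IOException {
        URLConnection con = url.openConnection();
        // The default for both is 0, which means "wait indefinitely"
        con.setConnectTimeout(CONNECT_TIMEOUT_MS);
        con.setReadTimeout(READ_TIMEOUT_MS);
        return con;
    }
}
```

With this in place, con.getInputStream() and the reads on it throw SocketTimeoutException (a subclass of IOException) once a limit is exceeded, which can then be logged like any other failure.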

It is possible that the connection is dropped before you finish reading. I suggest that inside the while loop you only read the lines, then close the connection and do the parsing offline. You can also try working directly with the input stream, reading bytes instead of strings and converting them afterwards.
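
A rough sketch of that read-first, parse-later split. The class name OfflineReader and its helpers are hypothetical, UTF-8 is assumed, and firstInteger merely stands in for the parsing that was removed from the question, since the real logic isn't shown.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class OfflineReader {

    /** Drains the stream into a list of lines and closes it before returning. */
    static List<String> readLines(InputStream stream) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(stream, StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line); // no parsing here, only reading
            }
        }
        return lines;
    }

    /** Fetches the page; the stream is already closed when this returns. */
    static List<String> fetchLines(String address) throws IOException {
        URLConnection con = new URL(address).openConnection();
        return readLines(con.getInputStream());
    }

    /** Placeholder parsing step: the first line that is an integer, or null if none is. */
    static Integer firstInteger(List<String> lines) {
        for (String line : lines) {
            try {
                return Integer.valueOf(line.trim());
            } catch (NumberFormatException e) {
                // not a number, try the next line
            }
        }
        return null;
    }
}
```

Because the reader is closed on every call, including the early-return and exception paths, this also rules out leaked connections as a cause of failures that only appear after many requests.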

iGili