
In Java, I want to read and save all the HTML from a URL (Instagram), but I am getting error 429 (Too Many Requests). I think it is because I am trying to read more lines than the request limit allows.

StringBuilder contentBuilder = new StringBuilder();
try {
    URL url = new URL("https://www.instagram.com/username");
    URLConnection con = url.openConnection();
    InputStream is =con.getInputStream();
    BufferedReader in = new BufferedReader(new InputStreamReader(is));
    String str;
    while ((str = in.readLine()) != null) {
        contentBuilder.append(str);
    }
    in.close();
} catch (IOException e) {
    log.warn("Could not connect", e);
}
String html = contentBuilder.toString();

The error is:

Could not connect
java.io.IOException: Server returned HTTP response code: 429 for URL: https://www.instagram.com/username/

The stack trace also shows that the error occurs on this line:

InputStream is =con.getInputStream();

Does anybody have an idea why I get this error and/or what to do to solve it?

Hossein Golshani
Fikret
  • *I think it is because I have trying to read more lines than request limits* ⬅ You seem to have already answered your question on your own… Anyway, the following seem possibly relevant: https://stackoverflow.com/questions/33477861/how-to-avoid-instagram-error-429-the-maximum-number-of-requests-per-hour-has-bee, https://stackoverflow.com/questions/33435965/instagram-the-remote-server-returned-an-error-429-unknown-status-code, https://stackoverflow.com/questions/49583489/did-instagram-change-api-rate-limits-on-mar-30-2018 – sideshowbarker Sep 28 '18 at 08:29
  • See also https://stackoverflow.com/questions/49606300/instagram-api-request-limit-max-200-only-2018-april and https://stackoverflow.com/questions/37416195/instagram-api-rate-limits-and-taking-down-the-client – sideshowbarker Sep 28 '18 at 08:29
  • And I am thinking you have hit some sort of server-side connection cap ;P – Antoniossss Sep 28 '18 at 08:38

1 Answer


The problem might have been caused by the connection not being closed or disconnected. For the input, try-with-resources is useful too, because it closes the reader automatically, even on an exception or return. Also, you constructed an InputStreamReader without a charset, so it uses the default encoding of the machine the application runs on, whereas you need the charset of the URL's content. Finally, readLine returns each line without its line ending (which in general is very useful), so append one yourself.

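import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

StringBuilder contentBuilder = new StringBuilder();
try {
    URL url = new URL("https://www.instagram.com/username");
    // disconnect() is declared on HttpURLConnection, not on URLConnection, hence the cast.
    HttpURLConnection con = (HttpURLConnection) url.openConnection();
    // try-with-resources closes the reader even if reading throws.
    try (BufferedReader in = new BufferedReader(
                new InputStreamReader(con.getInputStream(), "UTF-8"))) {
        String line;
        while ((line = in.readLine()) != null) {
            // readLine strips the line ending, so add one back.
            contentBuilder.append(line).append("\r\n");
        }
    } finally {
        // The reader is already closed at this point; now release the connection.
        con.disconnect();
    }
} catch (IOException e) {
    log.warn("Could not connect", e);
}
String html = contentBuilder.toString();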
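
To see whether the server is really rate limiting, it may help to look at the status code before reading the body, since `getInputStream()` throws as soon as the response is an error. The following is only a sketch under assumptions: the class `RateLimitCheck`, its method names, and the 60 second fallback are hypothetical, and whether Instagram sends a `Retry-After` header at all is not confirmed here.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper class, not part of the original answer.
class RateLimitCheck {

    /**
     * Interprets a Retry-After header value given as a number of seconds.
     * Returns fallbackSeconds when the header is missing, negative,
     * or not a plain number (the HTTP date form is not parsed in this sketch).
     */
    static long retryAfterSeconds(String headerValue, long fallbackSeconds) {
        if (headerValue == null) {
            return fallbackSeconds;
        }
        try {
            long seconds = Long.parseLong(headerValue.trim());
            return seconds >= 0 ? seconds : fallbackSeconds;
        } catch (NumberFormatException e) {
            return fallbackSeconds;
        }
    }

    /**
     * Returns -1 if the response is not a 429, otherwise the number of
     * seconds to wait before trying again (assumed fallback: 60).
     */
    static long secondsToWait(String address) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(address).openConnection();
        try {
            // getResponseCode() returns the status without throwing on 4xx.
            int status = con.getResponseCode();
            if (status != 429) {
                return -1;
            }
            return retryAfterSeconds(con.getHeaderField("Retry-After"), 60);
        } finally {
            con.disconnect();
        }
    }
}
```

Calling `RateLimitCheck.secondsToWait("https://www.instagram.com/username")` before the read loop would then tell you whether to fetch now or to pause first.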
Joop Eggen
  • Hi, thanks for the response, but I couldn't understand where the URLConnection should go – Fikret Oct 01 '18 at 07:29
  • Then I get an Incompatible Type Error in URLConnection. Required: java.lang.AutoCloseable Found: java.net.URLConnection – Fikret Oct 01 '18 at 07:46
  • Dumb mistake on my side; an SQL Connection is AutoCloseable. A URLConnection is not; an HttpURLConnection is released with `disconnect()`. Corrected code. – Joop Eggen Oct 01 '18 at 07:58
  • Too many requests _could_ stem from the number of open connections. In that case it would take a pause of about half an hour for timeouts to close the connections normally. **Add logging**, so that you can be certain the code is not executed many times, say during repaints. – Joop Eggen Oct 01 '18 at 08:22
  • I found the problem. The HTML contains only the last 12 posts. That's why I can't get more that way. – Fikret Oct 04 '18 at 08:00
  • You might try to reduce the frequency of the requests, for example by holding each request for half a second before sending it; if another request is about to be sent in the meantime, bundle the two into one. Or **keep-alive** might have an influence. – Joop Eggen Oct 04 '18 at 08:07
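
One way to reduce the request frequency, as suggested in the comments above, could look like the sketch below. It only spaces requests at least a fixed interval apart; it does not bundle pending requests into one, which is a simplification of the suggestion. The class `RequestThrottle` and its method names are hypothetical, and the half second interval is taken from the comment, not from any documented Instagram limit.

```java
// Hypothetical helper: enforces a minimum gap between consecutive requests.
class RequestThrottle {
    private final long minIntervalMillis;
    private long nextAllowedMillis = 0;

    RequestThrottle(long minIntervalMillis) {
        this.minIntervalMillis = minIntervalMillis;
    }

    /**
     * Reserves the next free slot and returns how many milliseconds the
     * caller has to wait before sending, given the current time.
     */
    synchronized long reserve(long nowMillis) {
        long wait = Math.max(0, nextAllowedMillis - nowMillis);
        // The following request may start one interval after this one is sent.
        nextAllowedMillis = nowMillis + wait + minIntervalMillis;
        return wait;
    }

    /** Blocks the calling thread until its reserved slot is reached. */
    void acquire() throws InterruptedException {
        long wait = reserve(System.currentTimeMillis());
        if (wait > 0) {
            Thread.sleep(wait);
        }
    }
}
```

With a shared `RequestThrottle throttle = new RequestThrottle(500);`, calling `throttle.acquire()` immediately before `url.openConnection()` keeps requests at least half a second apart. Logging each call next to it would also show whether the fetch runs more often than intended.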