
I get `org.jsoup.HttpStatusException: HTTP error fetching URL. Status=429` when I parse 900 URLs at once, and the error persists for a while, an hour or more. Is there any solution to this problem? Or a way to detect the error before it happens?

  • Welcome to Stack Overflow! Can you please share a [Minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve) so we can reproduce your issue? – Samuel Philipp Jun 08 '19 at 22:57
  • It seems your target host has a rate limit. Ideally these limits are per IP address: either adhere to the limit or ask the target host to make an exception for you. If they permit it and it is ethical, you may want to try rotating your IP address. – Raj Jun 09 '19 at 08:00

1 Answer


TL;DR

You have been rate limited: HTTP status 429 means "Too Many Requests".


Is there any solution to this problem?

  1. Read the terms and conditions of the site you are scraping to find out:

    1. whether scraping is permitted
    2. if it is permitted, what request rate is acceptable.
  2. If 1.1 is "no", stop trying to scrape the site.

  3. Otherwise, implement your code to stay under the prescribed rate limits. For example, when your scraper reaches the permitted request limit, have it sleep and then resume sending requests in the next metered time period.
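Step 3 can be sketched roughly as follows. This is a minimal illustration using only the JDK, not a definitive implementation: the class `ThrottledFetcher`, its methods, and the `fetch` callback are hypothetical names, and the actual limit (`maxRequests` per `periodMillis`) has to come from the site's terms. It spaces requests evenly instead of sending them all at once.

```java
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical helper: spaces out requests so that no more than
// maxRequests are sent in any periodMillis window.
public class ThrottledFetcher {

    // Minimum gap between two requests, rounded up to whole milliseconds.
    static long delayBetweenRequests(int maxRequests, long periodMillis) {
        if (maxRequests <= 0 || periodMillis < 0) {
            throw new IllegalArgumentException("invalid rate limit");
        }
        return (periodMillis + maxRequests - 1) / maxRequests;
    }

    // Calls fetch for each URL in order, sleeping between calls as needed.
    static void fetchAll(List<String> urls, int maxRequests, long periodMillis,
                         Consumer<String> fetch) throws InterruptedException {
        long delayNanos = TimeUnit.MILLISECONDS.toNanos(
                delayBetweenRequests(maxRequests, periodMillis));
        long nextAllowed = System.nanoTime();
        for (String url : urls) {
            long waitNanos = nextAllowed - System.nanoTime();
            if (waitNanos > 0) {
                TimeUnit.NANOSECONDS.sleep(waitNanos);
            }
            fetch.accept(url);
            // The next request may start one full delay after this one finished.
            nextAllowed = System.nanoTime() + delayNanos;
        }
    }
}
```

With jsoup, the `fetch` callback would wrap something like `Jsoup.connect(url).get()` and handle the `IOException` it can throw. For an assumed limit of 60 requests per minute, `fetchAll(urls, 60, 60_000, fetch)` waits about one second between requests, so 900 URLs would take roughly 15 minutes.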

Or a way to detect the error before it happens?

No. The site most likely won't give you any indication other than the 429 response itself. (But you could check their documentation for published limits.)
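Since the 429 is usually the only signal, a scraper can at least react to it by pausing instead of continuing to send requests. The sketch below is one possible approach, not something the site is guaranteed to support: `BackoffPolicy` and `waitBeforeRetry` are hypothetical names, and it assumes the server may send a `Retry-After` header with a number of seconds (the HTTP specification allows this with a 429 but does not require it). When the header is missing or not a plain number, it falls back to a wait time you choose.

```java
import java.time.Duration;

// Hypothetical helper: decides how long to pause after a response.
public class BackoffPolicy {

    static final int TOO_MANY_REQUESTS = 429;

    // Returns zero for any status other than 429. For a 429, uses the
    // Retry-After value when it is a non-negative number of seconds,
    // otherwise the caller-supplied fallback.
    static Duration waitBeforeRetry(int statusCode, String retryAfterHeader,
                                    Duration fallback) {
        if (statusCode != TOO_MANY_REQUESTS) {
            return Duration.ZERO;
        }
        if (retryAfterHeader != null) {
            try {
                long seconds = Long.parseLong(retryAfterHeader.trim());
                if (seconds >= 0) {
                    return Duration.ofSeconds(seconds);
                }
            } catch (NumberFormatException e) {
                // Header is an HTTP date or malformed: use the fallback below.
            }
        }
        return fallback;
    }
}
```

With jsoup, one way to obtain these inputs is to call `Jsoup.connect(url).ignoreHttpErrors(true).execute()` and read `statusCode()` and `header("Retry-After")` from the returned `Connection.Response`, rather than catching `HttpStatusException`, which carries the status code but not the response headers.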

Stephen C