I am working with the Companies House API to extract several thousand company profiles.
The normal rate limit is 600 requests per 5 minutes, but it can be extended to 1200 requests per 5 minutes. I used this script last week and it ran fine for several hours; now I keep getting a 429 error.
I think the script is fine, but I might be missing something with the decorators from the ratelimit and backoff libraries. Maybe someone more familiar with those libraries can spot a logic error that I am not seeing.
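For reference, here is a quick sketch of the arithmetic behind those limits, assuming the limit is a plain request count per 300-second window. The helper names min_interval and hours_needed are made up for illustration. It shows the average spacing between requests that each limit allows, and roughly how long 200,000 lookups would take at the maximum sustained rate.

```python
FIVE_MINUTES = 300  # seconds, same constant as in the script below


def min_interval(calls, period=FIVE_MINUTES):
    """Average seconds between requests that stays within `calls` per `period`."""
    return period / calls


def hours_needed(n_requests, calls, period=FIVE_MINUTES):
    """Rough wall-clock hours to send n_requests at the maximum sustained rate."""
    return n_requests * min_interval(calls, period) / 3600


for calls in (600, 1200):
    # Prints the limit, the spacing in seconds, and the estimated hours.
    print(calls, min_interval(calls), round(hours_needed(200_000, calls), 1))
```

If the key is actually on the default limit of 600, a client-side limiter configured for 1200 calls would allow up to twice as many requests as the server permits in the same window.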
# api_funcs.py module
import requests
from requests import ConnectionError
from ratelimit import limits, sleep_and_retry
from backoff import on_exception, expo
from pipeline_tools.helpers import get_key

KEY = get_key("API_key")
FIVE_MINUTES = 300  # Number of seconds in five minutes.

@sleep_and_retry  # If we exceed the rate limit imposed by @limits, sleep until we can start again.
@on_exception(expo, ConnectionError, max_tries=5)
@limits(calls=1200, period=FIVE_MINUTES)
def call_api(url, api_key):
    r = requests.get(url, auth=(api_key, ""))
    if not (r.status_code == 200 or r.status_code == 404):
        r.raise_for_status()
    elif r.status_code == 404:
        return {"error": "not found"}
    else:
        return r.json()

def company_basic_search(comp_code):
    return call_api(url=API_BASE_URL + "/company/" + comp_code, api_key=KEY)

# [list of 200,000 company codes]
comp_codes = ['XXX1', 'XXX2', 'XXX3']
for code in comp_codes:
    basic_profile_resource = company_basic_search(comp_code=code)
    # if-elif-else flow inserting the object in error table if 404 or in other table if 200.
I keep getting the following error:
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 429 Client Error: Too Many Requests for url: ...
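Note that the on_exception decorator in the script only retries on ConnectionError, while the traceback shows an HTTPError, so as written a 429 is raised straight out of call_api rather than retried. Below is a minimal sketch of retrying on 429 with exponential backoff. It uses only the standard library, so it does not depend on how ratelimit or backoff behave. The names get_with_retry, fetch, and sleep are hypothetical; fetch stands in for any function that performs the request and returns (status_code, body).

```python
import time


def get_with_retry(fetch, url, max_tries=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying with exponential backoff while it returns 429.

    fetch is a hypothetical callable returning (status_code, body).
    """
    for attempt in range(max_tries):
        status, body = fetch(url)
        if status != 429:
            return status, body
        if attempt < max_tries - 1:
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
    return status, body  # still 429 after max_tries attempts


# Usage with a fake fetch that is throttled twice before succeeding.
responses = iter([(429, None), (429, None), (200, {"ok": True})])
delays = []  # records the waits instead of actually sleeping
result = get_with_retry(lambda url: next(responses), "/company/XXX1", sleep=delays.append)
print(result, delays)
```

With requests, fetch could wrap requests.get and return the status code together with the response. An alternative that stays within backoff would be to pass requests.exceptions.HTTPError to on_exception as well, although that would also retry other 4xx and 5xx errors unless a giveup predicate filters them out.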
Is there anything wrong with my logic, or is this more likely a problem on the API side?