I am using the requests module in Python to check whether an Instagram username is available.

Basically, I insert the username into the URL, like so: https://instagram.com/user, with "user" being a variable. However, as soon as I test it, the status_code attribute of the response gives me HTTP error 429. I know that error 429 occurs after sending too many requests, but I tried again after waiting 48 hours and got the same error after a single request.

I have tried the same code with Twitter and it works perfectly fine (even with 20k+ requests in around 2 hours).

Also, using a VPN did not solve the problem.

Could anyone suggest any help? It would be very much appreciated.

Here is the code:

import requests

# Build the word list from the French dictionary file (one word per line).
with open(r'C:\Users\hugop\OneDrive\python\sm_name_checker\dicos\dico_francais_clean.txt', "r") as f:
    content = f.read().split("\n")

# Names that do not return HTTP 200 are written to this text file.
with open(r'C:\Users\hugop\OneDrive\python\sm_name_checker\available_names.txt', "w") as g:
    for cnt, word in enumerate(content, start=1):  # cnt counts words
        response = requests.get("https://instagram.com/{}".format(word))
        print(cnt, word, response.status_code)
        if response.status_code != 200:  # 200 means the page exists and has been accessed
            g.write("{} {} {}\n".format(cnt, word, response.status_code))

input()  # wait for Enter before exiting
  • Have you tried with a single request? Your IP address might be banned, which most VPN connections also are. – Cow Aug 05 '21 at 13:21
  • @user56700 Yes, I did try a single request but I still had the same error. However, creating a user-agent as neatconda advised fixed the error. Thanks for helping out. – pneumocystosis Aug 05 '21 at 22:30

1 Answer

Instagram doesn't like it when you visit their website without being logged in to an account. When you exceed their quota, they will shadowban you based on a combination of your IP address, device ID, and other factors, and force you to create an account. This ban can last a couple of days.

You could try to avoid this by setting a user-agent.

headers = {
    'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36',
}

r = requests.get('https://instagram.com/{}'.format(word), headers=headers)

Potentially, you could further improve this by adding random time delays and making use of proxies.
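As a rough sketch of the random-delay idea (not tested against Instagram itself): the helper name `check_username`, the delay bounds, the retry count, and the doubling of the wait after each 429 are all my own assumptions. `session` is meant to be a `requests.Session`, and `proxies` is the usual requests-style dict such as `{"https": "http://host:port"}` with your own proxy filled in.

```python
import random
import time

# Same browser-like user-agent as in the snippet above.
HEADERS = {
    "user-agent": (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/56.0.2924.87 Safari/537.36"
    ),
}


def check_username(word, session, proxies=None, max_retries=3,
                   min_delay=2.0, max_delay=6.0, sleep=time.sleep):
    """Return the last HTTP status code seen for https://instagram.com/<word>.

    Hypothetical helper: waits a random delay before every request and
    doubles that delay on each retry after a 429 response.
    """
    status = None
    for attempt in range(max_retries):
        # Random pause so the requests are not evenly spaced;
        # the pause doubles with each retry (attempt 0, 1, 2, ...).
        sleep(random.uniform(min_delay, max_delay) * (2 ** attempt))
        response = session.get(
            "https://instagram.com/{}".format(word),
            headers=HEADERS,
            proxies=proxies,  # None means no proxy is used
            timeout=10,
        )
        status = response.status_code
        if status != 429:
            break  # anything other than "too many requests" is final
    return status
```

Inside the loop from the question, this could be called as `check_username(word, session)` after creating `session = requests.Session()` once before the loop. There is no guarantee this avoids the 429; it only makes the traffic look less like a burst.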

The easiest solution, however, is to create an Instagram account and use this pip package: https://github.com/arc298/instagram-scraper

neatconda