I check whether an Instagram profile exists by calling urlopen('https://www.instagram.com/profile-name'). I get the profile page when it exists and a 404 error when it does not. That flow works perfectly.
But the Instagram request limit is reached quickly. It is per-IP, so I need to change my IP, and for this I tried Tor. That is where it breaks: when I call urlopen() through a Tor connection, I get the Instagram login page regardless of whether the profile exists, so I cannot distinguish existing profiles from non-existing ones. What could cause this behavior, and how can I fix it?
Here is the sample code; run it with python3. The USE_TOR constant switches Tor on and off. To install SOCKS support, run pip3 install requests requests[socks] and pip3 install pysocks in a terminal.
You need to install Tor before using it.
import urllib.request
from urllib.error import HTTPError
import socks
import socket

USE_TOR = True

def createConnection(address, timeout = None, source_address = None):
    sock = socks.socksocket()
    sock.connect(address)
    return sock

def getIp():
    with urllib.request.urlopen("http://httpbin.org/ip") as page:
        return str(page.read()).replace('\n', '')

#
print("Normal IP: " + getIp())

# Set up tor
if USE_TOR:
    socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 9050)
    socket.socket = socks.socksocket
    socket.create_connection = createConnection
    print("Tor IP: " + getIp())

# Request page
try:
    page = urllib.request.urlopen('https://www.instagram.com/a')
    print("Profile exists")
except HTTPError as e:
    print("Profile does not exist. Http error " + str(e.code))
Terminal output:
USE_TOR = True
Normal IP: b'{\n "origin": "my ip"\n}\n'
Tor IP: b'{\n "origin": "158.174.122.199, 158.174.122.199"\n}\n'
Profile exists
USE_TOR = False
Normal IP: b'{\n "origin": "my ip"\n}\n'
Profile does not exist. Http error 404
* "my ip" differs from the Tor one.
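
One thing that may help narrow this down: urlopen() follows redirects by default, so the sample prints "Profile exists" for any response that ends without an HTTP error, including a login page reached through a redirect. Below is a minimal diagnostic sketch that also looks at the final URL. The names classify_response, check_profile, and LOGIN_PATH_PREFIX are hypothetical, and the "/accounts/login" path is an assumption about where the login page lives, not something confirmed by the output above. If the login page is served without a redirect, the URL check alone will not catch it.

```python
import urllib.request
from urllib.error import HTTPError
from urllib.parse import urlsplit

# Assumption: the login page lives under this path; adjust if it differs.
LOGIN_PATH_PREFIX = "/accounts/login"


def classify_response(status, final_url):
    """Map an HTTP status and the post-redirect URL to a profile state."""
    if status == 404:
        return "missing"
    # A response that ended up on the login page says nothing about the profile.
    if urlsplit(final_url).path.startswith(LOGIN_PATH_PREFIX):
        return "login_wall"
    if status == 200:
        return "exists"
    return "unknown"


def check_profile(url):
    """Fetch url and classify the result instead of trusting 'no error'."""
    try:
        with urllib.request.urlopen(url) as page:
            # geturl() returns the URL after any redirects were followed.
            return classify_response(page.status, page.geturl())
    except HTTPError as e:
        return classify_response(e.code, url)
```

Calling check_profile('https://www.instagram.com/a') with and without Tor and printing the result would show whether the Tor case is a redirect to the login page or something else, such as a rate-limit status code.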