
I am trying to resolve a DOI like this:

import requests

url = 'https://dx.doi.org/10.3847/1538-4357/aafd31'
r1 = requests.get(url)
actual_url = r1.url

But the requests.get call actually takes on the order of tens of seconds, up to 5 minutes (it varies)! I tried stream=True and verify=False, but neither really helps.
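For reference, here is a sketch of what I am doing, with a timeout added so a slow resolver fails fast instead of hanging (the 10-second value and the doi_url helper are just mine, not anything standard):

```python
import requests

def doi_url(doi):
    # doi.org is the canonical resolver; dx.doi.org also works
    return 'https://doi.org/' + doi

def resolve_doi(doi, timeout=10):
    # Follow redirects to the article landing page.
    # timeout makes a slow resolver raise requests.exceptions.Timeout
    # instead of blocking for minutes.
    r = requests.get(doi_url(doi), timeout=timeout)
    return r.url

# usage (hits the network):
# resolve_doi('10.3847/1538-4357/aafd31')
```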

John Smith
  • Sounds like an issue with that site or the server it's on – SuperStew Feb 11 '20 at 14:48
  • What do you get when you ping this site? What if you use a proxy? Depending on the site and your findings, it could be that they are slowing you down on purpose – Chrisvdberge Feb 11 '20 at 14:52
  • ^This. Maybe the site doesn't like being scraped? Try changing the user agent sent with the request. – h4z3 Feb 11 '20 at 14:54

3 Answers


Try:

import urllib.request

response = urllib.request.urlopen('https://dx.doi.org/10.3847/1538-4357/aafd31')
html = response.read()  # bytes of the landing page
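If you need the final URL after the redirects rather than the page contents, geturl() should give you that. A sketch (resolve_doi is just a helper name I made up, and the 30-second timeout is arbitrary):

```python
import urllib.request

def resolve_doi(doi, timeout=30):
    # Follow the DOI redirect chain and return the final landing-page URL
    url = 'https://dx.doi.org/' + doi
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.geturl()

# usage (hits the network):
# resolve_doi('10.3847/1538-4357/aafd31')
```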
GSBYBF

It seems they are slowing you down on purpose. Try setting a valid user agent. The code below runs quickly for me:

import requests
url = 'https://dx.doi.org/10.3847/1538-4357/aafd31'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'
}

req = requests.get(url, headers=headers)

print(req.text)

If you are making multiple requests, make sure to throttle them, and possibly rotate between several user agents at random.
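Something like this, for example (polite_get, the user-agent list, and the one-second delay are just illustrative choices, not anything the site documents):

```python
import random
import time
import requests

# a small pool of realistic desktop user agents to rotate through
USER_AGENTS = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 '
    '(KHTML, like Gecko) Version/13.1 Safari/605.1.15',
]

def polite_get(url, min_delay=1.0):
    # pause between requests and pick a user agent at random,
    # so the server is less likely to throttle you
    time.sleep(min_delay)
    headers = {'User-Agent': random.choice(USER_AGENTS)}
    return requests.get(url, headers=headers, timeout=30)

# usage (hits the network):
# r = polite_get('https://dx.doi.org/10.3847/1538-4357/aafd31')
```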

Chrisvdberge

I had the same problem. My solution was to create a new environment with a more recent Python version.