
I am trying to access an API (Scopus) through Python, downloading multiple abstracts in the for loop below:

for t in eid:
    url = "http://api.elsevier.com/content/abstract/eid/"+str(t)+"?view=FULL"
    #    url = "http://api.elsevier.com/content/abstract/eid/2-s2.0-84934272190?view=FULL"
    resp2 = requests.get(url,
                         headers={'Accept':'application/json',
                         'X-ELS-APIKey': MYAPIKEY})

    retrieval = resp2.json()

    dep = retrieval['abstracts-retrieval-response']['item']['bibrecord']['head']['author-group']
    sub = retrieval['abstracts-retrieval-response']['subject-areas']['subject-area']
    iD = retrieval['abstracts-retrieval-response']['coredata']['intid']
    date = retrieval['abstracts-retrieval-response']['coredata']['prism:coverDate']

    department.append(dep)
    subj.append(sub)
    ident.append(iD)
    dates.append(date)

However, I keep getting errors like the one below, each time at a different point in the for loop. I've been told that error handling is a way around this, but being new to Python I have no idea what that is. Can anyone help? Thanks

EDIT: Here is the whole error message which should include all of the correct information (sorry that it is long)

Traceback (most recent call last):
  File "C:\Users\User\Anaconda3\lib\site-packages\requests\packages\urllib3\connection.py", line 142, in _new_conn
    (self.host, self.port), self.timeout, **extra_kw)
  File "C:\Users\User\Anaconda3\lib\site-packages\requests\packages\urllib3\util\connection.py", line 91, in create_connection
    raise err
  File "C:\Users\User\Anaconda3\lib\site-packages\requests\packages\urllib3\util\connection.py", line 81, in create_connection
    sock.connect(sa)
ConnectionAbortedError: [WinError 10053] An established connection was aborted by the software in your host machine

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\User\Anaconda3\lib\site-packages\requests\packages\urllib3\connectionpool.py", line 578, in urlopen
    chunked=chunked)
  File "C:\Users\User\Anaconda3\lib\site-packages\requests\packages\urllib3\connectionpool.py", line 362, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "C:\Users\User\Anaconda3\lib\http\client.py", line 1106, in request
    self._send_request(method, url, body, headers)
  File "C:\Users\User\Anaconda3\lib\http\client.py", line 1151, in _send_request
    self.endheaders(body)
  File "C:\Users\User\Anaconda3\lib\http\client.py", line 1102, in endheaders
    self._send_output(message_body)
  File "C:\Users\User\Anaconda3\lib\http\client.py", line 934, in _send_output
    self.send(msg)
  File "C:\Users\User\Anaconda3\lib\http\client.py", line 877, in send
    self.connect()
  File "C:\Users\User\Anaconda3\lib\site-packages\requests\packages\urllib3\connection.py", line 167, in connect
    conn = self._new_conn()
  File "C:\Users\User\Anaconda3\lib\site-packages\requests\packages\urllib3\connection.py", line 151, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
requests.packages.urllib3.exceptions.NewConnectionError: <requests.packages.urllib3.connection.HTTPConnection object at 0x000002058C7E1C18>: Failed to establish a new connection: [WinError 10053] An established connection was aborted by the software in your host machine

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\User\Anaconda3\lib\site-packages\requests\adapters.py", line 403, in send
    timeout=timeout
  File "C:\Users\User\Anaconda3\lib\site-packages\requests\packages\urllib3\connectionpool.py", line 623, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "C:\Users\User\Anaconda3\lib\site-packages\requests\packages\urllib3\util\retry.py", line 281, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
requests.packages.urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='api.elsevier.com', port=80): Max retries exceeded with url: /content/abstract/eid/2-s2.0-84978766692?view=FULL (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x000002058C7E1C18>: Failed to establish a new connection: [WinError 10053] An established connection was aborted by the software in your host machine',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 5, in <module>
  File "C:\Users\User\Anaconda3\lib\site-packages\requests\api.py", line 71, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Users\User\Anaconda3\lib\site-packages\requests\api.py", line 57, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\User\Anaconda3\lib\site-packages\requests\sessions.py", line 475, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\User\Anaconda3\lib\site-packages\requests\sessions.py", line 585, in send
    r = adapter.send(request, **kwargs)
  File "C:\Users\User\Anaconda3\lib\site-packages\requests\adapters.py", line 467, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='api.elsevier.com', port=80): Max retries exceeded with url: /content/abstract/eid/2-s2.0-84978766692?view=FULL (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x000002058C7E1C18>: Failed to establish a new connection: [WinError 10053] An established connection was aborted by the software in your host machine',))
R Thompson

1 Answer

Unfortunately you haven't included 'the above exception' that is mentioned in your output.

But in general, when an exceptional situation, such as an error, occurs while a piece of code is running, you can catch that exception and deal with it. An exception is an object that carries information about what went wrong. Exception handling is a big subject; a good place to start reading is: https://docs.python.org/3/tutorial/errors.html
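As a minimal illustration of catching an exception and inspecting it, here is a sketch that uses only the standard library. The function `fetch` is a hypothetical stand-in for a network call and simply raises the same kind of error that appears in the traceback; it is not part of requests or the Scopus API.

```python
def fetch(eid):
    """Hypothetical stand-in for a network call; it always fails here."""
    raise ConnectionAbortedError("connection was aborted")


def try_fetch(eid):
    try:
        return fetch(eid)
    except ConnectionError as exc:
        # ConnectionAbortedError is a subclass of the built-in ConnectionError,
        # so it is caught here. The exception object carries the details.
        return "failed for {}: {} ({})".format(eid, type(exc).__name__, exc)


message = try_fetch("2-s2.0-84934272190")
print(message)
```

Catching a specific class such as `ConnectionError` rather than every `Exception` keeps unrelated bugs (a typo in a key name, for example) visible instead of silently swallowing them.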

The simplest way to deal with an exception is to report the circumstances and the exception itself. This alone may give you some insight into what went wrong:

for t in eid:
    # Build the URL outside the 'try' so it is always defined in the 'except' branch
    url = "http://api.elsevier.com/content/abstract/eid/"+str(t)+"?view=FULL"
    #    url = "http://api.elsevier.com/content/abstract/eid/2-s2.0-84934272190?view=FULL"
    try:
        resp2 = requests.get(url,
                             headers={'Accept':'application/json',
                                      'X-ELS-APIKey': MYAPIKEY})

        retrieval = resp2.json()

        dep = retrieval['abstracts-retrieval-response']['item']['bibrecord']['head']['author-group']
        sub = retrieval['abstracts-retrieval-response']['subject-areas']['subject-area']
        iD = retrieval['abstracts-retrieval-response']['coredata']['intid']
        date = retrieval['abstracts-retrieval-response']['coredata']['prism:coverDate']

        department.append(dep)
        subj.append(sub)
        ident.append(iD)
        dates.append(date)
    except Exception as exception:
        print(url) # 'print' will only work on a console
        print(exception)

[EDIT]

I've taken a look at the error message: the connection was aborted (WinError 10053). The message attributes this to software on your own machine, which can point to a timeout, a firewall or antivirus program, or the server dropping the connection. See also "Why is host aborting connection?", although the cause there may be completely different. Try the code above to find out whether this happens all the time or only with certain URLs.

[/EDIT]
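To check whether the failures cluster on particular URLs, it can help to record which identifiers failed instead of only printing them. The following is a sketch under stated assumptions: `collect` and `fake_fetch` are hypothetical names, and `fake_fetch` merely simulates one failing identifier so the example runs without network access.

```python
def collect(eids, fetch):
    """Call fetch(eid) for every eid; return (results, failures) as dicts."""
    results, failures = {}, {}
    for eid in eids:
        try:
            results[eid] = fetch(eid)
        except Exception as exc:
            # Remember which identifier failed and with what kind of error
            failures[eid] = type(exc).__name__
    return results, failures


def fake_fetch(eid):
    """Hypothetical stand-in for the real request; fails for one identifier."""
    if eid == "bad-id":
        raise ConnectionAbortedError("connection was aborted")
    return {"eid": eid}


results, failures = collect(["id-1", "bad-id", "id-2"], fake_fetch)
print(sorted(results))
print(failures)
```

If the same identifiers fail on every run, the problem is probably tied to those records; if the failing set changes from run to run, a connection or rate problem is more likely.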

To build in some retries, use:

import time
import requests

nrOfTries = 10

for t in eid:
    # Build the URL once, outside the retry loop, so it is always defined below
    url = "http://api.elsevier.com/content/abstract/eid/"+str(t)+"?view=FULL"
    #    url = "http://api.elsevier.com/content/abstract/eid/2-s2.0-84934272190?view=FULL"
    for count in range(nrOfTries):
        try:
            resp2 = requests.get(url,
                                 headers={'Accept':'application/json',
                                          'X-ELS-APIKey': MYAPIKEY})

            retrieval = resp2.json()

            dep = retrieval['abstracts-retrieval-response']['item']['bibrecord']['head']['author-group']
            sub = retrieval['abstracts-retrieval-response']['subject-areas']['subject-area']
            iD = retrieval['abstracts-retrieval-response']['coredata']['intid']
            date = retrieval['abstracts-retrieval-response']['coredata']['prism:coverDate']

            department.append(dep)
            subj.append(sub)
            ident.append(iD)
            dates.append(date)

            break   # Success: skip the 'else' below

        except Exception as exception:
            print('Problem accessing: {}'.format(url))
            print(exception)
            time.sleep(2)  # Seconds
    else:   # Runs only if the inner for-loop was exhausted without a 'break'
        print('Gave up accessing: {}'.format(url))

N.B. I haven't tested this, but it should convey the general idea. The 'sleep' is to allow the server to catch its breath...
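The same retry idea can be factored into a small reusable helper. This is only a sketch, not a tested replacement for the loop above: the name `retry` and its parameters are hypothetical, the wait doubles after each failure (a swapped-in refinement of the fixed 2-second sleep), and `flaky` simulates a call that fails twice before succeeding so the example runs without network access.

```python
import time


def retry(func, tries=5, delay=2.0, backoff=2.0,
          exceptions=(Exception,), sleep=time.sleep):
    """Call func() until it succeeds or `tries` attempts have failed."""
    for attempt in range(1, tries + 1):
        try:
            return func()
        except exceptions:
            if attempt == tries:
                raise  # Out of attempts: let the caller see the last error
            sleep(delay)      # Give the server time to recover
            delay *= backoff  # Wait longer before the next attempt


attempts = []
waits = []


def flaky():
    """Hypothetical call that fails twice, then succeeds."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("temporary failure")
    return "ok"


# Passing waits.append as `sleep` records the delays instead of really sleeping
result = retry(flaky, tries=5, delay=2.0,
               exceptions=(ConnectionError,), sleep=waits.append)
print(result, len(attempts), waits)
```

With requests, the callable could be something like `lambda: requests.get(url, headers=headers, timeout=10)`, with `exceptions=(requests.exceptions.ConnectionError,)`; the 10-second timeout is an assumed value, chosen so that a stalled connection fails promptly rather than hanging.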

Jacques de Hooge