I am trying to download full-text PDFs from the Elsevier API. I can download the whole paper as XML, JSON, and plain text, so the API key is working. However, I cannot download the full text as a PDF: when I change the header to accept PDF files, only the first page of the article is written.

I tried many different DOIs, but all of them return only the first page of the article.

This is the request I am using to access the paper:

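import requests
headers = {'X-ELS-APIKey': 'YOUR_API_KEY'}  # assumed placeholder; the question does not show how headers was built
r = requests.get('http://api.elsevier.com/content/article/doi/10.1016/0038-1098(87)90044-5?httpAccept=application/pdf', headers=headers)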

I write the response to a file with the following code:

with open('test.pdf','wb') as f:
    f.write(r.content)

There is no error, but test.pdf contains only the first page of the article.

ozshikh
  • Where is the documentation for this request? Does it say anything about PDF? – furas Oct 23 '19 at 02:35
  • Here is a useful link about requests: https://realpython.com/python-requests/ I am able to download PDF articles from other journals using this command, but somehow this API is not giving the expected result. – ozshikh Oct 23 '19 at 03:04
  • I'm asking about the API documentation for the request/URL you use to get the data, `http://api.elsevier.com/...`. Where is the documentation for `http://api.elsevier.com/...`? Does it say anything about PDF? – furas Oct 23 '19 at 03:07
  • Ohh! Sorry for the confusion. Here is the documentation of the API. https://dev.elsevier.com/documentation/ArticleRetrievalAPI.wadl There is information about the PDF in the link – ozshikh Oct 23 '19 at 14:05
  • In the documentation I found this text several times, only for the PDF format: `For PDF documents setting this flag to true will result in being redirected to the Author Manuscript version of the resource whenever the requestor is NOT entitled to the full content of the PDF.` So there are some restrictions for PDF. You should ask the API admins. – furas Oct 23 '19 at 14:30
  • Ok. I will mail the API admins. Thanks for your help – ozshikh Oct 23 '19 at 16:32
  • There is also [interactive documentation](https://dev.elsevier.com/retrieval.html#!/Article_Retrieval/ArticleRetrieval), where you can test the API with different arguments. You can also get the `curl` command for a tested request and use https://curl.trillworks.com/ to convert it to Python's `requests`. – furas Oct 23 '19 at 17:28

1 Answer

By default, the article retrieval API (https://dev.elsevier.com/documentation/ArticleRetrievalAPI.wadl) allows full-text retrieval of articles in XML or JSON format, not in PDF format (except for open access content, where full text is available in all formats). For non-OA content, only the first page of the PDF is available by default.
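Since full text is available as XML or JSON, one workaround is to request one of those formats instead of PDF. The following is only a rough sketch, not an official client. The helper names `build_article_request` and `save_article` are hypothetical, and the `X-ELS-APIKey` header and the MIME types are assumptions based on my reading of the Elsevier documentation, so check them against the WADL linked above.

```python
from urllib.parse import quote

# Assumed base URL for DOI lookups (the question uses the http:// variant).
BASE_URL = "https://api.elsevier.com/content/article/doi/"


def build_article_request(doi, api_key, mime_type="text/xml"):
    """Return (url, headers) for fetching one article in the given format.

    Hypothetical helper: it only builds the request and sends nothing.
    """
    # Keep "/" and parentheses in the DOI as-is; escape everything else.
    url = BASE_URL + quote(doi, safe="/()")
    # Assumption: the key goes in X-ELS-APIKey and the format in Accept.
    headers = {"X-ELS-APIKey": api_key, "Accept": mime_type}
    return url, headers


def save_article(doi, api_key, path, mime_type="text/xml"):
    """Download one article and write the raw response body to path."""
    import requests  # third-party; imported here so the builder works without it

    url, headers = build_article_request(doi, api_key, mime_type)
    response = requests.get(url, headers=headers, timeout=30)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    with open(path, "wb") as f:
        f.write(response.content)
```

For example, `save_article('10.1016/0038-1098(87)90044-5', 'YOUR_API_KEY', 'test.xml')` should store the XML full text. Passing `mime_type='application/pdf'` would still return only the first page for non-OA content unless your account is entitled to the full PDF.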