
I used an advanced search on Scopus to narrow articles down to a specific topic; the search returns 24,609 documents. I'm hoping to download all of the articles as XML files and then use the 'tm' R package for text mining to further narrow down the number of papers.

I'm running into issues trying to download the XML files using the Scopus API in R. Ideally, I would like to download all 24,609 XML files from my search using the rscopus package via the Scopus API. Here is the code I've used to attempt to download a single article:

library(rscopus)

api_key = get_api_key(NULL, error = FALSE)

if (!is.null(api_key)) {
  # Retrieve the full-text record for a single article by EID
  x = article_retrieval("2-s2.0-50949114517", identifier = "eid",
                        verbose = FALSE, view = "FULL")
  gen = x$content$`full-text-retrieval-response`
  ot = gen$originalText
} else {
  # No key found in the environment; suppress the missing-key error
  x = article_retrieval("2-s2.0-50949114517",
                        identifier = "eid",
                        api_key_error = FALSE)
}

This returns a "resource not found" error. I've also tried using the DOI as the identifier, and that fails as well.

While this code targets only a single article, is there a way to use the rscopus package to download all articles from a single search? I'm a bit lost on how to do that with the package. I am able to download the citation info for all of the articles as CSV files, which include columns for EID and DOI, so it may be possible to apply an article retrieval function over one of those columns (see the sketch below).
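Here is a minimal sketch of that idea, not a tested pipeline. It assumes the CSV export has been read into a data frame with an "EID" column (the file name below is a placeholder), that the API key is entitled to full-text retrieval, and that article_retrieval returns rscopus's usual list containing a get_statement response object alongside the parsed content; anything the API refuses is skipped:

library(rscopus)

# Sketch only: "scopus_export.csv" and the "EID" column name are
# assumptions about the CSV export described above
refs <- read.csv("scopus_export.csv", stringsAsFactors = FALSE)
dir.create("xml_out", showWarnings = FALSE)

for (eid in refs$EID) {
  res <- tryCatch(
    article_retrieval(eid, identifier = "eid", view = "FULL",
                      verbose = FALSE),
    error = function(e) NULL
  )
  # Skip articles the API refuses (e.g. "resource not found")
  if (is.null(res) || httr::status_code(res$get_statement) != 200) {
    message("Skipping ", eid)
    next
  }
  # Write the raw response body to disk, one XML file per article
  writeLines(httr::content(res$get_statement, as = "text", encoding = "UTF-8"),
             file.path("xml_out", paste0(eid, ".xml")))
  Sys.sleep(1)  # stay within the API's rate limits
}

Note that the Article Retrieval API only serves Elsevier-hosted (ScienceDirect) content, so a "resource not found" response for a Scopus EID often just means the article comes from another publisher.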

Using R version 3.5.1, Mac OS X 10.13.6


1 Answer


There is a script on GitHub, https://github.com/ElsevierDev/get_sd_oa, that identifies all open-access articles in ScienceDirect and stores their URIs in a text file.

That script contains some logic to loop through ISSNs. You might be able to adapt it to suit your needs; a rough R sketch of the same idea follows.
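If you would rather stay in R than adapt the Python script, here is a rough sketch of that approach using rscopus. The ISSN list and output file name are placeholders, and the OPENACCESS(1) search field and the Article Retrieval URI format are assumptions about the Scopus Search API; check the Elsevier API documentation before relying on them:

library(rscopus)

issns <- c("0022-247X", "1053-8119")  # hypothetical ISSN list

for (issn in issns) {
  # Search for open-access articles published under this ISSN
  res <- scopus_search(query = sprintf("ISSN(%s) AND OPENACCESS(1)", issn),
                       count = 25, max_count = 200, verbose = FALSE)
  if (length(res$entries) == 0) next
  df <- gen_entries_to_df(res$entries)$df
  if ("prism:doi" %in% names(df)) {
    # Append one Article Retrieval URI per DOI to a running text file
    write(paste0("https://api.elsevier.com/content/article/doi/",
                 df$`prism:doi`),
          file = "oa_uris.txt", append = TRUE)
  }
}

Each line of oa_uris.txt could then be fetched with an article-retrieval loop like the one sketched in the question.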