
How can I simulate clicking the download button on the website below from an R session and download the TSV table?

https://comptox.epa.gov/dashboard/chemical_lists

I know there used to be RSelenium and PhantomJS, neither of which seems to be up to date anymore, and apparently there is also V8. However, I can't really wrap my head around using the latter.

andschar

1 Answer


The site gets its data from a GraphQL API call:

POST https://comptox.epa.gov/dashboard/graphql

When you click the download button, the site sends that same data to another API, which formats it and returns the TSV.

Instead, you can fetch the data from the GraphQL API directly and write it to a TSV file yourself:

library(httr)

query <- "{
    lists { 
        label 
        abbreviation 
        short_description 
        chemical_count 
        updated_at
    }
}"

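# Send the GraphQL query as a JSON body
r <- POST("https://comptox.epa.gov/dashboard/graphql",
    content_type("application/json"),
    body = list(
        query = query
    ), encode = "json")

# Stop early on an HTTP error instead of parsing an error page
stop_for_status(r)

# Parse the JSON response into nested R lists
data <- content(r, "parsed")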

# Add a link column with the URL of each list's page
for (i in seq_along(data$data$lists)) {
  data$data$lists[[i]]$acronym <- paste("https://comptox.epa.gov/dashboard/chemical_lists", data$data$lists[[i]]$abbreviation, sep = "/")
}

# Convert the list of records to a data frame
df <- do.call(rbind.data.frame, data$data$lists)

# Write a tab-separated file without row names or quoting
write.table(df, file = "chemical.tsv", row.names = FALSE, sep = "\t", quote = FALSE)

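One thing to watch when converting the parsed response: JSON `null` values come back as `NULL` in R, and a record with a `NULL` field can break `rbind.data.frame`. I have not checked whether this API actually returns nulls, so the following is only a defensive sketch. The records are mocked (the values are made up, not real API output) so it runs offline, and `record_to_row` is a hypothetical helper name.

```r
# Mocked stand-in for data$data$lists; values are invented for illustration.
lists <- list(
  list(label = "List A", abbreviation = "LISTA",
       short_description = "First example", chemical_count = 10L,
       updated_at = "2020-01-01"),
  list(label = "List B", abbreviation = "LISTB",
       short_description = NULL, chemical_count = 5L,
       updated_at = "2020-02-01")
)

# Field names requested in the GraphQL query
fields <- c("label", "abbreviation", "short_description",
            "chemical_count", "updated_at")

# Hypothetical helper: one record -> one-row data frame, NULL mapped to NA
record_to_row <- function(rec) {
  vals <- lapply(fields, function(f) if (is.null(rec[[f]])) NA else rec[[f]])
  names(vals) <- fields
  as.data.frame(vals, stringsAsFactors = FALSE)
}

# Stack the one-row data frames
df <- do.call(rbind, lapply(lists, record_to_row))

# Add the link column, as in the code above
df$acronym <- paste("https://comptox.epa.gov/dashboard/chemical_lists",
                    df$abbreviation, sep = "/")

# Write to a temporary TSV file
out_file <- tempfile(fileext = ".tsv")
write.table(df, file = out_file, row.names = FALSE, sep = "\t", quote = FALSE)
```

With real data, replace the mocked `lists` with `data$data$lists` from the request above.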

Bertrand Martel
  • Thanks for pointing me to the API! I hadn't noticed it at all, and it doesn't seem to be documented. How did you find it? How can I find out which fields it has? – andschar Nov 01 '20 at 16:03
  • @andschar you can find it by looking at the Network tab in Chrome's developer tools. – Bertrand Martel Nov 01 '20 at 17:07
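
On the question of which fields the API has: GraphQL defines a standard introspection query (`__schema`) that lists a server's types and their fields. Whether this particular endpoint has introspection enabled is an assumption I have not verified, so treat the following as a sketch. `fetch_schema` is a hypothetical helper name; it is only defined here, not called, because calling it needs network access and the httr package.

```r
library(jsonlite)

# Standard GraphQL introspection query: every type with its field names
introspection_query <- "{
    __schema {
        types {
            name
            fields { name }
        }
    }
}"

# The JSON body that would be sent, in the same shape as the data query
body <- toJSON(list(query = introspection_query), auto_unbox = TRUE)

# Hypothetical helper: POST the introspection query and parse the reply.
# Assumes the endpoint allows introspection, which may not be the case.
fetch_schema <- function(url = "https://comptox.epa.gov/dashboard/graphql") {
  r <- httr::POST(url,
                  httr::content_type("application/json"),
                  body = list(query = introspection_query),
                  encode = "json")
  httr::stop_for_status(r)
  httr::content(r, "parsed")
}
```

If the server allows it, `str(fetch_schema(), max.level = 4)` gives an overview of the returned types. If introspection is disabled, watching the requests in the Network tab remains the fallback.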