
I am trying to get the results of a form using R. The method below used to work with the previous URL: https://ec.europa.eu/taxation_customs/vies/viesquer.do

Here is the code, using VAT number FR23489967794 as an example.

library(rvest)
library(httr)

headers = c(
  "User-Agent" = "Safari/537.36",
  "Accept" = "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9"
)

params = list(
  "ms" = "FR",
  "iso" = "FR",
  "vat" = "23489967794",
  "name" = "",
  "companyType" = "",
  "street1" = "",
  "postcode" = "",
  "city" = "",
  "requesterMs" = "FR",
  "requesterIso" = "FR",
  "requesterVat" = "23489967794",
  "BtnSubmitVat" = "Verify"
)

r <- httr::GET(url = "https://ec.europa.eu/taxation_customs/vies/viesquer.do", httr::add_headers(.headers=headers), query = params)
r |> content() |> html_element('.validStyle') |> html_text()

However, now that the URL has changed to https://ec.europa.eu/taxation_customs/vies/#/vat-validation, I cannot get this to work: the response no longer contains a .validStyle element. Any help is much appreciated.

gaut
  • I don't know if it's possible with `httr`, `httr2` or `rvest` but I'm almost sure you can do this with [`RSelenium`](https://docs.ropensci.org/RSelenium/index.html) – bretauv Aug 29 '22 at 12:26
  • It does appear the current site now uses JavaScript to perform the request. The `rvest` and `httr` packages cannot execute JavaScript. You'll either need to reverse engineer the site to find where it's pulling the data from now, or use RSelenium, which can run JavaScript for you. – MrFlick Aug 29 '22 at 15:17

1 Answer


I might be misunderstanding, but can you not just replicate the call to their internal API to request this data? Open the network tab of your browser's developer tools, submit the form on the website, and look for requests that return JSON.

This would give you the following url: https://ec.europa.eu/taxation_customs/vies/rest-api/ms/FR/vat/23489967794?requesterMemberStateCode=FR&requesterNumber=23489967794

I would not recommend using Selenium for this, because the overhead of driving a browser is unnecessary when a plain HTTP request returns the data.

In R you can then execute:

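# Call the JSON endpoint directly and parse the response into an R list.
# The native pipe |> (R >= 4.1) avoids having to load magrittr for %>%.
httr::GET("https://ec.europa.eu/taxation_customs/vies/rest-api/ms/FR/vat/23489967794?requesterMemberStateCode=FR&requesterNumber=23489967794") |>
  httr::content(as = "text", encoding = "UTF-8") |>
  jsonlite::fromJSON()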

To look up any other VAT number, replace the country code and number in the GET request URL.
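
If you need to check several numbers, the URL can be built from its parts. The following is only a minimal sketch: the function names `vies_url` and `check_vat` are made up for illustration, and it assumes the endpoint keeps the same path and query parameters as the URL above. It is an internal API, so it may change without notice.

```r
library(httr)
library(jsonlite)

# Hypothetical helper: build the request URL from a member state code
# and a VAT number. By default the requester is the same as the number
# being checked, as in the URL shown above.
vies_url <- function(country, number,
                     requester_country = country,
                     requester_number = number) {
  sprintf(
    "https://ec.europa.eu/taxation_customs/vies/rest-api/ms/%s/vat/%s?requesterMemberStateCode=%s&requesterNumber=%s",
    country, number, requester_country, requester_number
  )
}

# Hypothetical helper: perform the request and parse the JSON body.
# stop_for_status() raises an R error on HTTP 4xx/5xx responses.
check_vat <- function(country, number, ...) {
  resp <- httr::GET(vies_url(country, number, ...))
  httr::stop_for_status(resp)
  jsonlite::fromJSON(httr::content(resp, as = "text", encoding = "UTF-8"))
}
```

Calling `str(check_vat("FR", "23489967794"))` should then show whatever fields the service returns; inspect the structure yourself rather than relying on particular field names.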

Datapumpernickel