I am trying to scrape the CSV that is generated by clicking the 'CSV' button on this site (located below the left side of the graph). The problem is that the CSV is generated by JavaScript that parses the embedded table (below the middle of the graph). I am aware that I could just scrape the embedded table, but I do not want to do this.

I want a solution that uses R to download the CSV via the 'CSV' button. My attempt so far uses the R packages "rvest" and "V8". I'm unsure how to proceed, as I can't find good examples of the V8 package being used on JavaScript download buttons. Here's what I have so far.

I'm confused from the line ct <- v8() onwards. How do I apply the V8 package in the context of the JavaScript in the source code of the above URL?

library(httr)   # provides GET() and content()
library(rvest)
library(V8)

URL <- "https://www.bankofengland.co.uk/boeapps/database/fromshowcolumns.asp?Travel=NIxSTxTIxSUx&FromSeries=1&ToSeries=50&DAT=RNG&FD=1&FM=Jan&FY=2009&TD=30&TM=Mar&TY=2020&FNY=&CSVF=TT&html.x=91&html.y=29&C=IIN&Filter=N#"

# Extract the text of every <script> element on the page
raw_html <- GET(URL) %>%
  content() %>%
  rvest::html_nodes("script") %>%
  html_text()

ct <- v8()

# This is the part I am unsure about
read_html(ct$eval(gsub('document.ready', '', raw_html))) %>%
  html_text()

For context, the JavaScript for the buttons is shown below:

<script>
$(document).ready(function() {
    $.fn.dataTable.moment( 'DD MMM YY' );

    $('#stats-table').DataTable( {
        paging: false,
        "info": false,
        "order": [ 0, 'desc' ],
        fixedHeader: {
            header: true
        },
        dom: 'Bfrtip',
        buttons: [
            'copy', 'csv', 'excel', 'print'
        ]
    } );
} );
</script>
StarScream2010

1 Answer

An RSelenium solution:

library(RSelenium)

# Start a Selenium server and open a Firefox session
driver <- rsDriver(browser = "firefox")
remDr <- driver[["client"]]

# Navigate to the website
remDr$navigate("https://www.bankofengland.co.uk/boeapps/database/fromshowcolumns.asp?Travel=NIxSTxTIxSUx&FromSeries=1&ToSeries=50&DAT=RNG&FD=1&FM=Jan&FY=2009&TD=30&TM=Mar&TY=2020&FNY=&CSVF=TT&html.x=91&html.y=29&C=IIN&Filter=N#")

# Click the 'CSV' button to download the file
button_element <- remDr$findElement(using = 'xpath', value = '//*[@id="stats-table_wrapper"]/div[1]/a[2]')
button_element$clickElement()
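
A real browser is likely needed here because V8 is only a JavaScript engine with no DOM, so the jQuery and DataTables code on the page has nothing to run against. If you also want the file to land in a known folder and then be read into R, something along these lines may work. It is a rough sketch, not a tested recipe: the folder name `boe_downloads`, the MIME type passed to `neverAsk.saveToDisk`, and the 10-second wait are all assumptions, and `makeFirefoxProfile()` needs a working `zip` utility on the system.

```r
library(RSelenium)

# Assumption: an illustrative folder that will receive the downloaded CSV
download_dir <- file.path(getwd(), "boe_downloads")
dir.create(download_dir, showWarnings = FALSE)
download_dir <- normalizePath(download_dir, mustWork = TRUE)

# Firefox preferences that send downloads to download_dir without a prompt
fprof <- makeFirefoxProfile(list(
  browser.download.folderList = 2L,                    # 2 = use a custom directory
  browser.download.dir = download_dir,
  browser.helperApps.neverAsk.saveToDisk = "text/csv"  # assumed MIME type
))

driver <- rsDriver(browser = "firefox", extraCapabilities = fprof)
remDr <- driver[["client"]]

remDr$navigate("https://www.bankofengland.co.uk/boeapps/database/fromshowcolumns.asp?Travel=NIxSTxTIxSUx&FromSeries=1&ToSeries=50&DAT=RNG&FD=1&FM=Jan&FY=2009&TD=30&TM=Mar&TY=2020&FNY=&CSVF=TT&html.x=91&html.y=29&C=IIN&Filter=N#")

# Same XPath as above: the second button in the DataTables button row
button_element <- remDr$findElement(using = "xpath", value = '//*[@id="stats-table_wrapper"]/div[1]/a[2]')
button_element$clickElement()

# Poll for up to about 10 seconds until a .csv file appears in the folder
csv_files <- character(0)
for (i in seq_len(20)) {
  csv_files <- list.files(download_dir, pattern = "\\.csv$", full.names = TRUE)
  if (length(csv_files) > 0) break
  Sys.sleep(0.5)
}

# Read the first CSV found, if any
if (length(csv_files) > 0) {
  boe_data <- read.csv(csv_files[1])
}

# Close the browser and stop the Selenium server
remDr$close()
driver[["server"]]$stop()
```

Polling the folder rather than sleeping for a fixed time keeps the script from reading a file that has not finished arriving, though a slow connection may still need a longer wait.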
Nad Pat