R: webscraping a table

Question

I am trying to webscrape a Live Exchange Rates webpage. I tried:

library(XML)
webpage <- "http://liveindex.org/"

tables <- readHTMLTable(webpage)
n.rows <- unlist(lapply(tables, function(t) dim(t)[1]))

But I get an error message.

Thank you for any help.

Hi, I am not an expert, but it does seem that the page is not XML. For comparison, you could look at the [ECB](https://www.ecb.europa.eu/stats/eurofxref/eurofxref-daily.xml?93aad09b8f8b7bdb69cd1574b5b2665f) website, which is XML. If you are interested, I could share code showing how to source rates from there. On the topic of exchange rates, I recommend [this](http://stackoverflow.com/questions/26694042/how-to-get-currency-exchange-rates-in-r) question. — An economist, May 09 '16 at 09:24
I don't know if you can do this for tick-by-tick data, but here is something you can start with: `reviews <- link %>% read_html() %>% html_nodes("#menu_content .inline_rates_container")`. I'm getting an NA if I try to extract the value. — Chirayu Chamoli, May 09 '16 at 10:02
_"You may not use any computerised or automatic mechanism, including without limitation, any Web scraper, spider or robot, to access, extract and/or download any information, including without limitation, any currency exchange data, from the Web Site or the Tools"_ — hrbrmstr, May 09 '16 at 10:59
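As a follow-up to the ECB suggestion in the comments, here is a minimal sketch of how rates could be read from XML laid out like the ECB daily feed, using the same XML package as the question. The inline sample only mimics the structure of that feed: the date and the rates in it are placeholder values, not real data, and the names `ecb_xml`, `rates` and `ref_date` are made up for this example.

```r
library(XML)

# Inline sample in the layout of the ECB daily reference-rate feed.
# ASSUMPTION: the date and rates below are placeholders, not real data.
ecb_xml <- '<?xml version="1.0" encoding="UTF-8"?>
<gesmes:Envelope xmlns:gesmes="http://www.gesmes.org/xml/2002-08-01"
                 xmlns="http://www.ecb.int/vocabulary/2002-08-01/eurofxref">
  <Cube>
    <Cube time="2016-05-09">
      <Cube currency="USD" rate="1.10"/>
      <Cube currency="JPY" rate="120.00"/>
      <Cube currency="GBP" rate="0.80"/>
    </Cube>
  </Cube>
</gesmes:Envelope>'

# Parse the string as XML
doc <- xmlParse(ecb_xml, asText = TRUE)

# Select every element that carries a currency attribute; matching on the
# attribute avoids having to spell out the default namespace in the XPath
nodes <- getNodeSet(doc, "//*[@currency]")

rates <- data.frame(
  currency = sapply(nodes, xmlGetAttr, "currency"),
  rate     = as.numeric(sapply(nodes, xmlGetAttr, "rate")),
  stringsAsFactors = FALSE
)

# The reference date sits on the enclosing element
ref_date <- xpathSApply(doc, "//*[@time]", xmlGetAttr, "time")

print(ref_date)
print(rates)
```

For the live feed, one option would be to save the file locally first (for example with `download.file`) and pass that path to `xmlParse` instead of the inline string.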

score 0 · Answer 1 · answered Dec 12 '21 at 16:46

I was able to extract the content of the table as a character vector (note: I used Windows for this example).

library(RDCOMClient)  # Windows only: controls Internet Explorer through COM
library(stringr)      # not used in the lines below

# Open the page in a visible Internet Explorer window
IEApp <- COMCreate("InternetExplorer.Application")
IEApp[['Visible']] <- TRUE
IEApp$Navigate("http://liveindex.org/")
Sys.sleep(5)  # give the page time to load

# Grab the rendered text of the whole document
doc <- IEApp$Document()
Sys.sleep(5)
inner_Text <- doc$documentElement()$innerText()

# Split into lines, then drop very long lines and lines that hold only
# a carriage return (optionally preceded by spaces)
inner_Text_Splitted <- strsplit(inner_Text, "\n")[[1]]
inner_Text_Splitted <- inner_Text_Splitted[nchar(inner_Text_Splitted) < 1000]
inner_Text_Splitted <- inner_Text_Splitted[inner_Text_Splitted != "\r"]
inner_Text_Splitted <- inner_Text_Splitted[inner_Text_Splitted != " \r"]
inner_Text_Splitted <- inner_Text_Splitted[inner_Text_Splitted != "   \r"]

# More cleaning required but the information of the table is in the variable inner_Text_Splitted

The variable inner_Text_Splitted still needs more cleaning, but the table's information is there. You could also achieve a similar result with the R package RSelenium.
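As a rough sketch of what the remaining cleaning could look like, the snippet below trims each line, drops the empty ones and reshapes the result into a data frame. It runs on a made-up sample string rather than the real page text. The helper `clean_inner_text`, the sample values and the assumption that names and values alternate line by line are all hypothetical and would need adjusting to the actual content.

```r
# Hypothetical helper: split raw innerText into trimmed, non-empty lines
clean_inner_text <- function(inner_text, max_chars = 1000) {
  lines <- strsplit(inner_text, "\n", fixed = TRUE)[[1]]
  lines <- trimws(lines)  # removes the trailing "\r" and surrounding spaces
  lines[nchar(lines) > 0 & nchar(lines) < max_chars]
}

# Made-up sample standing in for doc$documentElement()$innerText()
sample_text <- "EUR/USD\r\n1.10\r\n \r\n   \r\nGBP/USD\r\n1.40\r\n"

cleaned <- clean_inner_text(sample_text)

# ASSUMPTION: pair names and rates alternate, one per line
rate_table <- as.data.frame(matrix(cleaned, ncol = 2, byrow = TRUE),
                            stringsAsFactors = FALSE)
names(rate_table) <- c("pair", "rate")
rate_table$rate <- as.numeric(rate_table$rate)

print(rate_table)
```

Trimming with `trimws` covers all three whitespace-only cases filtered out in the answer's code in one step. The reshaping into two columns only works if the real text follows the assumed alternating layout.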
