I am trying to scrape a live exchange rates webpage. I tried:

library(XML)
webpage <- "http://liveindex.org/"

tables <- readHTMLTable(webpage)
n.rows <- unlist(lapply(tables, function(t) dim(t)[1]))

But I get an error message.

Thank you for any help.

adam.888
  • Hi, I am not an expert, but indeed it seems that it is not XML. For comparison, you could look at the [ECB](https://www.ecb.europa.eu/stats/eurofxref/eurofxref-daily.xml?93aad09b8f8b7bdb69cd1574b5b2665f) website, which is XML. If you are interested, I could share code showing how to source rates from there. On the topic of exchange rates, I recommend [this](http://stackoverflow.com/questions/26694042/how-to-get-currency-exchange-rates-in-r) question. – An economist May 09 '16 at 09:24
  • I don't know if you can do this for tick-by-tick data, but here is something you can start with: `reviews <- link %>% read_html() %>% html_nodes("#menu_content .inline_rates_container")`. I'm getting an NA if I try to extract the value. – Chirayu Chamoli May 09 '16 at 10:02
  • _"You may not use any computerised or automatic mechanism, including without limitation, any Web scraper, spider or robot, to access, extract and/or download any information, including without limitation, any currency exchange data, from the Web Site or the Tools"_ – hrbrmstr May 09 '16 at 10:59
  • Thank you very much for the info. I have changed the link. – adam.888 May 10 '16 at 09:19
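
As an illustration of the ECB suggestion in the comments, here is a minimal sketch of reading currency/rate pairs from XML laid out like the ECB daily feed, using the same XML package as the question. The inline sample and its rates are made-up placeholders, not real data, and the names `ecb_xml` and `rates` are hypothetical. For the live feed you would presumably download the file first and pass its path to `xmlParse`.

```r
library(XML)

# Made-up sample mimicking the layout of the ECB daily feed.
# The rates below are placeholders, not real exchange rates.
ecb_xml <- '<?xml version="1.0" encoding="UTF-8"?>
<gesmes:Envelope xmlns:gesmes="http://www.gesmes.org/xml/2002-08-01"
                 xmlns="http://www.ecb.int/vocabulary/2002-08-01/eurofxref">
  <Cube>
    <Cube time="2016-05-09">
      <Cube currency="USD" rate="1.10"/>
      <Cube currency="GBP" rate="0.80"/>
    </Cube>
  </Cube>
</gesmes:Envelope>'

doc <- xmlParse(ecb_xml, asText = TRUE)

# Select every element that carries a currency attribute. Matching on the
# attribute avoids having to register the document's default namespace.
nodes <- getNodeSet(doc, "//*[@currency]")

rates <- data.frame(
  currency = sapply(nodes, xmlGetAttr, "currency"),
  rate     = as.numeric(sapply(nodes, xmlGetAttr, "rate")),
  stringsAsFactors = FALSE
)
print(rates)
```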

1 Answer

I was able to extract the content of the table as a character vector (note: I used Windows for this example).

library(RDCOMClient)
library(stringr)

# Start Internet Explorer through COM (Windows only) and load the page
IEApp <- COMCreate("InternetExplorer.Application")
IEApp[['Visible']] <- TRUE
IEApp$Navigate("http://liveindex.org/")
Sys.sleep(5)  # give the page time to load
doc <- IEApp$Document()
Sys.sleep(5)

# Grab the rendered text of the whole page
inner_Text <- doc$documentElement()$innerText()

# Split into lines, then drop very long lines and whitespace-only lines
inner_Text_Splitted <- strsplit(inner_Text, "\n")[[1]]
inner_Text_Splitted <- inner_Text_Splitted[nchar(inner_Text_Splitted) < 1000]
inner_Text_Splitted <- inner_Text_Splitted[inner_Text_Splitted != "\r"]
inner_Text_Splitted <- inner_Text_Splitted[inner_Text_Splitted != " \r"]
inner_Text_Splitted <- inner_Text_Splitted[inner_Text_Splitted != "   \r"]

# More cleaning required but the information of the table is in the variable inner_Text_Splitted

The variable inner_Text_Splitted still needs more cleaning, but the table's information is there. You could also achieve a similar result with the R package RSelenium.
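
One possible way to do part of the remaining cleaning is sketched below. It is a minimal example under assumptions, not the answer's own method: `raw_lines` is a hypothetical stand-in for inner_Text_Splitted, and its values are made up. It trims the trailing carriage returns and drops lines that are empty after trimming.

```r
# Hypothetical stand-in for inner_Text_Splitted; the values are made up.
raw_lines <- c("EUR/USD\r", " \r", "1.1380\r", "\r", "   \r", "GBP/USD\r")

# trimws() removes leading and trailing spaces, tabs, and line endings,
# including the trailing "\r" left over from splitting on "\n".
clean_lines <- trimws(raw_lines)

# Keep only the lines that still contain something after trimming.
clean_lines <- clean_lines[nzchar(clean_lines)]

print(clean_lines)
```

This covers the three whitespace filters in the code above with a single rule, and it also removes any other whitespace-only lines that those fixed patterns would miss.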

Emmanuel Hamel