How to scrape text using RSelenium in R?

Question

I would like to scrape the text "VIRGINIA TECH" from the site http://stats.statbroadcast.com/statmonitr/?id=102197 using the package RSelenium.

The CSS selector for the text I would like to scrape is:

.valigntop:nth-child(1) .width6-3-4.marginr

After opening the remote driver and navigating to the site, I try:

webElem <- remDr$findElement(using = "css selector", '.valigntop:nth-child(1) .width6-3-4.marginr')
doc <- remDr$getPageSource()[[1]]
current_doc <- read_html(doc)
current_doc <- html_text(current_doc)

This returns a big block of text rather than just the text I want, "VIRGINIA TECH".

What I would like after scraping:

current_doc
[1] "VIRGINIA TECH"

Any help will be appreciated. Please let me know if any further information is needed.

the following XPath seems to work pretty well `(//div[contains(@class, "teamname")])[1]` — hrbrmstr, Feb 24 '16 at 13:54
Thank you for your comment. When I try `remDr$findElement(using = "xpath", "(//div[contains(@class, "teamname")])[1]")` it returns an `unexpected symbol` error. — Dre, Feb 24 '16 at 14:27
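
The `unexpected symbol` error most likely comes from the nested double quotes: R ends the string at the quote before `teamname`, so the rest of the line no longer parses. A minimal sketch of two ways around this is below. The variable names are only illustrative, and the commented-out call assumes `remDr` is an already opened `remoteDriver` session.

```r
# Single quotes outside, so the double quotes inside the XPath stay literal
xpath <- '(//div[contains(@class, "teamname")])[1]'

# Equivalent alternative: keep double quotes outside and escape the inner ones
xpath_escaped <- "(//div[contains(@class, \"teamname\")])[1]"

# Both spellings produce the same string
identical(xpath, xpath_escaped)

# Assumed usage with an open session (not run here):
# webElem <- remDr$findElement(using = "xpath", value = xpath)
# webElem$getElementText()[[1]]
```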

score 1 · Accepted Answer · answered Feb 24 '16 at 14:54

After reading through this link, I found that the following works well for scraping the text I wanted.

webElems <- remDr$findElements(using = 'css selector', ".valigntop:nth-child(1) .width6-3-4.marginr")
current_doc <- unlist(lapply(webElems, function(x){x$getElementText()}))

Result:

current_doc
[1] "VIRGINIA TECH"
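
If the same lookup is needed more than once, it could be wrapped in a small helper. This is only a sketch: `get_texts` is a hypothetical name, and `remDr` is assumed to be an open `remoteDriver` that has already navigated to the page. `findElements()` returns a list of web elements, and `getElementText()` returns a list holding one string, which is why the helper takes `[[1]]`.

```r
# Hypothetical helper: visible text of every element matching a CSS selector.
# remDr is assumed to be an open RSelenium remoteDriver already on the target page.
get_texts <- function(remDr, css) {
  webElems <- remDr$findElements(using = "css selector", value = css)
  # getElementText() returns a list of length one, so take its first entry;
  # vapply() yields character(0) when nothing matches
  vapply(webElems, function(x) x$getElementText()[[1]], character(1))
}

# Assumed usage with the selector from the question (not run here):
# current_doc <- get_texts(remDr, ".valigntop:nth-child(1) .width6-3-4.marginr")
```

Using `vapply()` instead of `unlist(lapply(...))` guarantees a character vector even when the selector matches nothing.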

score 1 · Answer 2 · answered Mar 02 '16 at 00:21

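Simple one, using `findElement` (singular) so that `getElementText()` is called on a single element rather than on a list:

`webElem <- unlist(remDr$findElement(using = 'css selector', ".valigntop:nth-child(1) .width6-3-4.marginr")$getElementText())`

This works too!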

Bharath
