I'm trying to scrape a site with RSelenium. Normally there are 10 elements on the page that I want to scrape, but sometimes some of them are missing, in which case a list of 5, 6 or 7 is returned instead of 10. For example:

This code returns a list of 6 on the page (four elements are missing):

webElems_title2 <- remDr$findElements(using = "xpath", value = "//div[@property = 'title']")

Whereas this code returns a list of 10 on the same page (all 10 elements are scraped):

webElems_doc_title <- remDr$findElements(using = "xpath",value = "//a[@class = 'doc-title']")
                                  

My question: how can I create an if-statement that inserts NA if a specific element is not present? My end goal is for both calls above to return a list of 10.

Inspired by the post "Inserting NA in blank values from web scraping", I've tried doing something like:

webElems_title2 <- remDr$findElements(using = "xpath", value = "//div[@property = 'title']") %>% replace(!nzchar(.),NA)

Inspired by the post "Inputting NA where there are missing values when scraping with rvest", I've tried something like this:

webElems_title2 <- remDr$findElements(using = "xpath", value = "//div[@property = 'title']") %>% {if(length(.) == 0) NA else .}

But neither attempt seems to work. I hope someone can help me.

Earl Mascetti

1 Answer

You could use the tryCatch function.

Below is a possible solution:

Your scrape code...

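# The variable webElems_title2
# tryCatch() returns a value, so assign its result; an assignment made inside
# the error handler would only exist inside that function.
webElems_title2 <- tryCatch(expr = {
  # Scrape the information for 'webElems_title2'.
  # findElement() (singular) raises an error when nothing matches, whereas
  # findElements() returns an empty list and never reaches the handler.
  remDr$findElement(using = "xpath", value = "//div[@property = 'title']")$getElementAttribute('value')
},
# If the element does not exist, NA is stored in webElems_title2 instead
error = function(e) {
  NA
})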

Your scrape code...
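
Note that findElements only returns the elements that match, so a page-wide query cannot tell you which of the 10 positions are empty. One way to get a result of length 10 is to first select the 10 parent blocks and then look for the title inside each block. Below is a minimal sketch of that idea, not a tested solution for your site: the helper name child_text_or_na and the container XPath in the usage comment are my own assumptions, since I don't know the page structure.

```r
# Sketch: return one value per container, with NA where the child is missing.
# `containers` is assumed to be a list of RSelenium webElements,
# one per result block on the page.
child_text_or_na <- function(containers, xpath) {
  vapply(containers, function(container) {
    # findChildElements() returns an empty list when nothing matches
    hits <- container$findChildElements(using = "xpath", value = xpath)
    if (length(hits) == 0) return(NA_character_)
    # getElementText() returns a list holding a single string
    hits[[1]]$getElementText()[[1]]
  }, character(1))
}

# Hypothetical usage (the container XPath is an assumption, adapt it to the page):
# containers <- remDr$findElements(using = "xpath", value = "//div[@class = 'result']")
# titles <- child_text_or_na(containers, ".//div[@property = 'title']")
```

The leading dot in ".//div[...]" keeps the search relative to each container. With 10 containers, titles should have length 10, with NA for the blocks that have no title.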
Earl Mascetti