RSelenium - how to obtain a node's child node number and their xpaths

Question

I am using RSelenium for web scraping. I have the xpath of a certain node in a dynamically generated web page. Its child nodes are all of the same kind, but I have no a priori knowledge of how many there are. (For instance, when you search for a rare item on a shopping website, you may run into this kind of situation.)

In general, how can I obtain the following information?

1) The number of child nodes a node has.
2) The xpath of each of those child nodes.

My goal is to apply an action to each child node (e.g. fill, check or click, depending on what kind of node it is).

I can see some xpaths using XPath Helper in Chrome, but beyond that I am completely stuck.

Preferably exemplified using RSelenium. httr + rvest is also acceptable.

score 1 · Accepted Answer · answered Aug 05 '16 at 15:11

An rvest solution would be the following:

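library(rvest)
# XPath of the parent node (placeholder, fill in your own)
your_xpath <- "YOUR XPATH"
# Parse the page currently loaded in the RSelenium session `remDr`
doc <- read_html(remDr$getPageSource()[[1]])
# Element children of the parent; length(children) is their number
children <- doc %>% html_node(xpath = your_xpath) %>% html_children()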
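To make the two things asked for explicit (the number of children and their xpaths), here is a minimal sketch that runs without a browser. The inline HTML, the `parent_xpath` value and the variable names are assumptions for illustration; in a real session the HTML would come from `remDr$getPageSource()[[1]]`.

```r
library(rvest)

# Made-up HTML standing in for the page source returned by RSelenium
html <- '<html><body><ul id="results">
  <li><a href="#a">A</a></li>
  <li><input type="checkbox"/></li>
  <li><input type="text"/></li>
</ul></body></html>'

# Hypothetical xpath of the parent node
parent_xpath <- "//ul[@id='results']"

doc <- read_html(html)
children <- doc %>% html_node(xpath = parent_xpath) %>% html_children()

# 1) Number of element children (text nodes are not counted)
n_children <- length(children)

# 2) Positional xpath of each child, relative to the parent xpath
child_xpaths <- sprintf("%s/*[%d]", parent_xpath, seq_along(children))

print(n_children)
print(child_xpaths)
```

The `/*[i]` step selects the i-th element child regardless of its tag, so the same pattern should work whether the children are `li`, `div` or anything else.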

Then you can iterate over the children and do to them whatever you like:

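for (i in seq_along(children)) {
  # The i-th child element has the xpath "<parent xpath>/*[i]"
  webElem <- remDr$findElement(using = 'xpath', sprintf("%s/*[%d]", your_xpath, i))
  # classify_node() is a placeholder for a helper you define yourself
  if (classify_node(children[[i]]) == "click") {
    webElem$clickElement()
  } else {...}
}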
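Note that `classify_node` is not shipped with rvest or RSelenium. One possible way to write such a helper is sketched below; the function body, the action labels ("click", "check", "fill", "other") and the sample HTML are all assumptions, so adapt the rules to the kinds of nodes your page actually contains.

```r
library(rvest)

# Hypothetical helper: map a parsed node to the action it should receive
classify_node <- function(node) {
  tag  <- html_name(node)          # tag name, e.g. "a" or "input"
  type <- html_attr(node, "type")  # NA if the attribute is missing
  if (tag %in% c("a", "button")) return("click")
  if (tag == "input" && type %in% c("checkbox", "radio")) return("check")
  if (tag %in% c("input", "textarea")) return("fill")
  "other"
}

# Made-up HTML with one child of each kind
html <- '<div id="form"><a href="#">Next</a><input type="checkbox"/><input type="text"/><span>note</span></div>'

children <- read_html(html) %>%
  html_node(xpath = "//div[@id='form']") %>%
  html_children()

# Classify every child; children[[i]] extracts a single node
actions <- vapply(seq_along(children),
                  function(i) classify_node(children[[i]]),
                  character(1))
print(actions)
```

Inside the loop you could then branch on the returned label, for example calling `webElem$clickElement()` for "click" and `webElem$sendKeysToElement()` for "fill".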

Where does classify_node come from? I can't find it anywhere, and Google brings me back here — MLEN, Nov 19 '18 at 21:16
