
I am trying to scrape full reviews from this webpage (the full reviews appear after clicking the 'Read More' button). I am doing this with RSelenium. I am able to select and extract text from the first <p> element using the code

reviewNodes <- mybrowser$findElements(using = 'xpath', "//p[@id][1]")

which returns the truncated ("less text") review.

But I am not able to extract the full-text reviews using the code

reviewNodes <- mybrowser$findElements(using = 'xpath', "//p[@id][2]")

or

reviewNodes <- mybrowser$findElements(using = 'xpath', "//p[@itemprop = 'reviewBody']")

Both return blank list elements. I don't know what is wrong. Please help me.

  • What does the first query return? Is it a single node or a collection? I'd expect, based on the page structure, it would retrieve a collection of all `p` elements whose `id` attribute starts with `"lessReviewContent"`, as those are the first `p` children of their parents. Am I right? – CiaPan Apr 01 '16 at 11:17
  • yes... you are right... it retrieves the collection. – Rishabh Soni Apr 02 '16 at 04:26
  • even when I type the XPath query "//p[@id][2]" in the "xpath helper" Chrome extension, it retrieves the intended text. But the same XPath is not working in the code. Can't think of the reason... – Rishabh Soni Apr 02 '16 at 04:29

2 Answers


Drop the double slash and try to use the explicit descendant axis:

/descendant::p[@id][2]

(see the note from the W3C document on XPath that I mentioned in this answer)
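The reason this matters: in `//p[@id][2]`, the `[2]` predicate binds to the `p` step, so it selects every `p` with an `id` that is the *second such `p` among its siblings*; `/descendant::p[@id][2]` instead selects the second matching `p` in document order across the whole page. A minimal sketch of the difference, using Python's `lxml` on hypothetical markup (not the actual review page):

```python
from lxml import html

# Two separate <div>s, each containing exactly one <p id=...>,
# mimicking a page where each review lives in its own container.
doc = html.fromstring("""
<html><body>
  <div><p id="a">first</p></div>
  <div><p id="b">second</p></div>
</body></html>""")

# //p[@id][2]: the [2] applies per parent element, and each <p>
# here is the FIRST p-with-id among its siblings -> nothing matches.
print(doc.xpath("//p[@id][2]"))  # []

# /descendant::p[@id][2]: the [2] applies to the whole node set
# in document order -> the second matching <p> overall.
print([p.get("id") for p in doc.xpath("/descendant::p[@id][2]")])  # ['b']
```

This also explains the comment above: a browser extension may evaluate the expression differently or highlight something that looks right, while a strict XPath 1.0 engine returns an empty set.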

CiaPan

As you're dealing with a list, you should first find the list items, e.g. using the CSS selector

div.srm

Starting from those elements, you can then search inside each list item, e.g. using the CSS selector

p[itemprop='reviewBody']

Of course you can also do it in a single expression, but that is not quite as neat imho:

div.srm p[itemprop='reviewBody']

Or in XPath (which I wouldn't recommend):

//div[@class='srm']//p[@itemprop='reviewBody']

If neither of these works for you, then the problem must be somewhere else.

Kim Homann