2

I am trying to use Scrapy to scrape some info from amazon.co.uk using an absolute XPath, as below, but strangely no value is returned. I am quite new to Scrapy:

scrapy shell http://www.amazon.co.uk/product-reviews/B0042EU3A2/

response.xpath('//*[@id="productReviews"]/tbody/tr/td[1]/a[1]/@name').extract()

I would like it to return the name attribute, which in this case is RI4HGFJCSI04W.

John Rotenstein
W.S.

1 Answer

3

Just omit the tbody from the expression:

In [1]: response.xpath('//*[@id="productReviews"]//tr/td[1]/a[1]/@name').extract()
Out[1]: [u'RI4HGFJCSI04W']
alecxe
  • Thanks! But why is `tbody` causing trouble? – W.S. Apr 28 '15 at 23:35
  • 2
    @W.S. The problem is that the `tbody` element doesn't exist in the HTML source for that page. You probably saw it in the browser DOM, but that's because the browser creates a `tbody` element automatically for all tables. – Elias Dorneles Apr 29 '15 at 00:25
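
To see the effect described in the comment above without hitting the live page, here is a minimal sketch. It rests on assumptions: the markup is a made-up, simplified stand-in for the review table (not Amazon's real source), and it uses the standard library's `xml.etree.ElementTree` instead of Scrapy so that it runs offline. ElementTree supports only a subset of XPath, so the `name` attribute is read with `.get()` rather than `/@name`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the raw page source.
# Note that the table has no <tbody>, as in the HTML the server sends.
RAW_HTML = """
<html><body>
<table id="productReviews">
  <tr>
    <td><a name="RI4HGFJCSI04W"></a><a name="other"></a></td>
    <td>review text</td>
  </tr>
</table>
</body></html>
"""

root = ET.fromstring(RAW_HTML)


def names(path):
    """Return the name attribute of every element matched by path."""
    # ElementTree's XPath subset cannot select attributes directly,
    # so the attribute is read from each matched <a> element.
    return [a.get("name") for a in root.findall(path)]


# Path copied from the browser's DOM view: requires a <tbody> that is not there.
with_tbody = names(".//*[@id='productReviews']/tbody/tr/td[1]/a[1]")

# Same path with the tbody step replaced by a descendant search.
without_tbody = names(".//*[@id='productReviews']//tr/td[1]/a[1]")

print(with_tbody)     # []
print(without_tbody)  # ['RI4HGFJCSI04W']
```

The `//tr` form matches whether or not a `tbody` is present, so the same expression should work against both the raw source and a browser-normalised DOM.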