
Executed in the Scrapy shell:

url = "https://www.daraz.com.np/smartphones/?spm=a2a0e.11779170.cate_1.1.287d2d2b2cP9ar"
fetch(url)
r = scrapy.Request(url = url)
fetch(r)
response.xpath("//div[@class='ant-col-20 ant-col-push-4 c1z9Ut']/div[@class='c1_t2i']/div[@class='c2prKC']/div/div/div/div[@class='c16H9d']/a/text()").getall()

##NOTE##

There is no tbody tag in the XPath. Why does it return an empty list in Scrapy, even though the same XPath matches 40 text nodes in Chrome?

renatodvc

1 Answer


It's because the website relies heavily on JavaScript, which means the content is loaded dynamically. The page makes HTTP requests as it loads, and the product data is not hard-coded into the HTML. The Scrapy shell does not execute JavaScript, so the HTML it downloads does not contain the elements you see in Chrome, and your XPath matches nothing.
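To make the difference concrete, here is a minimal sketch using only the standard library. The two HTML strings are made-up stand-ins (not the real page): one for what the server sends, one for the DOM after the browser has run the scripts. The class name `c16H9d` is taken from the XPath in the question; `count_class` is a hypothetical helper.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags that carry a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        classes = (dict(attrs).get("class") or "").split()
        if self.css_class in classes:
            self.count += 1


def count_class(html, css_class):
    """Return how many elements in `html` have `css_class`."""
    parser = ClassCounter(css_class)
    parser.feed(html)
    return parser.count


# Assumption: an empty shell that JavaScript fills in later (what Scrapy sees)
RAW_HTML = (
    '<html><body><div id="root"></div>'
    '<script src="app.js"></script></body></html>'
)

# Assumption: the DOM after the scripts have run (what Chrome shows)
RENDERED_HTML = (
    '<html><body><div id="root">'
    '<div class="c16H9d"><a href="/p1">Phone A</a></div>'
    '<div class="c16H9d"><a href="/p2">Phone B</a></div>'
    '</div></body></html>'
)

print(count_class(RAW_HTML, "c16H9d"))       # 0
print(count_class(RENDERED_HTML, "c16H9d"))  # 2
```

In the Scrapy shell you can run the same kind of check on the real page with `"c16H9d" in response.text`, or call `view(response)` to open the HTML that Scrapy actually downloaded in your browser.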

A couple of suggestions:

  1. Try to reverse-engineer the HTTP requests. The JavaScript triggers HTTP requests, so if you can mimic those requests you can get the data you want. You will need to use Chrome DevTools or similar to see how the requests are made. This is the cleanest and most concise way to get the data. All the other options slow the spider down and are more brittle.

  2. Scrapy-splash - This pre-renders the DOM of the page and gives you access to the HTML you want.

  3. Scrapy-selenium - A downloader middleware that handles requests with selenium. It doesn't offer the full functionality of the selenium package, but it can render the DOM so you can get the data you require.

  4. Embed selenium into the Scrapy spider. It's the worst choice and should really only be used as a last resort.
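As a rough illustration of suggestion 1, the sketch below rebuilds a request and parses a JSON response using only the standard library. Everything site-specific here is an assumption: the endpoint `https://example.com/api/products`, the `page` parameter, and the `items`/`title` keys are placeholders for whatever you actually find in the Network tab of the dev tools. Not every site exposes such an endpoint.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical endpoint: replace with the URL seen in the browser's Network tab
API_URL = "https://example.com/api/products"


def build_request(page):
    """Recreate the request the page's JavaScript would send (assumed shape)."""
    query = urlencode({"page": page})
    headers = {"Accept": "application/json", "User-Agent": "Mozilla/5.0"}
    return Request(f"{API_URL}?{query}", headers=headers)


def parse_titles(body):
    """Pull product titles out of a JSON body (assumed 'items'/'title' keys)."""
    data = json.loads(body)
    return [item["title"] for item in data.get("items", [])]


def fetch_titles(page):
    """Send the request and return the parsed titles (performs network I/O)."""
    with urlopen(build_request(page), timeout=10) as resp:
        return parse_titles(resp.read().decode("utf-8"))


# Offline usage example with a made-up payload
sample = '{"items": [{"title": "Phone A"}, {"title": "Phone B"}]}'
print(parse_titles(sample))  # ['Phone A', 'Phone B']
```

Inside a spider the same idea applies: yield a `scrapy.Request` to the endpoint and parse the body with `json.loads(response.text)` in the callback instead of using XPath.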

Please see the Scrapy docs on selecting dynamically-loaded content for a bit more detail.

AaronS
  • Thank you, sir. But I'm just a newbie, and I have tried to go into the developer tools in Chrome to find the data source, but I'm not able to do it. Can you help with that, sir? I'm trying to follow suggestion number 1. – Laxman Maharjan Jul 25 '20 at 02:53
  • So, having had a look at the website, the first option isn't possible. There isn't any structured data that has the information you desire. I suggest you start looking at the other options. This isn't a coding service, so I'd like you to try to create something yourself, either with the options listed above or with anything in the Scrapy documentation that you think I may have missed. – AaronS Jul 25 '20 at 05:57