Questions tagged [scrapy-shell]

The Scrapy shell is an interactive shell where you can try and debug your scraping code very quickly, without having to run the spider.

It’s meant to be used for testing data extraction code, but you can actually use it for testing any kind of code as it is also a regular Python shell.

177 questions
1
vote
0 answers

scrapy splash css selector not getting data

I have this web page: https://analisi.transparenciacatalunya.cat/es/Energia/Certificats-d-efici-ncia-energ-tica-d-edificis/j6ii-t3w2 and am trying to get the total number of rows from the table at the bottom of the page (1.341.412). The selector is…
1
vote
1 answer

Scrapy doesn't print anything

Can somebody tell me what the error in my code is? I run "scrapy crawl provincia -o table_data_results.csv" in cmd, but the file is empty when I open it in Excel. I think it isn't scraping anything. from scrapy import Spider from scrapy.http import…
1
vote
1 answer

Scrapy Returns Inconsistent Results

I'm trying to scrape an Amazon product page, but Scrapy is giving me inconsistent results (sometimes it returns what I want and sometimes it returns None). I have no idea why the same code gives different results. I created a loop that yields the…
Avn
1
vote
2 answers

CSS selector of link to the next page returns empty list in Scrapy shell

I'm new to Scrapy. I'm trying to get the link to the next page from this site: https://book24.ru/knigi-bestsellery/?section_id=1592 Here is what the HTML looks like (image). In the Scrapy shell I wrote this…
1
vote
2 answers

Scrapy crawl loop for next page

Hello, I am trying to get into web scrapers and crawlers; however, I don't understand why my code is not going to the next page and looping. import scrapy from scrapy import * class…
ZIXR
1
vote
1 answer

Scrapy shell outputs an empty list even though the XPath is correct in Chrome. Why?

Executed on Scrapy shell url = "https://www.daraz.com.np/smartphones/?spm=a2a0e.11779170.cate_1.1.287d2d2b2cP9ar" fetch(url) r = scrapy.Request(url = url) fetch(r) response.xpath("//div[@class='ant-col-20 ant-col-push-4…
1
vote
2 answers

Why is Scrapy FormRequest redirecting in the Scrapy shell while it works perfectly in the browser's developer console?

I am trying to access the historical data of this page starting from 01/01/2018 in the Scrapy shell. After analysis, I figured out that the request's form data looks like this: In [124]: form Out[124]: {'action': 'historical_data', 'curr_id': '44765', …
1
vote
1 answer

Scrapy shell cannot find response object

I am new to Scrapy and trying to follow this tutorial (https://www.pythongasm.com/introduction-to-scrapy/) in order to learn about it. I scraped this page (https://newyork.craigslist.org/d/real-estate/search/rea) using the fetch command, but when I…
guiparo
1
vote
3 answers

Scrapy extracting <li> with span inside

I'm trying to extract the text from this html structure: List of Birds, Crow Black
Goundo
1
vote
0 answers

How to prevent Scrapy Shell from redirecting

I am trying to scrape data from a search query on Lulu and Georgia's website. This is the link to the search query when I searched for "desks":…
skrockz
1
vote
2 answers

How to completely exit Scrapy shell?

I start the shell using the inspect_response() function. To exit the Scrapy shell, I press Ctrl-D (or Ctrl-Z on Windows). However, this does not exit completely, because the spider keeps crawling consecutive URLs and a new Scrapy shell is opened for each one. Do…
Dominik Lenda
1
vote
1 answer

scrapy response : twisted.internet.error.TCPTimedOutError: TCP connection timed out: 10060

I've been scraping data from a website for 3 months, but today I can no longer access it, not even with my web browser. The site is still accessible via mobile phone. I get this message when I test a link in the Scrapy shell:…
1
vote
1 answer

Parsing through response created with XPath

Using Scrapy, I want to extract some data from a well-formed HTML site. With XPath I am able to extract a list of items, but I am not able to extract data from the elements in the list using XPath. All XPaths have been tested using XPather. I have…
user7322345
1
vote
1 answer

Scrapy FormRequest can't handle complex dicts as formdata

I am trying to provide formdata to a scrapy.FormRequest object. The formdata is a dict of the following structure: { "param1": [ { "paramA": "valueA", "paramB": "valueB" } ] } via equivalent to the following code, run in…
1
vote
1 answer

Problem with __VIEWSTATE, __EVENTVALIDATION, __EVENTTARGET and scrapy & splash

How do I handle __VIEWSTATE, __EVENTVALIDATION, __EVENTTARGET with scrapy/splash? I tried with return FormRequest.from_response(response, [...] '__VIEWSTATE': response.css( 'input#__VIEWSTATE::attr(value)').extract_first(), But this…
Jurko