Questions tagged [scrapy-shell]

The Scrapy shell is an interactive shell where you can try and debug your scraping code very quickly, without having to run the spider.

It’s meant to be used for testing data extraction code, but you can actually use it for testing any kind of code as it is also a regular Python shell.

177 questions
0
votes
0 answers

Response.css() is giving no results for pagination in scrapy crawler after login

I want to read the 'title' of each project in a paginated list of almost 335 records. What I am trying to do is: 1) First, I get the response with this command in Windows cmd: scrapy shell https://www.slingshotinsights.com/projects 2) It…
0
votes
0 answers

Selenium with error Traceback (most recent call last): File "", line 1, in fb_login()

I have the following code that helps me auto-fill my data and log in: import webbrowser from selenium import webdriver import time def fb_login(): br=webdriver.Chrome('C:/Python34/Scripts/chromedriver.exe') …
user6315578
0
votes
0 answers

scrapy with DNSCACHE_ENABLED=False not working

When I run scrapy shell with DNSCACHE_ENABLED=False, I get KeyError: 'dictionary is empty' twisted.internet.error.DNSLookupError: DNS lookup failed: no results for hostname lookup: www.mydomain.com. 2017-07-03 03:09:12 [twisted] CRITICAL: while looking…
softwarevamp
  • 827
  • 10
  • 14
0
votes
2 answers

unable to fetch the list values from the website

I fetch all the details from the desired website but am unable to get some specific information. Please guide me. Targeted domain: https://shop.adidas.ae/en/messi-16-3-indoor-boots/BA9855.html My code…
Zia
  • 394
  • 1
  • 3
  • 13
0
votes
1 answer

Change value of an HTML element with scrapy

I am trying to scrape data from this website: Website link. I want to download all the PDF files from specific dates. While I've managed to get the files from the first page and download them correctly, I cannot change the date so I can go back in…
Stavros G
  • 45
  • 6
0
votes
1 answer

Python Scrapy 302 (I want to go back to the original page)

I am trying to scrape the page https://movie.douban.com/subject/1292052/, but the URL redirects to http://m.douban.com/movie/subject/1292052. How do I get back to the first page and use its parsing method (XPath) to continue? Thanks!
ileadall42
  • 631
  • 2
  • 7
  • 19
0
votes
1 answer

How to scrape next page's items

Hello, I am new to programming and Scrapy. To learn Scrapy I am trying to scrape some items, but I am unable to scrape the next page's items. Please help me parse the next-link URL for this website. Here is my code: import scrapy from scrapy.linkextractors…
Samsul Islam
  • 2,581
  • 2
  • 17
  • 23
0
votes
1 answer

Post request with scrapy not redirecting properly?

I'm trying to extract some data from http://www.bcpa.com using scrapy. I have some addresses and I want to extract from the website the info associated to each one of the addresses, so I need to "search by address" through this urls…
abeagomez
  • 562
  • 1
  • 4
  • 16
0
votes
0 answers

Scrapy Shell Splash doesn't render correctly

I am trying to render a JavaScript page with Splash in the Scrapy shell. I want to render Google's search result with: scrapy shell 'http://localhost:8050/render.html?url=https://www.google.com.tr/#q=christian+omlin+email&timeout=10&wait=0.5' but shell return…
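
One thing that may be worth checking with a command like the one above: the target URL is passed to render.html unencoded, so everything after `#` is treated as the fragment of the outer URL and is never sent to Splash, which would then only see url=https://www.google.com.tr/. A minimal sketch of percent-encoding the target before handing it to the shell, using only the standard library (the helper name `splash_url` is hypothetical, and this is one possible cause, not a confirmed diagnosis):

```python
from urllib.parse import urlencode

# Splash endpoint taken from the question; adjust host and port to your setup.
SPLASH_RENDER = "http://localhost:8050/render.html"


def splash_url(target, **params):
    """Build a render.html URL with `target` percent-encoded as the `url` argument.

    Hypothetical helper for illustration; it is not part of Scrapy or Splash.
    """
    query = {"url": target, **params}
    # urlencode escapes '#', '&', '=', '+' and '/' inside the target URL,
    # so they can no longer be mistaken for parts of the outer URL.
    return f"{SPLASH_RENDER}?{urlencode(query)}"


target = "https://www.google.com.tr/#q=christian+omlin+email"
rendered = splash_url(target, timeout=10, wait=0.5)
print(rendered)
```

The printed URL can then be passed, quoted, to scrapy shell. Whether Google's results actually render this way also depends on how the page handles the `#q=` fragment in JavaScript, which this sketch does not address.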
0
votes
2 answers

Retrieve HTTP return code from ImagesPipeline (or MediaPipeline) in scrapy

I have a working spider scraping image URLs and placing them in image_urls field of a scrapy.Item. I have a custom pipeline that inherits from ImagesPipeline. When a specific URL returns a non-200 http response code (like say a 401 error). For…
hAcKnRoCk
  • 1,118
  • 3
  • 16
  • 30
0
votes
1 answer

Below POST Method is not working in scrapy

I have tried headers, cookies, formdata, and body too, but I got 401 and 500 status codes. On this site the first page uses the GET method and gives an HTML response, while further pages use the POST method and give JSON responses. But these status codes arrive…
Vimal Annamalai
  • 139
  • 1
  • 2
  • 12
0
votes
1 answer

how to scrape product names from website using scrapy shell

Please help me scrape product names from this link: http://www.gap.com/browse/category.do?cid=5168&scrollTo=product353401012&scrollTo=product353401012#pageId=0&department=75 The product names are contained in class="product-card--name" which is in a…
Light
  • 143
  • 1
  • 6
0
votes
1 answer

Scrapy: scrape items from HTML and not from URL

I came across Scrapy with a requirement for both crawling and scraping. But according to the application requirements I decided not to go with a monolithic approach. Everything should be service based. So I decided to design two services. Get all urls and…
SangamAngre
  • 809
  • 8
  • 25
0
votes
1 answer

Scrapy installation in Ubuntu: pkg_resources.DistributionNotFound: attrs

I installed Scrapy by following the tutorial here (pip install Scrapy). The installation was successful, but once I try to set up a project it shows: nikhil@nikhil:~$ scrapy startproject tutorial Traceback (most recent call last): File…
3lokh
  • 891
  • 4
  • 17
  • 39
0
votes
2 answers

Can't Get Image src link with XPath

I am using Scrapy to crawl the product image src links of this site: http://eshop.tesco.com.my/en-GB/Promotion/List?SortBy=Default For some reason, the XPath doesn't grab the product image src links. I tried to crawl all the image src links from the…
Tatt Ehian
  • 79
  • 1
  • 7