Questions tagged [scrapyd]

`Scrapyd` is a daemon for managing `Scrapy` projects. It used to be part of `scrapy` itself, but was separated out and is now a standalone project. It runs on a machine and lets you deploy (i.e., upload) your projects and control the spiders they contain through a JSON web service.

Scrapyd can manage multiple projects, and each project can have multiple versions uploaded, but only the latest version is used for launching new spiders.
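As a rough illustration of the JSON web service described above, the sketch below builds requests for two Scrapyd endpoints, `schedule.json` and `listjobs.json`, using only the Python standard library. The base URL `http://localhost:6800`, the helper names, and the project and spider names are assumptions chosen for illustration; adjust them to match your own deployment.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: Scrapyd is listening on its default port on this machine.
SCRAPYD_URL = "http://localhost:6800"


def build_schedule_request(project, spider, base_url=SCRAPYD_URL):
    """Build a POST request asking Scrapyd to run one spider of a project."""
    data = urlencode({"project": project, "spider": spider}).encode("utf-8")
    return Request(f"{base_url}/schedule.json", data=data, method="POST")


def build_listjobs_request(project, base_url=SCRAPYD_URL):
    """Build a GET request listing a project's pending, running and finished jobs."""
    query = urlencode({"project": project})
    return Request(f"{base_url}/listjobs.json?{query}", method="GET")


def send(request, timeout=10):
    """Send a request to Scrapyd and decode its JSON reply into a dict."""
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a Scrapyd instance running and a project already deployed, `send(build_schedule_request("myproject", "myspider"))` would schedule a run, and `send(build_listjobs_request("myproject"))` would report its jobs. Both names are hypothetical placeholders.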

355 questions
0
votes
2 answers

Python, scrapyd installation permission denied, Ubuntu 14.04

I'm a little bit new to Python, Scrapy, and scrapyd. I want to install Scrapy and scrapyd. I installed pip using: sudo apt-get install pip then tried to install scrapyd: pip install scrapyd I'm always getting a permission denied error and here is the…
Mohamed Kamal
  • 357
  • 1
  • 2
  • 10
0
votes
1 answer

Unable to deploy the Portia spider on CentOS 7 using scrapyd deploy

I have installed Portia and scrapyd. I created a new project using the Portia web UI, all OK. I am able to see the project folder in slyd/data/project/new_project. Then I copied the new_folder to a different path for deployment. Updated the…
Magendran V
  • 1,411
  • 3
  • 19
  • 33
0
votes
1 answer

Boot up scrapyd failed with default configuration: sqlite3.OperationalError: unable to open database file

I just installed scrapyd on Ubuntu (with apt-get tool). However, without doing any change to the configuration, when I launched "scrapyd" I got the following error: (! 397)-> scrapyd Unhandled Error Traceback (most recent call last): File…
Alex Napitupulu
  • 999
  • 1
  • 7
  • 10
0
votes
0 answers

Not able to log in to this website https://www.bestpricewholesale.co.in/Registration/login.aspx in Python Scrapy project

I am unable to log in to only this website in my Python Scrapy project. I want to scrape a website that requires login. I have already logged in to many websites in my project, but I cannot log in to this one. I think I have a problem in…
Ranvijay Sachan
  • 2,407
  • 3
  • 30
  • 49
0
votes
0 answers

How can I use scrapy shell with a username and password on a URL (on a website that requires login)?

I want to scrape a website that requires login and check whether my XPath is right or wrong using scrapy shell in the Python Scrapy framework, like C:\Users\Ranvijay.Sachan>scrapy shell https://www.google.co.in/?gfe_rd=cr&ei=mIl8V 6LovC8gegtYHYDg&gws_rd=ssl …
Ranvijay Sachan
  • 2,407
  • 3
  • 30
  • 49
0
votes
2 answers

Where is data stored after deploying a spider with scrapyd (Python)?

I deployed and scheduled my spider on http://localhost:6800/ successfully, but where is the item data stored? How can I get it? Thanks so much!
tuancoi
  • 35
  • 6
0
votes
1 answer

Scrapy on AWS EC2 : where to write the items?

I have a working spider on my local machine, which writes items to a local postgres database. I am now trying to run the same spider through scrapyd on an EC2 instance. This obviously won't work, because the code (models, pipelines, settings files)…
S Leon
  • 331
  • 1
  • 4
  • 18
0
votes
1 answer

What are the new enhancements in scrapy 0.24.0?

What features were added to or removed from Scrapy 0.24.0? How does it differ from the earlier version?
Niranjan Sagar
  • 819
  • 1
  • 15
  • 17
0
votes
1 answer

Scrapy + Django in production

I'm writing a Django web app that makes use of Scrapy and locally all works great, but I wonder how to set up a production environment where my spiders are launched periodically and automatically (I mean that once a spider completes its job it gets…
daveoncode
  • 18,900
  • 15
  • 104
  • 159
0
votes
1 answer

Scrapyd Deploy was not successful

My scrapy.cfg file is [deploy:scra] url = http://localhost:6800/ project = project2 [deploy:scrapyd2] url = http://scrapyd.mydomain.com/api/scrapyd/ project = project1 If I run the command below, it throws the error given below. $ scrapy…
user4112053
0
votes
1 answer

Dynamic Scrapy settings

I have a single Scrapy project with multiple spiders. This project is hosted on a scrapyd instance. I would like to be able to dynamically change settings in the project's settings.py file (such as DOWNLOADER_MIDDLEWARES). Is it possible to change…
trajan
  • 1,093
  • 2
  • 12
  • 15
0
votes
1 answer

ImportError: Error loading object 'scrap.middlewares.RandomUserAgentMiddleware': No module named scrap.middlewares

I have a Portia Scrapy project at ~/portia/slyd/data/projects/scrap set up to use scrap.middlewares.RandomUserAgentMiddleware in DOWNLOADER_MIDDLEWARES. RandomUserAgentMiddleware is defined in ~/portia/slyd/data/projects/scrap/middlewares.py. After…
localhost
  • 55
  • 1
  • 6
0
votes
2 answers

Scrapy deploy no longer working

I seem to have run up against an issue with a Scrapy spider deployment that has caused some listening errors, though I haven't been able to use any of the previous answers successfully, either because it's a different issue or the fixes weren't…
Chris
  • 249
  • 5
  • 18
0
votes
1 answer

scrapyd poll_interval to schedule a spider

I want to make my spider start every three hours. I have a scrapy configuration file located in the c:/scrapyd folder. I changed poll_interval to 100. The spider works, but it doesn't repeat every 100 seconds. How can I do that, please?
Marco Dinatsoli
  • 10,322
  • 37
  • 139
  • 253
0
votes
1 answer

Deploying Scrapy Spiders using a Twisted Server

I have 20+ Scrapy crawlers that I want to deploy manually from a browser webpage. To achieve this, I have created a simple Twisted server that executes the following commands in a shell process: scrapyd-deploy default -p $project curl…
Hakim
  • 3,225
  • 5
  • 37
  • 75