Questions tagged [scrapyd]

`Scrapyd` is a daemon for managing `Scrapy` projects. It was once part of `scrapy` itself but is now maintained as a standalone project. It runs as a service on a machine and lets you deploy (i.e., upload) your projects and control the spiders they contain through a JSON web service.

Scrapyd can manage multiple projects and each project can have multiple versions uploaded, but only the latest one will be used for launching new spiders.
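
For example, scheduling a crawl is a single HTTP call (the host and names below are the scrapyd defaults that recur across the questions on this page):

    curl http://localhost:6800/schedule.json -d project=myproject -d spider=somespider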

355 questions
0 votes · 1 answer

Sharing visited urls between multiple spiders in scrapy?

I am using scrapyd to run multiple spiders as jobs across the same domain. I assumed scrapy had a hashtable of visited urls that it shared and co-ordinated with other spiders when it crawled. When I create instances of the same spider by curl…
Sai (113)

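Background for this one: scrapyd launches every scheduled job as a separate process, so the default request dupefilter is per-job and no visited-URL table is shared between spiders. Below is a sketch of one workaround, assuming a reachable Redis server and the redis-py package; the class, module and key names are made up for illustration:

    # Hypothetical shared dupefilter: request fingerprints go into one
    # Redis set, so all jobs scheduled through scrapyd consult the same
    # "seen" collection instead of each keeping its own.
    import redis
    from scrapy.dupefilters import RFPDupeFilter      # scrapy.dupefilter in 0.x
    from scrapy.utils.request import request_fingerprint

    class RedisDupeFilter(RFPDupeFilter):
        def __init__(self, path=None, debug=False):
            super(RedisDupeFilter, self).__init__(path, debug)
            self.server = redis.StrictRedis()         # assumes Redis on localhost

        def request_seen(self, request):
            fp = request_fingerprint(request)
            # SADD returns 0 when the fingerprint was already present,
            # i.e. some spider has seen this request before.
            return self.server.sadd('seen_requests', fp) == 0

Enable it with DUPEFILTER_CLASS = 'myproject.dupefilters.RedisDupeFilter' in the project settings (module path hypothetical).
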
0 votes · 1 answer

Error when deploying scrapy project on the scrapy cloud

I am using scrapy 0.20 with Python 2.7 and want to deploy my scrapy project to scrapy cloud. I developed the project with a simple spider, navigated to the project folder, and typed scrapy deploy scrapyd -d koooraspider on cmd, where koooraspider…
William Kinaan (28,059)

0 votes · 1 answer

Running more than one spider one by one

I am using the Scrapy framework to make spiders crawl through some webpages. Basically, what I want is to scrape web pages and save them to a database. I have one spider per webpage, but I am having trouble running those spiders at once such that a spider…
Nabin (11,216)

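scrapyd's own job queue is one answer to exactly this: every call to schedule.json enqueues a job, and scrapyd starts queued jobs only as process slots free up, so capping the slot count makes spiders run strictly one after another. A minimal scrapyd.conf sketch, using scrapyd's standard option names:

    [scrapyd]
    max_proc = 1        # at most one crawl process at a time
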
0 votes · 2 answers

Issues with the installation of scrapyd on Windows

I am having issues with the installation of scrapyd on Windows 7. I have installed the package using easy_install, but the scrapyd command still comes up with nothing. Here is the output of my install: C:\Python27\Lib\site-packages\scrapy>easy_install…
eboni (883)

0 votes · 1 answer

Scrapyd: pass parameters when deploying

This is a simple example of a scrapy.cfg file: [settings] default = crawly.settings [deploy:s1] url = http://localhost:6800 project = my_project I want to know if I can pass any parameters to my scrapyd instance using this file. What I want to do is…
AliBZ (4,039)

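Laid out as it appears in the file, the scrapy.cfg quoted in that question is:

    [settings]
    default = crawly.settings

    [deploy:s1]
    url = http://localhost:6800
    project = my_project
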
0 votes · 1 answer

Log for scrapyd installed with pip

I installed scrapyd with pip, and I don't have a '/var/log/scrapyd' dir. I'm trying to find out what's happening to my http call, since I get an 'OK' status when I initiate it, but no log is generated in 'logs/project/spider/' (and according to…
Jean Ventura (27)

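Worth knowing here: a pip install ships no init script or packaged /etc config, so the /var/log/scrapyd layout created by the distro package never appears; scrapyd instead reads a scrapyd.conf from its usual search locations, including the directory it is started from, and writes logs wherever logs_dir points. A sketch:

    [scrapyd]
    logs_dir = /var/log/scrapyd    # any writable directory works
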
0 votes · 0 answers

Scrapyd: how to override spider name using cmd arguments

I am using scrapyd (the project is deployed on an AWS EC2 instance) with a spider that accepts a seed url to start. I want to run the spider under a different name each time, so that I can manage items and logs easily on the EC2 instance. Locally I can do it like this: crawl…
Tasawer Nawaz (927)

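The spider's name is fixed by the deployed code, so it can't be swapped per run; but schedule.json does take an optional jobid plus arbitrary key=value pairs that reach the spider as arguments, which covers the same log and item bookkeeping. A sketch (host, names and the seed_url argument are illustrative):

    curl http://localhost:6800/schedule.json \
        -d project=myproject \
        -d spider=myspider \
        -d jobid=ec2-run-001 \
        -d seed_url=http://example.com/
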
0 votes · 2 answers

Scrapy recursively scraping craigslist

I am using scrapy to scrape craigslist and get all links, go to each link, and store the description of each page and the email for reply. Now I have written a scrapy script which goes through craigslist/sof.com and gets all job titles and urls. I want…
Scooby (3,371)

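A sketch of the usual pattern for this kind of two-level crawl: a CrawlSpider follows listing links and hands each detail page to a callback. The domain, URLs and selectors below are placeholders, not the asker's code:

    from scrapy.spiders import CrawlSpider, Rule
    from scrapy.linkextractors import LinkExtractor

    class ListingsSpider(CrawlSpider):
        name = 'listings'
        allowed_domains = ['example.org']          # placeholder domain
        start_urls = ['http://example.org/jobs/']  # placeholder listing page

        # Follow every link under /jobs/ and parse each linked page.
        rules = (
            Rule(LinkExtractor(allow=r'/jobs/'), callback='parse_item'),
        )

        def parse_item(self, response):
            yield {
                'title': response.css('title::text').get(),
                'url': response.url,
            }
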
0 votes · 1 answer

Getting spider instance from scrapyd

Is there a way to get the instance of the spider that runs when you schedule a run using scrapyd? I need to access the spider's attributes outside the run and can't use a json/csv file to do this.
Jean Ventura (27)

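A sketch of one workaround, not from the question itself: scrapyd runs each job in its own process, so the caller never sees the live spider object; instead, persist the needed attributes when the crawl ends, via the spider's closed() hook, and read them back afterwards (filename and attribute here are hypothetical):

    import json
    import scrapy

    class MySpider(scrapy.Spider):
        name = 'myspider'
        start_urls = ['http://example.com/']   # placeholder
        last_title = None

        def parse(self, response):
            self.last_title = response.css('title::text').get()

        def closed(self, reason):
            # Called when the crawl ends; dump attributes for external code.
            with open('spider_state.json', 'w') as f:
                json.dump({'last_title': self.last_title, 'reason': reason}, f)
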
0 votes · 1 answer

How does scrapyd determine the 'latest' version of a project?

According to the documentation, when deploying a project to scrapyd, I can use the git commit hash as the version by doing this: $ scrapyd-deploy default -p myproject --version GIT The documentation also says that scrapyd can keep multiple…
Kal (1,707)

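Scrapyd picks the "latest" by sorting the uploaded version strings (the classic egg storage reportedly compares them LooseVersion-style, which is why purely numeric or r<number> schemes sort most predictably). The ordering can be inspected directly; per the scrapyd API, the last entry is the version in use:

    curl "http://localhost:6800/listversions.json?project=myproject"
    # example response shape: {"status": "ok", "versions": ["r97", "r142"]}
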
0 votes · 1 answer

How do I call spiders from different projects with different pipelines from a python script?

I have three different spiders in different scrapy projects, called REsale, REbuy and RErent, each with their own pipeline that directs their output to various MySQL tables on my server. They all run OK when called using scrapy crawl. Ultimately,…
Mark (195)

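One way to do this with scrapyd rather than in-process machinery: deploy each project to a scrapyd instance, then a plain script schedules them over HTTP. A sketch; the project names come from the question, while the spider names, URL and the requests dependency are assumptions:

    import requests

    SCRAPYD = 'http://localhost:6800'

    jobs = [
        ('REsale', 'resale_spider'),   # spider names are placeholders
        ('REbuy',  'rebuy_spider'),
        ('RErent', 'rerent_spider'),
    ]

    for project, spider in jobs:
        # Each POST enqueues one job; scrapyd runs it with that
        # project's own settings and pipelines.
        r = requests.post(SCRAPYD + '/schedule.json',
                          data={'project': project, 'spider': spider})
        print(project, r.json())       # e.g. {'status': 'ok', 'jobid': '...'}
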
0 votes · 1 answer

Scrapyd: POST schedule.json from asp.net

I have scrapyd and a spider installed on a Unix machine, and everything works fine when I run curl http://localhost:6800/schedule.json -d project=myproject -d spider=somespider I can see the job status, logs and items on the web interface of scrapyd…
Syed Waqas (862)

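Since schedule.json is a plain form-encoded HTTP POST, any stack that can emit the request below can do what that curl line does; shown as raw HTTP with the same host and field values as the question:

    POST /schedule.json HTTP/1.1
    Host: localhost:6800
    Content-Type: application/x-www-form-urlencoded

    project=myproject&spider=somespider
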
0 votes · 1 answer

How to install the latest Scrapyd package?

I notice that the latest stable version of scrapy was released last week (2013-08-09). After updating scrapy to version 0.18, the previously installed scrapyd-0.17 was uninstalled by apt-get (Ubuntu 12.04) automatically. Is there a scrapyd-0.18? How to…
kev (155,172)

0 votes · 1 answer

How to install scrapyd on FreeBSD

I am trying to install scrapyd on FreeBSD, but I am getting this error: $ cd /usr/ports/www/py-scrapyd/ && sudo make install clean -bash: cd: /usr/ports/www/py-scrapyd/: No such file or directory I have installed scrapy using this command: $ cd…
Vaibhav Jain (5,287)

0 votes · 1 answer

Run Scrapy on IIS

I have an IIS server, and on it I have an ASP.NET MVC application. The MVC application will revolve around scraped data. Is there a way I can run Scrapy (a tool built in Python) on IIS, similar to how we can run PHP and WordPress on IIS?
J86 (14,345)