
The first time I installed scrapyd on Ubuntu 14.04, I didn't use the generic way.

With apt-get, scrapyd was installed as a service that could be started and had its (log/config/dbs...) directories in place; however, the bundled Scrapy version was very outdated.

So I installed scrapyd with pip in a virtualenv. Although it is up to date, I can't start scrapyd as a service and I can't find any of those directories. Where do I create the configuration file that sets the (eggs/dbs/items/log) paths?
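(For reference, scrapyd reads an INI-style config from `/etc/scrapyd/scrapyd.conf`, `~/.scrapyd.conf`, or a `scrapyd.conf` in the directory it is started from. A minimal sketch, with placeholder paths, might look like:)

```ini
[scrapyd]
; directories scrapyd will use for its state -- paths are placeholders
eggs_dir  = /var/lib/scrapyd/eggs
dbs_dir   = /var/lib/scrapyd/dbs
items_dir = /var/lib/scrapyd/items
logs_dir  = /var/log/scrapyd
http_port = 6800
```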

I have more than 10 spiders. Using a remote Ubuntu server, I want each spider to scrape periodically (once a week, for example) and send the data to MongoDB. Most of the spiders don't have to run simultaneously.

What is the best approach to run scrapyd as a service and run its spiders periodically in my Ubuntu server?
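(Since Ubuntu 14.04 uses Upstart rather than systemd, one way to get service behaviour back for a virtualenv install is an Upstart job pointing at the virtualenv's `scrapyd` binary. A sketch, assuming the virtualenv lives at `/home/user/venv` and should run as user `user`:)

```
# /etc/init/scrapyd.conf -- Upstart job (Ubuntu 14.04); paths and user are placeholders
description "scrapyd in a virtualenv"
start on runlevel [2345]
stop on runlevel [016]
respawn
setuid user
exec /home/user/venv/bin/scrapyd
```

(Then `sudo start scrapyd` / `sudo stop scrapyd` should work like any other service.)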

user2243952
  • Not sure I get your question or not... but you can run `scrapyd` in background ... and then schedule your spider in Cron like this `curl http://localhost:6800/schedule.json -d project=myproject -d spider=somespider` – Umair Ayub Mar 15 '17 at 15:58
  • Run scrapyd as a background task (I have found screen to be useful) with a supervisor (supervisord) – Verbal_Kint Mar 15 '17 at 22:47
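(Following the cron suggestion above, the weekly schedule could be a crontab entry per spider, hitting scrapyd's `schedule.json` endpoint. A sketch, where `myproject`/`somespider` are placeholders:)

```
# crontab -e : run somespider every Monday at 03:00
0 3 * * 1  curl http://localhost:6800/schedule.json -d project=myproject -d spider=somespider
```

(Staggering the minute/hour fields per spider keeps them from running simultaneously.)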
