
I have a single Scrapy project with multiple spiders, hosted on a scrapyd instance. I would like to be able to dynamically change settings from the project's settings.py file (such as DOWNLOADER_MIDDLEWARES).

Is it possible to change these settings at the time of sending a request to the scrapyd instance? Note that I don't want to create multiple projects, as that would mean duplicating common code across them.

Thanks

trajan

1 Answer


You can pass parameters to scrapyd and override settings using the `-d` argument:

curl http://localhost:6800/schedule.json -d project=myproject -d spider=somespider -d setting=DOWNLOAD_DELAY=2 -d arg1=val1
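As an aside, scrapyd forwards any extra `-d` parameters (anything other than `project`, `spider`, `setting`, and scrapyd's own job parameters) to the spider as spider arguments. A minimal sketch of receiving `arg1` from the example above (the spider name and argument are just the placeholders used in the curl line):

```python
import scrapy

class SomeSpider(scrapy.Spider):
    name = "somespider"

    def __init__(self, arg1=None, *args, **kwargs):
        super(SomeSpider, self).__init__(*args, **kwargs)
        # Holds the value passed via -d arg1=val1 (None if not supplied)
        self.arg1 = arg1
```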
dataisbeautiful
  • How about starting Scrapy with a different settings file, as Django allows us to? – sergiuz Sep 23 '14 at 13:14
  • Thanks, but how would I set a dictionary style setting? I've tried the following `curl http://localhost:6800/schedule.json -d project=GenericCrawl -d spider=Generic -d "setting=ITEM_PIPELINES={'GenericCrawl.pipelines.DefaultValuesPipeline': 299,'GenericCrawl.pipelines.MySQL':300}" ` but I get the following error `[Launcher,4746/stderr] dictionary update sequence element #0 has length 1; 2 is required` – trajan Sep 24 '14 at 06:23
  • I haven't tried it, but what I would probably do is set a variable that's checked in each middleware and then pass that to scrapyd. Not as elegant, but it'll work. – dataisbeautiful Sep 24 '14 at 06:41
  • I've finally got a chance to come back to this. In the end, passing in a "profile" variable and then setting the dictionary-style settings based on it was the only way to go (see the sketch after these comments). Thanks – trajan Oct 12 '14 at 20:47
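For anyone hitting the same wall with dictionary-style settings, here is a minimal sketch of the "profile" approach described in the comments above. All names (`DefaultValuesPipeline`, the profile values) are hypothetical: every pipeline stays registered in `ITEM_PIPELINES` in settings.py, and each one no-ops unless the job's profile enables it.

```python
class DefaultValuesPipeline(object):
    # Profiles for which this pipeline should actually run (assumed values)
    enabled_profiles = {"full", "defaults-only"}

    def process_item(self, item, spider):
        # "profile" arrives as a spider argument, e.g.
        #   curl http://localhost:6800/schedule.json -d project=myproject \
        #        -d spider=somespider -d profile=full
        profile = getattr(spider, "profile", None)
        if profile not in self.enabled_profiles:
            return item  # not enabled for this job: pass the item through
        # ... pipeline-specific processing goes here ...
        return item
```

The same pattern works for downloader middlewares: keep them all enabled in DOWNLOADER_MIDDLEWARES and have each `process_request` check the spider's profile before doing anything.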