
I have two Scrapy projects with the following configurations.

Project1's scrapy.cfg:

[settings]
default = Project1.settings
[deploy]
url = http://localhost:6800/
project = Project1

[scrapyd]
eggs_dir    = eggs
logs_dir    = logs
logs_to_keep = 500
dbs_dir     = dbs
max_proc    = 5
max_proc_per_cpu = 10
http_port   = 6800
debug       = off
runner      = scrapyd.runner
application = scrapyd.app.application

and Project2's scrapy.cfg:

[settings]
default = Project2.settings
[deploy]
url = http://localhost:6800/
project = Project2

[scrapyd]
eggs_dir    = eggs
logs_dir    = logs
logs_to_keep = 500
dbs_dir     = dbs
max_proc    = 5
max_proc_per_cpu = 10
http_port   = 6800
debug       = off
runner      = scrapyd.runner
application = scrapyd.app.application

But when I look at http://localhost:6800/jobs, I always see only 8 jobs running, which suggests my `max_proc_per_cpu` setting is not being applied. I deleted the projects with the following commands:

curl http://localhost:6800/delproject.json -d project=Project1

curl http://localhost:6800/delproject.json -d project=Project2

and deployed them again to make sure the new changes took effect, but the number of running spiders is still 8.

My VPS has two CPU cores, as reported by `python -c 'import multiprocessing; print(multiprocessing.cpu_count())'`.

How can I see the configuration my Scrapyd deployment is actually using? And how can I set the maximum number of processes per CPU?

Yuseferi
  • As I see it, `max_proc` should be much bigger than `max_proc_per_cpu`. I think it has to be `max_proc = max_proc_per_cpu * number_of_cores`. In the documentation for [max_proc](https://doc.scrapy.org/en/0.12/topics/scrapyd.html#max-proc) you can see: `"max_proc - The maximum number of concurrent Scrapy process that will be started. If unset or 0 it will use the number of cpus available in the system multiplied by the value in max_proc_per_cpu option."` – furas Dec 09 '17 at 21:41
  • @furas yes, you're right, but I've tested it with `max_proc = 0` and the result was the same. Another note: the maximum running now is 8, not 10; if it were 10, your theory would hold. – Yuseferi Dec 10 '17 at 05:27
  • Have you restarted scrapyd after updating the configuration? – oste-popp Jan 17 '18 at 19:44
  • @oste-popp, yes — I rebuilt the Scrapyd projects, redeployed them with the new `cfg` configuration, and restarted scrapyd. – Yuseferi Jan 17 '18 at 20:01
  • I may be wrong, but I don't think you can set this in the project's `scrapy.cfg`; you have to change it directly in Scrapyd's own configuration file. The documented file locations are listed here: `http://scrapyd.readthedocs.io/en/stable/config.html` – oste-popp Jan 17 '18 at 20:11
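The formula quoted from the documentation in the comments above can be sketched in plain Python. The `effective_max_proc` helper below is illustrative, not part of Scrapyd; note that Scrapyd's documented default for `max_proc_per_cpu` is 4, which on a two-core machine would produce exactly the 8 jobs observed:

```python
import multiprocessing

def effective_max_proc(max_proc, max_proc_per_cpu, cpus=None):
    """Mimic Scrapyd's documented cap: use max_proc if it is non-zero,
    otherwise the CPU count multiplied by max_proc_per_cpu."""
    if max_proc:
        return max_proc
    if cpus is None:
        cpus = multiprocessing.cpu_count()
    return cpus * max_proc_per_cpu

# Scrapyd's documented default max_proc_per_cpu is 4; with 2 cores:
print(effective_max_proc(0, 4, cpus=2))   # -> 8, matching the jobs page

# With the intended setting of max_proc_per_cpu = 10 on 2 cores:
print(effective_max_proc(0, 10, cpus=2))  # -> 20
```

Seeing 8 rather than 20 is therefore consistent with Scrapyd running on its defaults, i.e. never reading the `[scrapyd]` section placed in the projects' `scrapy.cfg` files.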

1 Answer


According to the documentation, on Unix-like systems the configuration file is first looked for at /etc/scrapyd/scrapyd.conf. I put the configuration file there, but it did not work. It finally worked when I kept scrapyd.conf as a hidden file in the directory from which the Scrapyd server was started — for me, that happened to be the home directory.

You can read about the details here: https://scrapyd.readthedocs.io/en/stable/config.html
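For illustration, a minimal version of such a hidden file might look like the fragment below (the values mirror the ones from the question, and the `[scrapyd]` section name comes from the documentation; restart the Scrapyd daemon after editing for it to take effect):

```ini
# ~/.scrapyd.conf  (or /etc/scrapyd/scrapyd.conf)
[scrapyd]
max_proc         = 0
max_proc_per_cpu = 10
http_port        = 6800
```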

Tarif Ezaz