
I have a Scrapy project with ~10 spiders, and I run several of them simultaneously using Scrapyd. However, I'm not sure my CONCURRENT_REQUESTS setting is correct.

Currently my CONCURRENT_REQUESTS is 32, but I have seen recommendations that this value should be much higher (>= 100). My question: is this the total number of concurrent requests that all running spiders can make together, or the number of concurrent requests a single spider can make?

I'm assuming it's the total across all spiders, and that's why the recommendation is to set it as high as possible. I also see that I can limit the number of requests each spider makes to a given site using CONCURRENT_REQUESTS_PER_DOMAIN.
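For reference, both settings under discussion live in the project's `settings.py`. A minimal sketch (the numeric values here are illustrative, not recommendations):

```python
# settings.py -- sketch of the two Scrapy settings discussed above.
# Values are illustrative only.

# Upper bound on concurrent requests performed by a single
# Scrapy crawler process (across all domains it is hitting).
CONCURRENT_REQUESTS = 100

# Cap on concurrent requests to any one domain; for a spider that
# targets a single site, this is the limit that actually applies.
CONCURRENT_REQUESTS_PER_DOMAIN = 16
```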

  • I don't have any experience with scrapyd, however I can say with certainty that the CONCURRENT_REQUESTS option in the scrapy settings applies to each spider individually. – Alexander Dec 22 '22 at 01:30
  • @Alexander Thank you for your prompt response. You are right, each spider gets the value of `CONCURRENT_REQUESTS`. I will do some tests with scrapyd, and as soon as I have the results I will add the answer to this question. – Jalil SA Dec 26 '22 at 18:56

1 Answer


Scrapyd can manage multiple projects, and each project can contain multiple spiders. CONCURRENT_REQUESTS operates per project, i.e. it is shared by all spiders running in that project.

Reference: issue #463
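Note that how many spider processes Scrapyd runs at once is governed separately, in `scrapyd.conf`, not by Scrapy's settings. A sketch using Scrapyd's documented `max_proc` / `max_proc_per_cpu` options (values shown are Scrapyd's defaults):

```ini
[scrapyd]
# 0 = no absolute process limit; Scrapyd falls back to max_proc_per_cpu
max_proc = 0
# run up to 4 spider processes per CPU
max_proc_per_cpu = 4
```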
