I am currently using Scrapyd to run a crawling spider, and the DEPTH_LIMIT setting is defined in the Scrapy project settings. I was wondering how to pass the depth limit as a parameter through Scrapyd, so it can be set dynamically for every crawl, as requested by the user. I believed I could only act on the spiders and pipelines of Scrapy.
EDIT
Thanks to @John Smith's response, I found out it is possible to pass custom settings to the `schedule` method of the python-scrapyd-api client:
from scrapyd_api import ScrapydAPI

scrapyd = ScrapydAPI('http://localhost:6800')  # default Scrapyd endpoint

settings = {
    'unique_id': unique_id,  # unique ID for database instance
    'USER_AGENT': 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
    'DEPTH_LIMIT': 1,
}

# Schedule a new crawling task from Scrapyd; extra keyword arguments
# (url, domain) are forwarded to the spider as spider arguments.
task_id = scrapyd.schedule('default', 'spider-name', settings=settings, url=url, domain=domain)
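For reference, the client library maps the `settings` dict onto Scrapyd's `schedule.json` endpoint, which accepts repeated `setting=KEY=VALUE` form fields, so the same thing can be done with a plain HTTP POST. A minimal sketch of how that payload is built (the actual request is commented out, since it assumes a Scrapyd server running at `localhost:6800`; the project and spider names are placeholders):

```python
# Build the form payload for Scrapyd's schedule.json endpoint.
# Each Scrapy setting becomes one "setting=KEY=VALUE" field; an HTTP
# client like requests encodes a list value as repeated form fields.
settings = {
    'USER_AGENT': 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
    'DEPTH_LIMIT': 1,
}

payload = {
    'project': 'default',       # Scrapyd project name (placeholder)
    'spider': 'spider-name',    # spider to schedule (placeholder)
    'setting': ['%s=%s' % (key, value) for key, value in settings.items()],
}

# To actually schedule the job against a running Scrapyd instance:
# import requests
# response = requests.post('http://localhost:6800/schedule.json', data=payload)
# response.json() contains the status and the job id on success.
```

This makes the "dynamic" part explicit: the depth limit is just another string in the `setting` list, so it can be filled in from user input at request time rather than baked into the project settings.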