I have made a Scrapy spider that runs successfully from a script located in the root directory of the project. Because I need to run multiple spiders from different projects from the same script (a Django app will call the script upon the user's request), I moved the script from the root of one of the projects to the parent directory. Since the move, the script no longer picks up the project's custom settings, so the scraped results are not pipelined into the database tables. Here is the code from the Scrapy docs that I'm using to run the spider from a script:
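from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

def spiderCrawl():
    # Load the project's settings (pipelines, etc.), then override the user agent.
    settings = get_project_settings()
    settings.set('USER_AGENT', 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)')
    process = CrawlerProcess(settings)
    process.crawl(MySpider3)  # MySpider3 is defined in the same spider module
    process.start()  # blocks until the crawl finishes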
Is there some extra module that needs to be imported in order to get the project settings from outside of the project? Or do some additions need to be made to this code? Below is the code for the script that runs the spiders. Thanks.
from ticket_city_scraper.ticket_city_scraper import *
from ticket_city_scraper.ticket_city_scraper.spiders import tc_spider
from vividseats_scraper.vividseats_scraper import *
from vividseats_scraper.vividseats_scraper.spiders import vs_spider
tc_spider.spiderCrawl()
vs_spider.spiderCrawl()
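
For context, my understanding is that get_project_settings() finds the settings module either through the SCRAPY_SETTINGS_MODULE environment variable or through a scrapy.cfg file in the working directory or one of its parents. That would explain why the lookup fails once the script lives above the project. Below is a minimal sketch of what I could try, assuming the environment variable is the right lever here. The helper names scrapy_settings_module and load_project_settings are hypothetical (they are not part of Scrapy), and the dotted settings path is an assumption about my own layout.

```python
import os
from contextlib import contextmanager


@contextmanager
def scrapy_settings_module(module_path):
    """Temporarily point SCRAPY_SETTINGS_MODULE at one project's settings.

    Hypothetical helper, not a Scrapy API. module_path is a dotted import
    path that must be importable from wherever the script runs.
    """
    previous = os.environ.get("SCRAPY_SETTINGS_MODULE")
    os.environ["SCRAPY_SETTINGS_MODULE"] = module_path
    try:
        yield
    finally:
        # Restore the earlier value so the next project starts clean.
        if previous is None:
            os.environ.pop("SCRAPY_SETTINGS_MODULE", None)
        else:
            os.environ["SCRAPY_SETTINGS_MODULE"] = previous


def load_project_settings(module_path):
    """Return the Scrapy settings for the project named by module_path."""
    # Imported lazily so this module can be imported without Scrapy installed.
    from scrapy.utils.project import get_project_settings

    with scrapy_settings_module(module_path):
        return get_project_settings()
```

Each spiderCrawl() could then call something like load_project_settings('ticket_city_scraper.ticket_city_scraper.settings') instead of get_project_settings(). I assume the pipeline and spider module paths named inside each settings file would also have to be importable from the script's new location.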