
I have a Scrapy project with a web-based interface running on Apache (XAMPP). The interface lets the user create, modify and schedule spiders, and it calls scrapyd on port 6800 to list the pending, running and finished spiders. It all works superbly with one exception: if scrapyd isn't running, I can't schedule spiders or get their status from scrapyd.

Currently I make a request to http://localhost:6800; if that fails, I display a message stating that the server is not running, along with a link to start it. Clicking the link calls a Python page that uses os.system to start the server. I'm fairly new to Python and am still getting to grips with its OS functions. I've tried several other methods that I found here on Stack Overflow (such as subprocess.call and os.popen/popen2/popen3), but none of them seem to work.
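
For reference, the availability check described above could be sketched roughly as follows. This is a minimal sketch assuming Python 3 and only the standard library; the function name scrapyd_is_running and the 2-second timeout are illustrative choices, not part of the actual project.

```python
import urllib.error
import urllib.request


def scrapyd_is_running(url="http://localhost:6800", timeout=2.0):
    """Return True if an HTTP server answers at `url`, False otherwise.

    Hypothetical helper: the name and the default timeout are assumptions.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied with an error status, so something is listening.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, or host unreachable.
        return False
```

The web page would then show the "server is not running" message and the start link whenever this returns False.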

I know that to start scrapyd using scrapy server you need to be in the Scrapy project directory, so I also tried calling it using twistd -ny extras/scrapyd.tac.
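
Because the command depends on the working directory, one way to launch it from Python is subprocess.Popen with its cwd argument, which starts the child in a given directory without blocking the caller. The following is only a sketch under assumptions: the helper name start_in_directory and the example path are hypothetical, and whether the child keeps running after the web request ends depends on the server setup, which is not verified here.

```python
import subprocess


def start_in_directory(command, project_dir):
    """Start `command` in the background with `project_dir` as its working directory.

    Hypothetical helper: returns the Popen object without waiting for the
    process to finish. Output is discarded so the caller is not blocked on pipes.
    """
    return subprocess.Popen(
        command,
        cwd=project_dir,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )


# Example call (the project path is a placeholder, not a real location):
# start_in_directory(["twistd", "-ny", "extras/scrapyd.tac"], "/path/to/scrapy/project")
```

Passing the command as a list avoids shell quoting issues, and cwd removes the need to change the directory of the web process itself.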

What is the best way for me to either run a scrapy command or make the call to twistd through Python?
