4

I have a Flask project running with a subprocess call to a Scrapy spider:

class Utilities(object):

    @staticmethod
    def scrape(inputs):
        job_id = str(uuid.uuid4())
        project_folder = os.path.abspath(os.path.dirname(__file__))
        subprocess.call(['scrapy', 'crawl', "ExampleCrawler", "-a", "inputs=" + str(inputs), "-s", "JOB_ID=" + job_id],
                            cwd="%s/scraper" % project_folder)
        return job_id

Even though I have 'Attach to subprocess automatically while debugging' enabled in the projects' Python debugger, the breakpoints inside the spider won't work. The first breakpoint which works again would be one on return job_id.

Here is a part of the code from the spider where I expect the breakpoints to work:

from scrapy.http import FormRequest
from scrapy.spiders import Spider
from scrapy.loader import ItemLoader
from Handelsregister_Scraper.scraper.items import Product
import re


class ExampleCrawler(Spider):
    name = "ExampleCrawler"

    def __init__(self, inputs='', *args, **kwargs):
        super(ExampleCrawler, self).__init__(*args, **kwargs)
        self.start_urls = ['https://www.example-link.com']
        self.inputs = inputs

    def parse(self, response):
        yield FormRequest(self.start_urls[0], callback=self.parse_list_elements, formdata=self.inputs)

I can't find any solution to this other than enabling said option which i did.

Any suggestions how to get breakpoints inside the spider work?

systderr
  • 75
  • 8

2 Answers2

0

debugger doesn't work because is not a subprocess, but an external call. see this answer for possible workarounds.

Community
  • 1
  • 1
valepert
  • 96
  • 5
0

Try configuring this python running configuration in "Edit Configurations". After that, run in debug mode.

Example

Ráfagan
  • 2,165
  • 2
  • 15
  • 22