I want to use pyquery with Scrapy, but `from scrapy.selector import PyQuerySelector` raises `ImportError: cannot import name PyQuerySelector` when I run the spider.

I followed this specific gist https://gist.github.com/joehillen/795180 to implement pyquery.

Any suggestions or tutorials that can help me get this job done?

  • 2
    This gist is linked to this (closed) Pull Request: https://github.com/scrapy/scrapy/pull/358/files . You'll have to apply this patch or perhaps contact the author (https://gist.github.com/joehillen) – paul trmbrth Jan 21 '14 at 10:31
  • 1
    Why not use just `pq = PyQuery(response.body)`? – R. Max Jan 22 '14 at 18:59

1 Answer

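Declare your spider class, define its rules, and pass `parse_item` as the `callback` of each rule. By default Scrapy sends responses to the `parse()` method, so a rule-based spider needs a callback with a different name. Inside that callback, build a PyQuery object directly from the response body: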

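from pyquery import PyQuery


# Both functions below are methods of your spider class.
def parse_item(self, response):
    # Parse the raw response body into a PyQuery document.
    pyquery_obj = PyQuery(response.body)
    header = self.get_header(pyquery_obj)
    return {
        'header': header,
    }


def get_header(self, pyquery_obj):
    # Return the text of the element with id="page_head".
    return pyquery_obj('#page_head').text()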
marr75
Raghu