
My source code is as follows:

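# Spider
from scrapy.selector import HtmlXPathSelector
from scrapy.spider import BaseSpider

from qna_crawler.items import QnaItem  # assumed module path for the item class


class test_crawler(BaseSpider):
    name = 'test'
    # allowed_domains takes bare domain names, not URLs
    allowed_domains = ['test.com']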
    start_urls = ['http://test.com/test']

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        question_info = hxs.select('//div[contains(@class, "detail")]')
        answer_info = hxs.select('//div[contains(@class, "doctor_ans")]')

        row_for_question = question_info.select('table/tr/td')
        qna = QnaItem()
        qna['title'] = question_info.select('h2/text()').extract()
        qna['category'] = row_for_question[3].select('a/text()').extract()
        qna['question'] = row_for_question[7].select('text()').extract()
        qna['answer'] = answer_info.select('p[contains(@class,"MsoNormal")]/span/span/span/font/text()').extract()
        return qna

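# Pipeline
from scrapy import signals
from scrapy.contrib.exporter import XmlItemExporter
from scrapy.xlib.pydispatch import dispatcher

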
class XmlExportPipeline(object):

    def __init__(self):
        dispatcher.connect(self.spider_opened, signals.spider_opened)
        dispatcher.connect(self.spider_closed, signals.spider_closed)
        self.files = {}

    def spider_opened(self, spider):
        file = open('%s_products.xml' % spider.name, 'w+b')
        self.files[spider] = file
        self.exporter = XmlItemExporter(file)
        self.exporter.start_exporting()

    def spider_closed(self, spider):
        self.exporter.finish_exporting()
        file = self.files.pop(spider)
        file.close()

    def process_item(self, item, spider):
        self.exporter.export_item(item)
        return item

When I run it in the Scrapy shell (scrapy shell http://test.com/test), it works fine and I don't get any errors. However, when I run the command "scrapy crawl test", I get the error below:

Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\twisted\internet\base.py", line 1178, in mainLoop
    self.runUntilCurrent()
  File "C:\Python27\lib\site-packages\twisted\internet\base.py", line 800, in runUntilCurrent
    call.func(*call.args, **call.kw)
  File "C:\Python27\lib\site-packages\twisted\internet\defer.py", line 368, in callback
    self._startRunCallbacks(result)
  File "C:\Python27\lib\site-packages\twisted\internet\defer.py", line 464, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "C:\Python27\lib\site-packages\twisted\internet\defer.py", line 551, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "E:\Projects\tysk-osqa\osqa\scrapy\qna_crawler\spiders\qna.py", line 14, in parse
    question_info = HtmlXPathSelector(response).select('//div[contains(@class, "detail")]')
  File "C:\Python27\lib\site-packages\scrapy-0.14.4-py2.7.egg\scrapy\selector\dummysel.py", line 16, in _raise
    raise RuntimeError("No selectors backend available. " \
exceptions.RuntimeError: No selectors backend available. Please install libxml2 or lxml

That message is not accurate, because I have already installed both libxml2 and lxml. I downloaded and installed the 64-bit binary packages from http://www.lfd.uci.edu/~gohlke/pythonlibs/. In addition, I can import lxml and libxml2 from cmd successfully.
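
To double-check the import claim, a small script like the one below can be run with the same interpreter that launches "scrapy crawl". It is only a sketch: the helper name check_backends is hypothetical, and the module names "lxml.etree" and "libxml2" are my assumption about what a selector backend would need to import. It prints which python.exe is actually running and whether each module loads, including the error text if it does not.

```python
import importlib
import sys


def check_backends(names=("lxml.etree", "libxml2")):
    """Map each module name to None if it imports, else to the error text."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # ImportError, or a DLL load failure on Windows
            results[name] = "%s: %s" % (type(exc).__name__, exc)
    return results


if __name__ == "__main__":
    # The path printed here should match the Python that runs Scrapy.
    print("interpreter: %s" % sys.executable)
    for name, error in sorted(check_backends().items()):
        print("%-12s %s" % (name, error or "OK"))
```

If the interpreter path printed here differs from the one used in the cmd test, the two tests are not looking at the same site-packages.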

Please help me to resolve this problem.

Thank you so much.

Thinh Phan
  • Try debugging by putting some print statements in `C:\Python27\lib\site-packages\scrapy-0.14.4-py2.7.egg\scrapy\selector\__init__.py` – warvariuc Aug 16 '12 at 04:04

2 Answers


You need to install the 32-bit versions of lxml and libxml2. Note that when you install the Windows binaries, they are only installed for the system Python (the one found in the registry).
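
Whether the 32-bit or 64-bit binaries are the right ones depends on the interpreter, not on Windows itself. A minimal sketch for checking this from the Python in question (the function name interpreter_bits is just illustrative):

```python
import struct
import sys


def interpreter_bits():
    """Pointer size of the running interpreter in bits (32 or 64)."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print("executable: %s" % sys.executable)
    print("bits:       %d" % interpreter_bits())
```

The binary packages you install have to match the number printed here.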

Burhan Khalid
  • Hi Burhan, I am currently running the 64-bit version of Python, so I can't install the 32-bit versions of lxml and libxml2. Can you confirm that lxml and libxml2 don't run on a 64-bit system? – Thinh Phan Aug 16 '12 at 05:25

I think you have not installed libxml2 and lxml into the virtualenv you are running Scrapy from.

Try: pip install lxml

and add lxml to requirements.txt
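
If you are not sure whether a virtualenv is active at all, a rough check like this can tell you. It is a sketch that relies on the sys.real_prefix attribute set by the classic virtualenv tool and on sys.base_prefix in newer Pythons; the function name in_virtualenv is hypothetical.

```python
import sys


def in_virtualenv():
    """Best-effort guess at whether this interpreter runs inside a virtualenv."""
    # Classic virtualenv sets sys.real_prefix; the venv module (Python 3.3+)
    # makes sys.base_prefix differ from sys.prefix instead.
    base = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base != sys.prefix


if __name__ == "__main__":
    print("prefix:        %s" % sys.prefix)
    print("in virtualenv: %s" % in_virtualenv())
```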

You knows who
  • Hi Suoinguon, I can't install lxml and libxml2 with the command "pip install lxml" on Windows because I don't have a C/C++ compiler. I installed lxml and libxml2 from the binary packages instead. – Thinh Phan Aug 16 '12 at 05:31