I managed to find the attribute I want to isolate using the debugging spider, but I'm not sure I incorporated it into my spider correctly. I don't get an explicit error message when the spider runs, so I think I just entered the selector incorrectly.

The website I'm crawling is "http://www.smiling-moose.com/events/index.php". The XPath I type into the debugging spider is "response.xpath('//div[@class="show_sec_button"]/text()')", which returns exactly the data I'm looking for.

Here is my spider:

import scrapy

from smiling_moose.items import SMItem

class Smspider (scrapy.Spider):
    name = "smspider"
    allowed_domains = ["http://www.smiling-moose.com/index.php"]
    start_urls = [
         "http://www.smiling-moose.com/events/index.php",
    ]

def parse(self, response):
    for sel in response.xpath('//div'):
        item = SMItem()
        item['desc'] = response.xpath('//*[@class="show_sec_band"]/text()').extract()

Here is my items.py:

import scrapy


class SMItem(scrapy.Item):
    desc = scrapy.Field()

Is there anything I need to change in the spider? I can post my command prompt error if needed.

Thank you

1 Answer

First, change allowed_domains so that it lists domain names only, not full URLs:

allowed_domains = ["smiling-moose.com"]
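If it helps to see where that value comes from, here is a small sketch that derives it from the start URL using only the standard library. The helper name to_allowed_domain is made up for illustration and is not part of Scrapy; stripping a leading "www." is my own assumption, made because a bare domain in allowed_domains also covers its subdomains.

```python
from urllib.parse import urlparse


def to_allowed_domain(url):
    """Return the bare host of a URL: no scheme, no path, no leading 'www.'.

    Hypothetical helper for illustration only; Scrapy does not provide it.
    """
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


start_url = "http://www.smiling-moose.com/events/index.php"
allowed_domains = [to_allowed_domain(start_url)]
print(allowed_domains)  # ['smiling-moose.com']
```

In practice you would just type the domain by hand, as above; the point is only that scheme and path do not belong in that list.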

Second, yield the item from parse(); otherwise the spider builds it but never hands it to Scrapy:

item['desc'] = response.xpath('//*[@class="show_sec_band"]/text()').extract()
yield item
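As for why the spider ran without any error yet produced nothing: a callback that never yields or returns anything simply gives Scrapy nothing to collect. A rough plain-Python sketch of the difference (no Scrapy involved; the function names and the sample text "Band A" are made up for illustration):

```python
def parse_without_yield(texts):
    # Mirrors the original parse(): the item is built, then discarded.
    item = {"desc": texts}


def parse_with_yield(texts):
    # Yielding turns the function into a generator that hands the item back.
    item = {"desc": texts}
    yield item


print(parse_without_yield(["Band A"]))     # None
print(list(parse_with_yield(["Band A"])))  # [{'desc': ['Band A']}]
```

The first call completes silently and returns None, which matches a spider that finishes cleanly but scrapes no items.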
Djunzu