I'm following this tutorial, https://www.practicalecommerce.com/Monitor-Competitor-Prices-with-Python-and-Scrapy, exactly as written, step by step, but when I get to the part where I run the spider with the command:
scrapy crawl massEffect -o results.csv
I get this error:
NameError: global name 'TfawItem' is not defined
What am I doing wrong?
Here's my items.py:
# -*- coding: utf-8 -*-

# Define here the models for your scraped items
#
# See documentation in:
# http://doc.scrapy.org/en/latest/topics/items.html

import scrapy


class TfawItem(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
    title = scrapy.Field()
    price = scrapy.Field()
    upc = scrapy.Field()
    url = scrapy.Field()
My massEffect.py:
# -*- coding: utf-8 -*-
import scrapy


class MasseffectSpider(scrapy.Spider):
    name = 'massEffect'
    allowed_domains = ['tfaw.com']
    start_urls = [
        'http://www.tfaw.com/Companies/Dark-Horse/Series?series_name=Mass+Effect',
    ]

    def parse(self, response):
        for href in response.css('div a.boldlink::attr(href)'):
            url = response.urljoin(href.extract())
            yield scrapy.Request(url, callback=self.parse_detail_page)

    def parse_detail_page(self, response):
        comic = TfawItem()
        comic['title'] = response.css('div.iconistan + b span.blackheader::text').extract()
        comic['price'] = response.css('span.blackheader ~ span.redheader::text').re('[$]\d+\.\d+')
        comic['upc'] = response.xpath('/html/body/table[1]/tr/td[4]/table[3]/tr/td/table/tr/td[contains(., "UPC:")]/following-sibling::td[1]/text()').extract()
        comic['url'] = response.url
        yield comic
And the directory structure of my project:
tfaw/
    scrapy.cfg
    results.csv
    tfaw/
        __init__.py
        __init__.pyc
        items.py
        middlewares.py
        pipelines.py
        settings.py
        settings.pyc
        spiders/
            __init__.py
            __init__.pyc
            massEffect.py
            massEffect.pyc