
I have multiple spiders in a single scrapy project.

I want to write a separate output text file for each spider, named with the spider name and a timestamp.

When I had a single spider, I created the file in the __init__ method. Now I am trying the approach below, where the upromise spider should generate two output files and every other spider only one.

class MallCrawlerPipeline(object):

    def spider_opened(self, spider):
        self.aWriter = csv.writer(open('../%s_%s.txt' % (spider.name, datetime.now().strftime("%Y%m%d_%H%M%S")), 'wb'),
            delimiter=',', quoting=csv.QUOTE_MINIMAL)
        self.aWriter.writerow(['mall', 'store', 'bonus', 'per_action', 'more_than','up_to', 'deal_url', 'category'])

        if 'upromise' in spider.name:
            self.cWriter = csv.writer(
                open('../%s_coupons_%s.txt' % (spider.name, datetime.now().strftime("%Y%m%d_%H%M%S")), 'wb'),
                delimiter=',', quoting=csv.QUOTE_MINIMAL)
            self.cWriter.writerow(['mall', 'store', 'bonus', 'per_action', 'more_than','up_to', 'deal_url', 'category'])

    def process_item(self, item, spider):
        self.aWriter.writerow([item['mall'], item['store'], item['bonus'], item['per_action'],
                               item['more_than'], item['up_to'], item['deal_url'], item['category']])

        return item

But I get this error:

 File "C:\Users\akhter\Dropbox\akhter\mall_crawler\mall_crawler\pipelines.py", line 24, in process_item
    self.aWriter.writerow([item['mall'], item['store'], item['bonus'], item['per_action'],
exceptions.AttributeError: 'MallCrawlerPipeline' object has no attribute 'aWriter'

Any help would be appreciated. Thanks in advance.

Erwin Brandstetter
akhter wahab
  • Where is aWriter defined? Should this object be inheriting from something other than `object` perhaps? – Silas Ray Mar 07 '12 at 20:42
  • @sr2222 Thanks for your reply, but I am sorry, I don't have much of a grip on Python. Can you please give me a detailed answer? – akhter wahab Mar 08 '12 at 07:00

2 Answers


Are you sure you're always running obj.spider_opened(...) before obj.process_item(...)? It seems you're not, as after the first method call that attribute should have been added to the object.

By the way, if that first method call is always needed, it may make sense to move its setup into __init__ or to call it from there.
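To illustrate the point, here is a minimal sketch in plain Python with no Scrapy involved. DemoPipeline and try_process are hypothetical names made up for this example. It only shows that an attribute assigned inside one method does not exist until that method has actually run.

```python
class DemoPipeline(object):
    """Hypothetical stand-in for the pipeline in the question."""

    def spider_opened(self, spider_name):
        # The attribute only comes into existence when this method runs.
        self.rows = []

    def process_item(self, item):
        # Raises AttributeError if spider_opened was never called.
        self.rows.append(item)
        return item


def try_process(pipeline, item):
    """Return the processed item, or the error text if the attribute is missing."""
    try:
        return pipeline.process_item(item)
    except AttributeError as exc:
        return "AttributeError: %s" % exc


fresh = DemoPipeline()
before = try_process(fresh, {"store": "a"})  # spider_opened has not run yet

fresh.spider_opened("upromise")
after = try_process(fresh, {"store": "a"})   # now self.rows exists
```

Here `before` holds the error text and `after` holds the item, which matches the traceback in the question: process_item ran on an object whose spider_opened had never been invoked.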

Eduardo Ivanec

Thanks, everyone. I found an answer: I need to connect spider_opened to the spider_opened signal in the __init__ method, as shown here; otherwise spider_opened is never called. I am still open to suggestions.

def __init__(self):
    dispatcher.connect(self.spider_opened, signals.spider_opened)
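
A possible alternative, as a rough sketch: it assumes a Scrapy version whose item pipelines support the open_spider and close_spider hooks, which Scrapy calls itself, so no signal has to be connected by hand. The class name PerSpiderCsvPipeline, the out_dir parameter, and the FIELDS list are names invented for this example, and the code assumes Python 3 (text mode with newline='' rather than 'wb'). The extra coupons file for upromise is left out to keep it short.

```python
import csv
import os
from datetime import datetime

FIELDS = ['mall', 'store', 'bonus', 'per_action',
          'more_than', 'up_to', 'deal_url', 'category']


class PerSpiderCsvPipeline(object):
    """Hypothetical pipeline: one timestamped output file per spider."""

    def __init__(self, out_dir='.'):
        self.out_dir = out_dir  # assumed output directory
        self.files = {}         # spider name -> open file handle
        self.writers = {}       # spider name -> csv writer

    def open_spider(self, spider):
        # Build a file name from the spider name and the current time.
        stamp = datetime.now().strftime('%Y%m%d_%H%M%S')
        path = os.path.join(self.out_dir, '%s_%s.txt' % (spider.name, stamp))
        handle = open(path, 'w', newline='')
        writer = csv.writer(handle, delimiter=',', quoting=csv.QUOTE_MINIMAL)
        writer.writerow(FIELDS)  # header row
        self.files[spider.name] = handle
        self.writers[spider.name] = writer

    def process_item(self, item, spider):
        # Write the item's fields in the same order as the header.
        self.writers[spider.name].writerow([item[f] for f in FIELDS])
        return item

    def close_spider(self, spider):
        # Close the file so buffered rows are flushed to disk.
        self.writers.pop(spider.name)
        self.files.pop(spider.name).close()
```

Keeping the writers in a dict keyed by spider name means each spider writes only to its own file, even if one pipeline object ends up serving more than one spider.
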
akhter wahab