Super late and completely self promotion here, but I think this could help someone. I just wrote a little scrapy extension to save scraped items to a database. scrapy-sqlitem
It is super easy to use.
pip install scrapy_sqlitem
Define Scrapy Items using SqlAlchemy Tables
from scrapy_sqlitem import SqlItem
class MyItem(SqlItem):
sqlmodel = Table('mytable', metadata
Column('id', Integer, primary_key=True),
Column('name', String, nullable=False))
Add the following pipeline
from sqlalchemy import create_engine
class CommitSqlPipeline(object):
def __init__(self):
self.engine = create_engine("sqlite:///")
def process_item(self, item, spider):
item.commit_item(engine=self.engine)
Don't forget to add the pipeline to settings file and create the database tables if they do not exist.
http://doc.scrapy.org/en/1.0/topics/item-pipeline.html#activating-an-item-pipeline-component
http://docs.sqlalchemy.org/en/rel_1_1/core/tutorial.html#define-and-create-tables