12

I'm new (but not so new) to SQLAlchemy. I'm using version 0.9.3 for a project. I want to make a query to the SQLite database filtering the result to get those objects with no time stamp or have more than 24 hours since their last update (on the same column).

The problem is, I have no exact idea of how achieve the time filtering part, I did a IPython Notebook session, so people can look how long a tried looking for an answer to my issue:

In [1]:

from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Unicode, DateTime, Integer, create_engine
from sqlalchemy.orm import sessionmaker, relationship, backref
from datetime import datetime

engine = create_engine('sqlite:///:memory:')
Base = declarative_base()
Session = sessionmaker(bind=engine)
session = Session()

In [2]:

class Product(Base):
    __tablename__ = "product"

    id = Column(Integer, primary_key=True)
    name = Column(Unicode(140), unique=True, nullable=False)
    department = Column(Unicode(20))
    added = Column(DateTime, default=datetime.now)
    last_time_parsed = Column(DateTime)

    def repr(self):
        return "<Product(name=%s, department=%s, added=%s, last_time_parsed=%s)>" % (
                            self.name, self.department, self.added, self.last_time_parsed)

In [3]:

# create the tables
Base.metadata.create_all(engine)

In [4]:

# create a false product
p1 = Product(name="Product X")
p2 = Product(name = "Product Y")
p3 = Product(name = "Product Z")

In [5]:

session.add(p1)
session.add(p2)
session.add(p3)
session.commit()

/home/jorge/projects/project1/lib/python2.7/site-packages/sqlalchemy/engine/default.py:525: SAWarning: Unicode type received non-unicode bind param value.
  param.append(processors[key](compiled_params[key]))

In [7]:

q = session.query(Product).filter(Product.last_time_parsed == None).filter(Product.last_time_parsed > 24)

In [9]:

print q.first().name

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-9-377361de0bab> in <module>()
----> 1 print q.first().name

AttributeError: 'NoneType' object has no attribute 'name'

In [14]:

q = session.query(Product).filter(Product.last_time_parsed == None)

In [15]:

print q.first().name

Product X

In [16]:

q = session.query(Product).filter(Product.last_time_parsed < 24)
print q.first().name

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-16-b10ccbb5d88d> in <module>()
      1 q = session.query(Product).filter(Product.last_time_parsed < 24)
----> 2 print q.first().name

AttributeError: 'NoneType' object has no attribute 'name'

In [20]:

q = session.query(Product).filter(Product.added > 24)
print q.first().name

Product X

In []:

In input 20 I know that Product.added is a DateTime data type (and I used just to test because is the only column holding DateTime values). What I don't understand is the logic statement. If were Product.added < 24 (as shown on input 16) instead of Product.added > 24 I would get no object from the database. So, by the number 24 what is actually the database understanding? 24 Minutes? 24 Hours? 24 Seconds? 24 Days?

So, again, How do I filter data to get data that have no time stamp or that was updated more than 24 hours ago?

Thanks :)

shackra
  • 277
  • 3
  • 16
  • 56
  • Instead of the value `24`, Can you not compare to a `datetime` value that is 24 hours before the current time? e.g. `datetime.now() - timedelta(hours=24)` – univerio Mar 15 '14 at 22:33
  • I just discovered that. But I do a something like `onedayold = datetime.timedelta(hours=24)` and then use that on a query to filter all the data that is not older than 24 hours. like: ```q = session.query(Product).filter(Product.added < onedayold); qn = q.all(); print qn``` Which yields ```[]```, correct result since the data added is not older than 24. But, I still miss the step of including every product with `last_time_parsed` column that hold `None` has value. – shackra Mar 15 '14 at 22:47
  • 1
    You cannot compare `added` to a `timedelta` object. What does it mean for 2014-03-15 08:12:33 to be less than 24 hours? You have to compare it to 2014-03-14 15:52:40. Since 2014-03-15 08:12:33 occurs after 2014-03-14 15:52:40, it is greater. So, to find whether a `datetime` occurs within the last 24 hours, you have to do `Product.added > datetime.now() - timedelta(hours=24)`. – univerio Mar 15 '14 at 22:54

2 Answers2

28

You do not need to use union, but simple OR part of the WHERE clause would do:

since = datetime.now() - timedelta(hours=24)
q = (session.query(Product).filter(or_(
            Product.last_time_parsed == None,
            Product.last_time_parsed < since)))
shackra
  • 277
  • 3
  • 16
  • 56
van
  • 74,297
  • 13
  • 168
  • 171
3

So, I found out how to achieve what I wanted. Thanks to @univerio for the corrections!

I'll post the entire IPython Notebook session :)

In [1]:

from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Unicode, DateTime, Integer, create_engine
from sqlalchemy.orm import sessionmaker, relationship, backref
import datetime

engine = create_engine('sqlite:///:memory:')
Base = declarative_base()
Session = sessionmaker(bind=engine)
session = Session()

In [2]:

class Product(Base):
    __tablename__ = "product"

    id = Column(Integer, primary_key=True)
    name = Column(Unicode(140), unique=True, nullable=False)
    department = Column(Unicode(20))
    added = Column(DateTime, default=datetime.datetime.now)
    last_time_parsed = Column(DateTime)

    def repr(self):
        return "<Product(name=%s, department=%s, added=%s, last_time_parsed=%s)>" % (
                            self.name, self.department, self.added, self.last_time_parsed)

In [3]:

# create the tables
Base.metadata.create_all(engine)

In [4]:

# create a false product
twodaysinthefuture = datetime.datetime.now() + datetime.timedelta(hours=48)
p1 = Product(name="Product X")
p2 = Product(name = "Product Y")
p3 = Product(name = "Product Z")
x1 = Product(name = "Product older than 24 hours 1", last_time_parsed=twodaysinthefuture)
x2 = Product(name = "Product older than 24 hours 2", last_time_parsed=twodaysinthefuture)
x3 = Product(name = "Product older than 24 hours 3", last_time_parsed=twodaysinthefuture)

In [5]:

session.add(p1)
session.add(p2)
session.add(p3)
session.add(x1)
session.add(x2)
session.add(x3)
session.commit()

/home/jorge/coders/PPH/clientes/robert_s/playcomscrap/lib/python2.7/site-packages/sqlalchemy/engine/default.py:525: SAWarning: Unicode type received non-unicode bind param value.
  param.append(processors[key](compiled_params[key]))

In [6]:

onedayearly = datetime.timedelta(hours=24)
print onedayearly
since = datetime.datetime.now() - onedayearly

1 day, 0:00:00

In [7]:

q1 = session.query(Product).filter(Product.last_time_parsed < since) # filter-out those rows with less than 24 hours, right?
q2 = session.query(Product).filter(Product.last_time_parsed == None) # I'm interested on those products with no time stamp yet, too.
q3 = q1.union(q2).order_by(Product.id) # aaaaand, done!

In [8]:

# testing
for row in q3.all():
    print row.name, row.last_time_parsed

Product X None
Product Y None
Product Z None
Product older than 24 hours 1 2014-03-17 17:26:03.899875
Product older than 24 hours 2 2014-03-17 17:26:03.899875
Product older than 24 hours 3 2014-03-17 17:26:03.899875

In []:

shackra
  • 277
  • 3
  • 16
  • 56
  • 2
    You do not need to use `union`, but simple `OR` part of the `WHERE` clause would do: `q = (session.query(Product).filter(or_( Product.last_time_parsed == None, Product.last_time_parsed > 24 )) ) ` – van Mar 16 '14 at 19:08