1

I am having trouble constructing a query using SQLalchemy. Here is a simplified representation of the models I have defined:

Models

Project

class Project(Base):

    __tablename__ = 'project'

    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False, unique=True)

    # User associations
    users = relationship(
        'User',
        secondary='user_project_association'
    )

User

class User(Base):

    __tablename__ = 'user'

    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False, unique=True)

    # Project associations
    projects = relationship(
        'Project',
        secondary='user_project_association'
    )

User <-> Project (association)

class UserProjectAssociation(Base):

    __tablename__ = 'user_project_association'

    # User association.
    user_id = Column(Integer, ForeignKey('user.id'), primary_key=True)
    user = relationship('User', backref='project_associations')

    # Project association.
    project_id = Column(Integer, ForeignKey('project.id'), primary_key=True)
    project = relationship('Project', backref='user_associations')

Query

I want to perform a query on the projects table such that the result contains information about the projects as well as information about the associated users - if there are any. I am including a filter based on the user name. I am eventually going to send the result as JSON via a REST API so I would prefer the results as python {dict} objects rather than SQLAlchemy objects. The query I am performing looks like:

# Add return fields
query = session.query(
    Project.id,
    Project.name,
    User.id.label('users.id'),
    User.name.label('users.name')
)

# Add join statements
query = query.outerjoin(User, Project.users)

# Add filters
query = query.filter(
    Project.name == 'proj1', 
    User.name != 'jane.doe'  # <--- I think this is causing the issue.
)

# Execute
results = query.all()
data = [result._asdict() for result in results]
print(data)      

Results

The database contains a project called proj1 which doesn't have any associated users. In this particular scenario, I am filtering on a user column and the user association does not exist. However, I am still expecting to get a row for the project in my results but the query returns an empty list. The result I am expecting would look something like this:

[{'id': 1, 'name': 'proj1', 'users.id': None, 'users.name': None}]

Can someone explain where I am going wrong?

Yani
  • 1,465
  • 2
  • 16
  • 25

1 Answers1

4

You have to account for the NULL values that result from the left join, since != compares values and NULL is the absence of value, so the result of NULL != 'jane.doe' is NULL, not true:

query = query.filter(
    Project.name == 'proj1',
    or_(User.name == None, User.name != 'jane.doe')
)

Note that SQLAlchemy handles equality with None in a special way and produces IS NULL. If you want to be less ambiguous you could also use User.name.is_(None).

Ilja Everilä
  • 50,538
  • 7
  • 126
  • 127
  • Thanks! One more thing - the filters are produced programmatically by my API so it sounds like I’ll need to add the NULL filter (as shown in your post) whenever there is a filter on an association column. Does that sound right? To account for situations where someone actually wants to perform an IS NOT NULL filter (which doesn’t apply in this case because the column is not nullable but I’ll still need to account for it), I might do something like: or_(expr, Model.id == None) where expr is the filter expression added by the user and the second one is added automatically. Make sense? – Yani Nov 23 '17 at 21:32
  • Not entirely sure if I follow your train of thought, but depending on what your API allows to do you can't just bolt a `or_(Something.something == None, ...)` there. What if the predicate from the user is `Something.something == 'dark side'`? – Ilja Everilä May 21 '18 at 18:40