9

Say I have the following models:

class Department(Base):
    __tablename__ = 'departments'
    id = Column(Integer, primary_key=True)

class Employee(Base):
    __tablename__ = 'employees'
    id = Column(Integer, primary_key=True)
    department_id = Column(None, ForeignKey(Department.id), nullable=False)
    department = relationship(Department, backref=backref('employees'))

Sometimes, when I query departments, I would also like to fetch the number of employees they have. I can achieve this with a column_property, like so:

Department.employee_count = column_property(
    select([func.count(Employee.id)])
    .where(Employee.department_id == Department.id)
    .correlate_except(Employee)
)

Department.query.get(1).employee_count # Works

But then the count is always fetched via a subquery, even when I don't need it. Apparently I can't ask SQLAlchemy not to load this at query time, either:

Department.query.options(noload(Department.employee_count)).all()
# Exception: can't locate strategy for <class 'sqlalchemy.orm.properties.ColumnProperty'> (('lazy', 'noload'),)

I've also tried implementing this with a hybrid property instead of a column property:

class Department(Base):
    #...
    
    @hybrid_property
    def employee_count(self):
        return len(self.employees)

    @employee_count.expression
    def employee_count(cls):
        return (
            select([func.count(Employee.id)])
            .where(Employee.department_id == cls.id)
            .correlate_except(Employee)
        )

With no luck:

Department.query.options(joinedload('employee_count')).all()
# AttributeError: 'Select' object has no attribute 'property'

I know I can just query the count as a separate entity, but I need it often enough that I'd really prefer the convenience of having it as an attribute on the model. Is this even possible in SQLAlchemy?

Edit: To clarify, I want to avoid the N+1 problem and have the employee count get loaded in the same query as the departments, not in a separate query for each department.

Sasha Chedygov
  • 127,549
  • 26
  • 102
  • 115
  • Can't you just use a [plain descriptor](http://docs.sqlalchemy.org/en/latest/orm/mapped_sql_expr.html#using-a-plain-descriptor)? – univerio Sep 14 '16 at 00:32
  • @univerio: No, because I need the counts to be calculated in the same query. If I have a collection of Employers, I don't want to run a query for each one. – Sasha Chedygov Sep 14 '16 at 03:19
  • Did you ask in sqlachemy user group? It is pretty active. – denfromufa Sep 17 '16 at 03:20
  • Can't you just add a calculated column with number of employees? – denfromufa Sep 17 '16 at 03:24
  • @denfromufa: No actually, I hadn't thought of that! I will ask there. As for the calculated column, I can do that, but this is for a legacy codebase and I really want to make as few changes as possible. – Sasha Chedygov Sep 17 '16 at 04:26

1 Answers1

11

The loading strategies that you tried are for relationships. The loading of a column_property is altered in the same way as normal columns, see Deferred Column Loading.

You can defer the loading of employee_count by default by passing deferred=True to column_property. When a column is deferred, a select statement is emitted when the property is accessed.

defer and undefer from sqlalchemy.orm allow this to be changed when constructing a query:

from sqlalchemy.orm import undefer
Department.query.options(undefer('employee_count')).all()
RazerM
  • 5,128
  • 2
  • 25
  • 34
  • Aha! I knew I was close, but I didn't know about deferred column loading. Just tested it out and it works exactly as expected. Thank you! – Sasha Chedygov Sep 18 '16 at 01:07