I have two tables (foo and bar_stats, each with more than 40,000 rows), one referencing the other through a joined-load relationship. Many rows in foo can share the same u_id, and I want an aggregate sum, grouped by u_id in foo, of two columns in the table referenced by foo:
from sqlalchemy import Column, Integer, String, BigInteger, ForeignKey
from sqlalchemy.orm import relationship
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Foo(Base):
    __tablename__ = 'foo'

    id = Column(Integer, nullable=False, primary_key=True)
    u_id = Column(String(20), nullable=False)
    name = Column(String(30), nullable=False)
    bar_stats = relationship(
        "BarStats", uselist=False, backref="foo", lazy="joined",
        cascade="all, delete-orphan")
class BarStats(Base):
    __tablename__ = 'bar_stats'

    foo_id = Column(Integer, ForeignKey('foo.id'), nullable=False,
                    primary_key=True)
    in_val = Column(BigInteger, nullable=True)
    out_val = Column(BigInteger, nullable=True)
    ts = Column(BigInteger, nullable=False)
Now I want the sum of in_val and out_val, grouped by u_id, and I am trying to keep the computation time down by avoiding unnecessary work. Can somebody please tell me if there is a better way than these two:
foo_usage = (func.coalesce(BarStats.out_val, 0) +
             func.coalesce(BarStats.in_val, 0)).label('usage_by_foo_id')  # NULL counts as 0.

stm = session.query(BarStats.foo_id, foo_usage)
stm = stm.filter(foo_usage != None).subquery()

query_foo = session.query(Foo.u_id,
                          func.sum(stm.c.usage_by_foo_id).label('id_usage')).\
    join(stm, Foo.id == stm.c.foo_id).\
    group_by(Foo.u_id)
query_foo = apply_filters(query_foo, filters)
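For comparison, the same aggregate can be computed in one round trip without the subquery, by joining bar_stats directly onto foo and letting the database do all the summing. This is only a minimal sketch under my assumptions (the models above, an existing `session`, SQLAlchemy 1.x query style); the outer join keeps foos that have no bar_stats row and counts them as 0:

from sqlalchemy import func

# One query: outer-join bar_stats onto foo and sum per u_id.
# COALESCE turns missing rows and NULL columns into 0.
usage = func.sum(func.coalesce(BarStats.out_val, 0) +
                 func.coalesce(BarStats.in_val, 0)).label('id_usage')

query_foo = (session.query(Foo.u_id, usage)
             .outerjoin(BarStats, BarStats.foo_id == Foo.id)
             .group_by(Foo.u_id))

for u_id, id_usage in query_foo:
    print(u_id, id_usage)

Note that because this selects plain columns rather than Foo entities, the lazy="joined" relationship never fires, so no ORM objects are built.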
Another way I can think of is this:
usage_dict = {}
query_foo = session.query(Foo)  # bar_stats comes along via the joined load.
foos = query_foo.all()
for i in foos:
    # ts must also exist; a missing bar_stats row counts as 0.
    f = ((i.bar_stats.out_val or 0) + (i.bar_stats.in_val or 0)
         if (i.bar_stats and i.bar_stats.ts) else 0)
    usage_dict[i.u_id] = usage_dict.get(i.u_id, 0) + f
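The ts check from this loop can also be pushed into SQL with a CASE expression, so the database still does the aggregation. A sketch under the same assumptions as before, using SQLAlchemy 1.x's list-style `case()` (2.0 drops the list wrapper); `BarStats.ts != None` mirrors the `if (i.bar_stats and i.bar_stats.ts)` test, assuming ts is never 0:

from sqlalchemy import case, func

# Count a row only when a bar_stats row exists (ts is non-NULL after the
# outer join), mirroring the Python-side check in the loop above.
usage = func.sum(
    case(
        [(BarStats.ts != None,
          func.coalesce(BarStats.out_val, 0) +
          func.coalesce(BarStats.in_val, 0))],
        else_=0,
    )
).label('id_usage')

query_foo = (session.query(Foo.u_id, usage)
             .outerjoin(BarStats, BarStats.foo_id == Foo.id)
             .group_by(Foo.u_id))

Loading all 40,000 Foo objects just to add two numbers each is usually the slower option: ORM object construction plus the joined-load SQL tends to dominate, so keeping the aggregation in the database is generally the faster path.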
Can somebody please tell me if there is a better way to do this?