0

I'm getting crazy over this. I have a Flask app on which I use SQLAlchemy. I want to do what I think is a really simple query:

SELECT * FROM event e
LEFT JOIN event_data ed on e.id = ed.event_id
WHERE ed.datetime > '2018-12-01' 

Here are my models

class Event(db.Model):
    __tablename__ = 'event'
    id = db.Column(db.Integer, primary_key=True)
    #  ... many fields...
    event_data = db.relationship("EventData", back_populates="event", lazy='joined', order_by="desc(EventData.id)")

class EventData(db.Model):
    __tablename__ = 'event_data'
    id = db.Column(db.Integer, primary_key=True)
    event_id = db.Column(db.Integer, db.ForeignKey('event.id'))
    #  ... many fields ...
    event = db.relationship("Event", back_populates="event_data")

When i try

Event.query.filter(EventData.datetime>'2018-11-04')

I get this SQL

SELECT every_column_from_event, every_column_from_event_data 
FROM event_data, event 
LEFT OUTER JOIN event_data AS event_data_1 ON event.id = event_data_1.event_id 
WHERE event_data.datetime > %s 
ORDER BY event_data_1.id DESC

Cartesian Product + OUTER JOIN, which takes forever to run.

I also tried :

Event.query.join(EventData).filter(EventData.datetime>'2018-11-04')

which gives:

SELECT every_column_from_event, every_column_from_event_data 
FROM event 
INNER JOIN event_data ON event.id = event_data.event_id 
LEFT OUTER JOIN event_data AS event_data_1 ON event.id = event_data_1.event_id 
WHERE event_data.datetime > %s 
ORDER BY event_data_1.id DESC

No more cartesian product, but 2 joins, one INNER, one LEFT OUTER.

If i change my loading technique to the default lazy="select", i get the query

SELECT every_column_from_event
FROM event 
INNER JOIN event_data ON event.id = event_data.event_id 
WHERE event_data.datetime > %s

which leads to the N+1 selects problem. And in my case N can be very large, so not an option.

Thanks a lot for any help.

Mathiouss
  • 101
  • 7

2 Answers2

3

Thanks Gunnar and Ilja.

Indeed. Inputing the conditions in the second argument does the trick.

Together with

join(...).options(contains_eager(Event.event_data))

to prevent an extra join.

But I still consider this a workaround as the conditions are part of the "ON" clause and not part of the "WHERE" clause. Anyway, it works. Thanks.

Reference which helped

Mathiouss
  • 101
  • 7
-1

.join() has an optional second param which acts as an on clause. So I believe what you are looking for would be:

Event.query.join(EventData, EventData.datetime>'2018-11-04')
Gunnar Norred
  • 160
  • 2
  • 6