1

I have the following Table model representing a timeline.

class TimeRange(Base):


    __tablename__ = "time_line"


    record_id = Column(Integer, primary_key=True)
    level = Column(String, nullable=False) # e.g. "Point", "Range"
    content = Column(String, nullable=False)
    language_marker = Column(String) # this one column is optional and needs to be queried
    immediate_parent_id = Column(Integer, ForeignKey('time_line.record_id'))
    child_timelines = relationship('TimeRange', backref=backref('parent_timeline', remote_side=[record_id]))

The language_marker Column is the one that needs to be queried in a recursive manner. Not all records have such an attribute, and the business logic is: along the hierarchy lineage from the root down to the child timelines, at least one level of the TimeRange instance carries such an attribute, and the one in the lowest level should be returned. This works a little like cascading style sheet, where if the TimeRange object itself doesn't have such an attribute, just look further up one level above, util found one, and the latest defined style wins.

What is the technical direction I should look into to implement such queries? I'm using SQLAlchemy and the backend is SQLite. Thanks.

xli
  • 1,298
  • 1
  • 9
  • 30
Jinghui Niu
  • 990
  • 11
  • 28
  • Is it ok performance-wise to do a select query for the parent `TimeRange` whenever `language_marker` is `None`? – xli Oct 15 '16 at 13:39
  • @xli, thanks for your reply. It's absolutely ok. Now I'm clueless about the technical direction here. Could you give more details? – Jinghui Niu Oct 15 '16 at 19:56

1 Answers1

1

I would add a property derived_language_marker that starts with the current object and goes up the hierarchy of parent TimeRanges until it finds a language_marker that is not None and returns it:

 @property
 def derived_language_marker(self):
     time_range = self
     while not time_range.language_marker and time_range.parent_timeline:
         time_range = time_range.parent_timeline
     return time_range.language_marker

It can be accessed just as time_range.derived_language_marker, but the drawback is that it could result in a new SELECT query for each level it checks if the parent objects are not already loaded.

(If you are doing this for many TimeRanges and performance becomes a problem you can eager-load ancestors up to a certain depth with joins as described here as the long as the maximum possible depth is not too extreme. For further optimization, there is also this approach for SQLite.)

xli
  • 1,298
  • 1
  • 9
  • 30
  • one more plus for the conciseness and usefulness. I wish I could add one more the elegance. – Jinghui Niu Oct 15 '16 at 21:47
  • could you give a little more explanation why using property and not hybrid_property, or even plain attributes directly on TimeRange? Thanks. – Jinghui Niu Oct 15 '16 at 21:50
  • 1
    `hybrid_property` would allow you to make a query on `derived_language_marker` (like `filter(derived_language_marker == some_value)`), but it would need to be possible to convert the logic into a SQL expression, which isn't possible as it currently is. If you need functionality like that, you could try looking at the second link. With a plain attribute, you would need to create the column in the table, but then you would need to keep that column continually updated. – xli Oct 15 '16 at 22:05