
As part of a project I'm working on, I'm trying to build a hierarchical data structure of objects of different types.

I used django-mptt for it, which promises to handle trees in a smart way with fast queries.

The problem is that I have multiple models that need to participate in this tree, so I used a generic relation to store the required data.

A snippet of the model I built:

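from django.contrib.contenttypes.fields import GenericForeignKey
from django.contrib.contenttypes.models import ContentType
from django.db import models
from mptt.models import MPTTModel, TreeForeignKey

class CPNode(MPTTModel):
    # Generic relation to whichever model holds this node's actual data
    content_type = models.ForeignKey(ContentType, null=True, blank=True, on_delete=models.CASCADE)
    object_id = models.PositiveIntegerField(null=True)
    content_object = GenericForeignKey('content_type', 'object_id')
    parent = TreeForeignKey('self', null=True, blank=True, related_name='children', db_index=True, on_delete=models.CASCADE)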
...

This gives me what I want, except for the query issue.

I figure that fetching all the data will cost multiple queries (one more each time I access the content_object itself).

Does anyone have an idea how I can keep this structure while still being able to get all the data with a scalable query?

Arpit Solanki
Dar Ben-Tov
  • What is the exact problem you're having? You've chosen to use a relational database for this, so you can't avoid having some JOINs in these types of queries. Use [`prefetch_related`](https://docs.djangoproject.com/en/1.11/ref/models/querysets/#django.db.models.query.QuerySet.prefetch_related) if you need to fetch `content_object` directly in your queries, to avoid multiple queries. Whether it's scalable depends on your data. A NoSQL DB might be better suited for this kind of tree. – dirkgroten Aug 21 '17 at 13:21
  • The problem is that a big part of this data is already in production, so shifting to a NoSQL DB is my last resort. I haven't started yet, but thinking ahead, it seems like querying a big tree with multiple types of nodes will be a mess. I have multiple types of assets, each maintained by its own model, and each asset can have assets of various types below it. – Dar Ben-Tov Aug 21 '17 at 13:26

1 Answer


Relational databases (well, SQL databases at least) are not that great when it comes to heterogeneous trees. MPTT will indeed vastly improve read performance by avoiding the need for recursive queries, but it won't solve the GenericForeignKey hack: there's just no way to implement such a feature at the SQL level, so if you use it, yes, you WILL need extra queries to get the effective content. That means one more query per node if you access content_object naively, or one per content type if you batch the lookups with prefetch_related as suggested in the comments.
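
To make the query count concrete: the `prefetch_related('content_object')` suggestion from the comments does not turn this into a single JOIN, but as far as I know Django batches the lookups so that it issues one query per content type rather than one per node. Here is a minimal, Django-free sketch of that batching idea using the standard-library sqlite3 module. The `server` and `disk` tables and the `load_nodes_with_content` helper are made up for illustration and are not part of the question's models:

```python
import sqlite3
from collections import defaultdict

# Hypothetical content tables standing in for the real asset models.
CONTENT_TABLES = {"server", "disk"}

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE server (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE disk (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE node (id INTEGER PRIMARY KEY, content_type TEXT, object_id INTEGER);
    INSERT INTO server VALUES (1, 'web-1'), (2, 'web-2');
    INSERT INTO disk VALUES (1, 'sda'), (2, 'sdb');
    INSERT INTO node VALUES (1, 'server', 1), (2, 'disk', 1), (3, 'disk', 2), (4, 'server', 2);
""")


def load_nodes_with_content(conn):
    """Return ([(node_id, content_type, content_name)], number_of_queries)."""
    nodes = conn.execute(
        "SELECT id, content_type, object_id FROM node ORDER BY id"
    ).fetchall()
    queries = 1

    # Collect the referenced ids per content type.
    ids_by_type = defaultdict(set)
    for _, ctype, object_id in nodes:
        ids_by_type[ctype].add(object_id)

    # One IN (...) query per content type instead of one query per node.
    content = {}
    for ctype, ids in ids_by_type.items():
        if ctype not in CONTENT_TABLES:  # table names cannot be bound as parameters
            raise ValueError(f"unknown content type: {ctype}")
        placeholders = ",".join("?" * len(ids))
        rows = conn.execute(
            f"SELECT id, name FROM {ctype} WHERE id IN ({placeholders})",
            sorted(ids),
        )
        queries += 1
        for object_id, name in rows:
            content[(ctype, object_id)] = name

    resolved = [(nid, ctype, content[(ctype, oid)]) for nid, ctype, oid in nodes]
    return resolved, queries


resolved, queries = load_nodes_with_content(conn)
```

With four nodes of two types this runs three queries in total (one for the nodes, one per type), and the count grows with the number of content types rather than with the number of nodes.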

The only way to avoid those extra queries would be to cram every field of every node subtype into the same model, add a "node_type" field to it and use per-node-type proxy models. The code won't be pretty (been there, done that) but well, there are few other options here...
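
A rough sketch of that single-table idea, written with plain sqlite3 rather than Django so it runs standalone. The `hostname` and `size_gb` columns and the `Server` and `Disk` classes are invented for the example. The `lft` and `rght` columns play the role of the nested-set columns that django-mptt maintains, and in Django the per-type classes would be proxy models (`class Meta: proxy = True`) over the one concrete table:

```python
import sqlite3


class Node:
    """Base wrapper around a row of the single node table."""

    def __init__(self, row):
        self.row = row

    def describe(self):
        return f"node {self.row['name']}"


class Server(Node):
    def describe(self):
        return f"server {self.row['name']} ({self.row['hostname']})"


class Disk(Node):
    def describe(self):
        return f"disk {self.row['name']} ({self.row['size_gb']} GB)"


# Maps the node_type discriminator to the class that knows its fields.
NODE_CLASSES = {"server": Server, "disk": Disk}

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows row['column'] access
conn.executescript("""
    CREATE TABLE node (
        id INTEGER PRIMARY KEY,
        node_type TEXT NOT NULL,
        name TEXT NOT NULL,
        lft INTEGER NOT NULL,
        rght INTEGER NOT NULL,
        hostname TEXT,    -- only used by servers
        size_gb INTEGER   -- only used by disks
    );
    INSERT INTO node VALUES
        (1, 'server', 'web-1', 1, 6, 'web-1.example.com', NULL),
        (2, 'disk',   'sda',   2, 3, NULL, 500),
        (3, 'disk',   'sdb',   4, 5, NULL, 1000),
        (4, 'server', 'web-2', 7, 8, 'web-2.example.com', NULL);
""")


def subtree(conn, root_id):
    """Fetch a node and all its descendants, with their content, in one query."""
    rows = conn.execute(
        """
        SELECT child.*
        FROM node AS child
        JOIN node AS root ON child.lft BETWEEN root.lft AND root.rght
        WHERE root.id = ?
        ORDER BY child.lft
        """,
        (root_id,),
    ).fetchall()
    return [NODE_CLASSES.get(row["node_type"], Node)(row) for row in rows]


descriptions = [node.describe() for node in subtree(conn, 1)]
```

The price is visible in the schema: every type-specific column has to be nullable, and the table gets wider with each new node type.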

bruno desthuilliers