1

I have 2 models:

class Hero(models.Model):
    name = models.CharField(max_length=50)
    age = models.PositiveSmallIntegerField()
    identity = models.TextField(max_length=50)

class Villain(models.Model):
    villain_name = models.CharField(max_length=50)
    age = models.PositiveSmallIntegerField()
    identity = models.TextField(max_length=50)

and I create some test instances:

Hero(name="Superman", age=30, identity="Clark Kent").save()
Hero(name="Iron Man", age=35, identity="Tony Stark").save()
Hero(name="Spider-Man", age=18, identity="Peter Parker").save()

Villain(villain_name="Green Goblin", age=45, identity="Norman Osborn").save()
Villain(villain_name="Red Skull", age=38, identity="Johann Schmidt").save()
Villain(villain_name="Vulture", age=47, identity="Adrian Toomes").save()

Since the Villain model doesn't have a name field, we use annotation before doing a union.

Then, using them in a union produces results where the columns aren't all where they should be:

>>> from django.db.models import F
>>> characters = Hero.objects.all().values('name', 'age', 'identity').union(Villain.objects.all().annotate(name=F("villain_name")).values('name', 'age', 'identity'))
>>> for character in characters:
...     print(character)
{'name': 38, 'age': 'Johann Schmidt', 'identity': 'Red Skull'}
{'name': 45, 'age': 'Norman Osborn', 'identity': 'Green Goblin'}
{'name': 47, 'age': 'Adrian Toomes', 'identity': 'Vulture'}
{'name': 'Iron Man', 'age': 35, 'identity': 'Tony Stark'}
{'name': 'Spider-Man', 'age': 18, 'identity': 'Peter Parker'}
{'name': 'Superman', 'age': 30, 'identity': 'Clark Kent'}

Looking at the raw sql queries, we see this:

>>> str(Hero.objects.all().values('name', 'age', 'identity').query)
'SELECT "myapp_hero"."name", "myapp_hero"."age", "myapp_hero"."identity" FROM "myapp_hero"'

>>> str(Villain.objects.all().annotate(name=F("villain_name")).values('name', 'age', 'identity').query)
'SELECT "myapp_villain"."age", "myapp_villain"."identity", "myapp_villain"."villain_name" AS "name" FROM "myapp_villain"'

The auto-generated sql contains columns that aren't in the same order for the two queries.

How can I make these in the same order, so that the resulting queryset of my union isn't messed up?

PS. Yes, this is very similar to the question asked here, but after a long explanation of why it happens, the end result of the answer to that question was:

please make sure that the ordering of fields in generated SQL is always correct

which helps, but doesn't answer the question of how to fix it.

John
  • 2,551
  • 3
  • 30
  • 55
  • Then why don't you use raw sql with django orm ? Model.objects.raw("your query").union(OtherModel.objects.raw("other query")) – Deepak Tripathi Dec 24 '21 at 18:14
  • @DeepakTripathi same reason the ORM exists, instead of people using raw SQL - it's more convenient. This is an example. My actual tables have a dozen columns, some of which are dynamically generated, and have foreign keys to other tables. I would shoot myself if I had to write SQL by hand for that. I try not to abandon ship every time I run into a minor issue with a system (the ORM in this case). – John Dec 24 '21 at 18:17

1 Answers1

1

From the docs on union

Passing different models works as long as the SELECT list is the same in all QuerySets (at least the types, the names don’t matter as long as the types are in the same order)

When you add the annotation to values() it always comes after the non-annotated columns, however since the names don't matter and you only need the columns to be in the same order you can discard this annotation

Hero.objects.values(
    'name', 'age', 'identity'
).union(Villain.objects.values(
    'villain_name', 'age', 'identity'
))
Iain Shelvington
  • 31,030
  • 3
  • 31
  • 50
  • Interesting workaround. Tested - definitely works. It does however make it a little harder to do abstractions. I was hoping to have a list of columns I'm interested in (`columns = ["name", "age", "identity"]`), and one function per model that would use annotations to ensure that the queryset returned by the function would contain those columns before they are combined with `union`. With this workaround, each model's function would need to have its own hardcoded list of columns - even the common ones - and return them in proper order. Is this workaround the only way to fix this problem? – John Dec 24 '21 at 20:59
  • @John How many models are we talking about? Is there any reason why the `villain_name` field can't be renamed to `name` since you even coerced it in your original solution – Iain Shelvington Dec 24 '21 at 21:04
  • The columns are named after the source files where the data is coming from. The only time they need to have their names coerced like this is on one template, where there's a table that combines data from multiple models. On every other template, these columns need to have their original column names. These are the client's requirements. I'm sure internally I could rename the columns, but it would lead to more confusion, as the columns in the source files wouldn't match the columns in the database anymore. I hope that makes sense. – John Dec 24 '21 at 21:27
  • @John Do I understand correctly, there is only a single view/template where this union is required? How many models and how many fields are going to be in this union because it sounds like something similar to the answer would be just fine so long as it was commented appropriately? If it's a one-off there is no need to create a pattern – Iain Shelvington Dec 24 '21 at 21:50
  • There are 4 models so far. Might be a 5th in the future. 8 fields in the union. You might be right - it might be easier to rename the column and add comments. It's hard to know what's easier to do without knowing more than one option, so I'm looking at my options. Right now, I have your suggestion implemented and working, and the code is fairly easy to read. I don't love that it relies on the position of fields, rather than names, but it works, and I left comments explaining what it's doing. If there's a better way to do it, great. If not, I'll leave it as is. Thanks for your suggestion – John Dec 24 '21 at 22:42