5

I know how to GROUP BY and aggregate:

>>> from expenses.models import Expense
>>> from django.db.models import Sum
>>> qs = Expense.objects.order_by().values("is_fixed").annotate(is_fixed_total=Sum("price"))
>>> qs
<ExpenseQueryset [{'is_fixed': False, 'is_fixed_total': Decimal('1121.74000000000')}, {'is_fixed': True, 'is_fixed_total': Decimal('813.880000000000')}]>

However, If I want to do the same for other two columns, it only returns the last:

>>> qs = (
...     Expense.objects.order_by()
...     .values("is_fixed")
...     .annotate(is_fixed_total=Sum("price"))
...     .values("source")
...     .annotate(source_total=Sum("price"))
...     .values("category")
...     .annotate(category_total=Sum("price"))
... )
>>> qs
<ExpenseQueryset [{'category': 'FOOD', 'category_total': Decimal('33.9000000000000')}, {'category': 'GIFT', 'category_total': Decimal('628')}, {'category': 'HOUSE', 'category_total': Decimal('813.880000000000')}, {'category': 'OTHER', 'category_total': Decimal('307')}, {'category': 'RECREATION', 'category_total': Decimal('100')}, {'category': 'SUPERMARKET', 'category_total': Decimal('52.8400000000000')}]>

It is possible to accomplish what I want with only one query instead of three?

Expected result:

<ExpenseQueryset [{'category': 'FOOD', 'total': Decimal('33.9000000000000')}, {... all other categories ...}, 
{'source': 'MONEY', 'total': Decimal('100')}, {... all other sources ...}, {'is_fixed': False, 'total': Decimal('1121.74000000000')}, {'is_fixed': True, 'total': Decimal('813.880000000000')}]>

Optimally, it could be split into something like:

<ExpenseQueryset ['categories': [{'category': 'FOOD', 'total': Decimal('33.9000000000000')}, {... all other categories ...}], 
'sources': [{'source': 'MONEY', 'total': Decimal('100')}, {... all other sources ...}], 'type': [{'is_fixed': False, 'total': Decimal('1121.74000000000')}, {'is_fixed': True, 'total': Decimal('813.880000000000')}]]>

But this is just a big plus.

Murilo Sitonio
  • 270
  • 7
  • 30
  • chaining multiple fields to values doesn't help? for example: `values('is_fixed', 'source', 'category')`. – Abhyudai Aug 12 '21 at 03:34
  • No, because it will GROUP BY all the fields together: `{'is_fixed': False, 'category': 'FOOD', 'source': 'SETTLE_UP', 'total': Decimal('33.9000000000000')}` – Murilo Sitonio Aug 12 '21 at 11:20
  • What's your expected result? – Crash0v3rrid3 Aug 14 '21 at 08:54
  • @Crash0v3rrid3 I just update the question. – Murilo Sitonio Aug 14 '21 at 13:46
  • 5
    It's not that it cannot be done in Django. I don't see a way you can achieve this in raw SQL. A single query allows you to group by one or more columns, but even when you group by multiple columns, you just tell the database to put all the rows, that have the same value in those columns, into a single group. The only option you have is to split it into three queries. – Crash0v3rrid3 Aug 14 '21 at 14:43
  • as @Crash0v3rrid3 mentioned, i also dont see a way to do this in raw SQL maybe with union all, but that would also imply multiple queries. You could then concatenate the 3 querysets into a single dictionary, but not into a single queryset. Maybe you could have a more complex queryset with a custom ModelManager, but that might be too much effort. what is your end goal? return that ExpenseQueryset in a view? – António Caeiro Aug 17 '21 at 12:04
  • @Alucarder, yeah I'm doing 3 queries now and returning the results in a dict. And yes, my end goal is to return data to a view. – Murilo Sitonio Aug 17 '21 at 19:19

1 Answers1

2

The Answer is NO, Cause it's not possible with SQL

But you can use the below approach combined with python coding:

I don't think it's possible even in raw SQL, Cause in each query you can group by one or multiple fields together, but not with separated results for each. But it's possible to do it with one query and use little python codes to merge results in the format u want. below I described how u can use it step by step. and in the next section wrote a python method that u can dynamically use for any further usages.

How it works

The only simple solution I can mention is that you group by those 3 fields you want, and do a simple python programming to sum results together for each field. In this method, u will have only one query but separate results for each field group-by.

from expenses.models import Expense
from django.db.models import Sum

qs = Expense.objects.order_by().values("is_fixed", "source", "category").annotate(total=Sum("price"))

Now the result will be something like below:

<ExpenseQueryset [{'category': 'FOOD', 'is_fixed': False, 'source': 'MONEY', 'total': Decimal('33.9000000000000')}, { ...}, 

Now we can simple aggregate each field result by iterating on this result

category_keys = []
for q in qs:
    if not q['category'] in category_keys:
        category_keys.append(q['category'])

# Now we have proper values of category in category_keys
category_result = []
for c in category_keys:
    value = sum(item['total'] for item in qs if item['category'] == c)
    category_result.append({'category': c, 'total': value)

And the result for the category field will be something like this:

[{'category': 'FOOD', 'total': 33.3}, {... other category results ...}

Now we can continue and make result for other group by fields is_fixed and source like below:

source_keys = []
for q in qs:
    if not q['source'] in source_keys:
        source_keys.append(q['source'])
source_result = []
for c in source_keys:
    value = sum(item['total'] for item in qs if item['source'] == c)
    source_result.append({'source': c, 'total': value)

is_fixed_keys = []
for q in qs:
    if not q['is_fixed'] in is_fixed_keys:
        source_keys.append(q['is_fixed'])
is_fixed_result = []
for c in is_fixed_keys:
    value = sum(item['total'] for item in qs if item['is_fixed'] == c)
    is_fixed_result.append({'is_fixed': c, 'total': value)

Global solution

Now that we know how to use this solution, here is a function to just give fields you want to it and will make proper results for u dynamically.

def find_group_by_separated_by_keys(key_list):
    """ key_list in this example will be:
        key_list = ['category', 'source', 'is_fixed']
    """
    qs = Expense.objects.order_by().values(*tuple(key_list)).annotate(total=Sum("price"))
    qs = list(qs)
    result = []
    for key in key_list:
        key_values = []
        for item in qs:
            if not item[key] in key_values:
                key_values.append(item[key])
        
        key_result = []
        for v in key_values:
            value = sum(item['total'] for item in qs if item[key] == v)
            key_result.append({key: v, 'total': value})

        result.extend(key_result)
    return result

Now just simple use it like below in ur code:

find_group_by_separated_by_keys(['category', 'source', 'is_fixed')

And it will give a list of values like the proper format u wanted

Reza Torkaman Ahmadi
  • 2,958
  • 2
  • 20
  • 43
  • It's a solution, but I'm just wondering if it's worth it: I have only one query, but I got `4*len(key_list)` loops + greater code complexity. Anyway, thanks! – Murilo Sitonio Aug 19 '21 at 19:58
  • `4*len(key_list)` will not cause much trouble. The only case that causes a problem is when `key_list` is like million choices and have many value options. and code complexity is not a problem cause I provide a ready-to-use function. – Reza Torkaman Ahmadi Aug 19 '21 at 20:43
  • Code complexity is not about the possibility to "plug-and-forget" some code, but also who are going to maintain it, etc... – Murilo Sitonio Aug 19 '21 at 20:54
  • May I know which part is complex maybe it can be written better. And also If this approach is not going to work for you, then this question should be closed, As it never has an approach with SQL to give u the result in the way u want. Cause Django ORM is based on SQL and this operation is not handled with SQL. But u can also w8 for others to give same answer. – Reza Torkaman Ahmadi Aug 19 '21 at 20:58
  • I updated the answer as it's not possible with SQL. But will gladly w8 to see if any other person has an approach with SQL. – Reza Torkaman Ahmadi Aug 19 '21 at 21:01
  • The code is not complex by itself, is just more complex than doing 3 queries. Yes of course that django's ORM can't do magic if it can't be accomplished with pure SQL. I agree it should've been closed by now but I was unable to do so because of the current bounty. – Murilo Sitonio Aug 20 '21 at 13:13
  • so either way I guess the answer I provided should be accepted. Cause I mentioned in detail that it's not possible and I provided the alternative solution with python. – Reza Torkaman Ahmadi Aug 20 '21 at 13:40