I am building a tree selector and need to structure my data as a tree of grouped items. My input, shown below, is a list of dictionaries:

data = [
    {'region': 'R1', 'group': 'G1', 'category': 'C1', 'item': 'I2'},
    {'region': 'R1', 'group': 'G1', 'category': 'C1', 'item': 'I1'},
    {'region': 'R1', 'group': 'G2', 'category': 'C2', 'item': 'I3'},
    {'region': 'R2', 'group': 'G1', 'category': 'C1', 'item': 'I1'},
    {'region': 'R2', 'group': 'G2', 'category': 'C2', 'item': 'I3'},
    {'region': 'R2', 'group': 'G2', 'category': 'C2', 'item': 'I4'},
    {'region': 'R2', 'group': 'G2', 'category': 'C3', 'item': 'I5'},
]

I want to get the following output:

result = {
  "regions": [
    {
      "name": "R1",
      "groups": [
        {
          "name": "G1",
          "categories": [
            {"name": "C1","items": [{ "name": "I2"},{"name": "I1"}]}
          ]
        },
        {
          "name": "G2",
          "categories": [
            {"name": "C2", "items": [{"name": "I3"}]}
          ]
        }
      ]
    },
    {
      "name": "R2",
      "groups": [
        {
          "name": "G1",
          "categories": [
            {"name": "C1","items": [{"name": "I1"}]}
          ]
        },
        {
          "name": "G2",
          "categories": [
            {"name": "C2","items": [{"name": "I3"},{"name": "I4"}]},
            {"name": "C3", "items": [{"name": "I5"}]}
          ]
        }
      ]
    }
  ]
}

After some research I came up with this solution:

from collections import OrderedDict

# Level 1: collect items per (region, group, category)
d = OrderedDict()
for aggr in data:
    key = (aggr['region'], aggr['group'], aggr['category'])
    d.setdefault(key, []).append({"name": aggr['item']})

# Level 2: collect categories per (region, group)
d1 = OrderedDict()
for k, v in d.items():
    d1.setdefault((k[0], k[1]), []).append({"name": k[2], "items": v})

# Level 3: collect groups per region
d2 = OrderedDict()
for k, v in d1.items():
    d2.setdefault(k[0], []).append({"name": k[1], "categories": v})

result = {"regions": [{"name": k, "groups": v} for k, v in d2.items()]}

It works, but I believe it is not the most Pythonic solution, and I did not manage to simplify it.

Any help proposing another solution, or improving the code above, would be appreciated.

Rukamakama

1 Answer

As long as the records are sorted by the grouping keys, like in your example (groupby only merges adjacent records with equal keys), you could use groupby from itertools in a recursive function:

from itertools import groupby
from operator import itemgetter

def plural(word):
    return f"{word}s" if word[-1] != 'y' else f"{word[:-1]}ies"

def grouping(records, *keys):
    if len(keys) == 1:
        return [{"name": record[keys[0]]} for record in records]
    return [
        {"name": key, plural(keys[1]): grouping(group, *keys[1:])}
        for key, group in groupby(records, itemgetter(keys[0]))
    ]

result = {"regions": grouping(data, "region", "group", "category", "item")}
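
Why the ordering matters can be shown with a minimal sketch. The `rows` list below is hypothetical data, not from the question, chosen so that the two 'R1' records are not adjacent:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical records: the two 'R1' rows are separated by an 'R2' row.
rows = [
    {"region": "R1", "item": "I1"},
    {"region": "R2", "item": "I2"},
    {"region": "R1", "item": "I3"},
]
by_region = itemgetter("region")

# groupby starts a new group every time the key changes,
# so unsorted input yields 'R1' twice.
unsorted_keys = [key for key, _ in groupby(rows, by_region)]

# Sorting on the same key first makes equal keys adjacent.
sorted_keys = [key for key, _ in groupby(sorted(rows, key=by_region), by_region)]

print(unsorted_keys)  # ['R1', 'R2', 'R1']
print(sorted_keys)    # ['R1', 'R2']
```

With unsorted input, the tree would therefore contain duplicate nodes for the same name.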

If the sorting isn't guaranteed, you could adjust grouping as follows, sorting on the current key at each level:

def grouping(records, *keys):
    if len(keys) == 1:
        return [{"name": record[keys[0]]} for record in records]
    key_func = itemgetter(keys[0])
    records = sorted(records, key=key_func)
    return [
        {"name": key, plural(keys[1]): grouping(group, *keys[1:])}
        for key, group in groupby(records, key_func)
    ]

Or sort the data beforehand (note that this also sorts the items within each category):

keys = ["region", "group", "category", "item"]
data = sorted(data, key=itemgetter(*keys))
result = {"regions": grouping(data, *keys)}
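
The two sorting variants can order the leaf items differently. The sketch below repeats `plural` and the sorting version of `grouping` so that it runs on its own; the two-record `rows` list is a hypothetical cut-down of the question's data:

```python
from itertools import groupby
from operator import itemgetter

def plural(word):
    return f"{word}s" if word[-1] != "y" else f"{word[:-1]}ies"

def grouping(records, *keys):
    # Variant that sorts on the current key only, at each level.
    if len(keys) == 1:
        return [{"name": record[keys[0]]} for record in records]
    key_func = itemgetter(keys[0])
    records = sorted(records, key=key_func)
    return [
        {"name": key, plural(keys[1]): grouping(group, *keys[1:])}
        for key, group in groupby(records, key_func)
    ]

# Hypothetical two-level sample: items arrive as I2, then I1.
rows = [
    {"category": "C1", "item": "I2"},
    {"category": "C1", "item": "I1"},
]
keys = ["category", "item"]

# Per-level sorting is stable, so items keep their input order.
per_level = grouping(rows, *keys)
# Sorting beforehand on all keys also reorders the items.
presorted = grouping(sorted(rows, key=itemgetter(*keys)), *keys)

print(per_level)  # [{'name': 'C1', 'items': [{'name': 'I2'}, {'name': 'I1'}]}]
print(presorted)  # [{'name': 'C1', 'items': [{'name': 'I1'}, {'name': 'I2'}]}]
```

If the original item order has to be preserved, as in the expected output of the question, the per-level variant is probably the safer choice.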

Result of first version for data as provided in the question:

result = {
   "regions": [
      {
         "name": "R1",
         "groups": [
            {
               "name": "G1",
               "categories": [
                  {"name": "C1", "items": [{"name": "I2"}, {"name": "I1"}]}
               ]
            },
            {
               "name": "G2",
               "categories": [
                  {"name": "C2", "items": [{"name": "I3"}]}
               ]
            }
         ]
      },
      {
         "name": "R2",
         "groups": [
            {
               "name": "G1",
               "categories": [
                  {"name": "C1", "items": [{"name": "I1"}]}
               ]
            },
            {
               "name": "G2",
               "categories": [
                  {"name": "C2", "items": [{"name": "I3"}, {"name": "I4"}]},
                  {"name": "C3", "items": [{"name": "I5"}]}
               ]
            }
         ]
      }
   ]
}
Timus
  • Indeed, your answer is really Pythonic in its simplicity. Thank you for taking the time to help. However, it does not output the exact structure I need, because it aggregates only on the last element and truncates the middle nodes. In fact, the result of the above code does not include the last entry of the data, `{'region': 'R2', 'group': 'G2', 'category': 'C3', 'item': 'I5'}`. The aggregation should happen on all intermediate levels, not only on the last one (items). – Rukamakama Dec 10 '21 at 17:09
  • @Rukamakama Thanks for the feedback. I'm a bit surprised: The result does match your expected output exactly? – Timus Dec 10 '21 at 17:26
  • 1
    am really sorry. The issue is on my side I used a different data input from the one I posted. Indeed your answer is correct. I have to upvote it. Much thanks – Rukamakama Dec 10 '21 at 18:04
  • @Rukamakama No probs! Just realized that I forgot to upvote your question: Very interesting one! – Timus Dec 10 '21 at 18:37