
Let's say that I have a Python 3.6 list that looks like this:

l1 = [
     [a,b,c], 
     [b,c], 
     [c], 
     [d, e], 
     [e]
     ...
] 

I need to convert this to a tree-like structure using anytree, so that it looks like this:

>>> print(RenderTree(l1))

l1
|__ a
|   |__b
|      |__c
|___d
    |__e

Consider the objects a, b, c, d, e to be strings if that helps. I've read a lot of the anytree documentation and searched StackOverflow for a while, but couldn't find anything that resolves this. What's the most Pythonic way to do it?

Edit: To clarify, the original list l1 is supposed to represent a tree, where the first element of each sublist is a parent node and the elements after it are the nodes below it. Each of those can in turn be a child of the node before it, and so on.

Edit Edit: So, here's what the original list would (hypothetically) look like:

l1 = [
['a', 'b', 'c'],
['b', 'c'],
['c'],
['d', 'e'],
['e']
]

Here, the first element for each sublist is always going to end up being a parent for that branch. Joining each of those branches together would result in me getting the format I need, but I've been struggling to put it into words (it's 2 am here). Here are some of my attempts:

For converting the list into nodes:

from anytree import Node, RenderTree

l = []

for x in l1:
    a = Node(x[0])
    for i in x[1:]:
        Node(i, parent = a)
    l.append(a)

However, this returns a list of separate trees, like so:


>>> l
[Node('/a'), Node('/b'), Node('/c'), Node('/d'), Node('/e')]
>>> print(RenderTree(l[0]))
Node('/a')
├── Node('/a/b')
└── Node('/a/c')
>>> print(RenderTree(l[1]))
Node('/b')
└── Node('/b/c')
>>> print(RenderTree(l[2]))
Node('/c')
>>> print(RenderTree(l[3]))
Node('/d')
└── Node('/d/e')
>>> print(RenderTree(l[4]))
Node('/e')

To filter this out, I tried to do the following:

def tuple_replace(tup, pos, val):
    return tup[:pos] + (val,) + tup[pos+1:]

>>> l2=[]
>>> for pos, x in enumerate(l):
    for pos_2, i in enumerate(x.children):
        for j in l[pos+1:]:
            if j.name == i.name:
                x.children = tuple_replace(x.children, pos_2, i)
                break
        l2.append(x)

>>> for x in l2:
    print(RenderTree(x))


Node('/a')
├── Node('/a/b')
└── Node('/a/c')
Node('/a')
├── Node('/a/b')
└── Node('/a/c')
Node('/b')
└── Node('/b/c')
Node('/d')
└── Node('/d/e')

That's the step I'm currently at.

Edit Edit edit:

So, the way the tree is represented is that I have a function that returns a list like l1, and has the following logic behind it:

Each element in the list has two parts: the parent and the children. The parent is the first element of the sublist, and everything else is its children, or its children's children, and so on. So elements like [a, b, c] and [d, e, f, g] list every node in the branch, not just the immediate children. That's where the rest of the elements come into play: the following elements describe the sub-branches, starting with the parent's first child, e.g. [b, c], [e, f] and [g]. The element [d, e, f, g] differs from [a, b, c], though, because it contains two sub-branches instead of one. So, a tree such as this:

l1
|
|_a
|   |__b
|   |__c
|
|_d
   |__e
   |    |__f
   |__g

Would be described as:

Edit: fixed the input tree, because f didn't have a stand alone branch

l1=[
 [a,b,c],
 [b, c],
 [c],
 [d,e,f,g],
 [e,f],
 [f],
 [g]
]
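
One possible reading of this format, sketched below: a node's immediate parent is the head of the shortest sublist that mentions the node after its head, because closer ancestors list fewer descendants. This assumes node names are unique strings; the helper name `immediate_parents` is made up for illustration and is not part of anytree.

```python
def immediate_parents(branches):
    """Return {name: parent_name}, using None for top-level nodes."""
    # All names in order of first appearance.
    names = dict.fromkeys(n for branch in branches for n in branch)
    parents = {}
    for name in names:
        # Sublists that mention `name` after their head belong to its ancestors.
        ancestors = [b for b in branches if name in b[1:]]
        # The closest ancestor lists the fewest descendants (shortest sublist).
        closest = min(ancestors, key=len, default=None)
        parents[name] = closest[0] if closest else None
    return parents


l1 = [
    ['a', 'b', 'c'],
    ['b', 'c'],
    ['c'],
    ['d', 'e', 'f', 'g'],
    ['e', 'f'],
    ['f'],
    ['g'],
]
parents = immediate_parents(l1)
print(parents)
# {'a': None, 'b': 'a', 'c': 'b', 'd': None, 'e': 'd', 'f': 'e', 'g': 'd'}
```

With anytree, each entry of such a map could then become a `Node(name, parent=...)`, attaching the nodes whose parent is `None` to a root `Node('l1')`.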
Ali Abbas
  • Can you post your code as a [mcve]? – ggorlen Sep 22 '19 at 05:17
  • How would I do that? The issue is that the problem is quite abstract on my end too. There might be some other conditions that are implied as well, that I'll add in a moment – Ali Abbas Sep 22 '19 at 05:31
  • Hmm, I definitely agree with you @MadPhysicist. I'm fairly new to posting questions on StackOverflow. How do you suggest I elaborate on the problem? – Ali Abbas Sep 22 '19 at 05:49
  • You could import the library and show an attempt at organizing the list into a format that the library accepts. From the docs, it looks like it uses `marc = Node("Marc", parent=udo)` format. I'm not really clear on how `l1` represents a tree even after the explanation. Can you clarify the actual structure using a diagram? Thanks. – ggorlen Sep 22 '19 at 06:06
  • Thanks @ggorlen, I'll add those details into the question immediately. Give me 5-10 mins – Ali Abbas Sep 22 '19 at 06:10

1 Answer

You can use recursion to build a nested dictionary to represent your tree, and then traverse the result to print the desired diagram:

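from functools import reduce
data = [['a', 'b', 'c'], ['b', 'c'], ['c'], ['d', 'e'], ['e']]
# Keep only the sublists whose first element does not appear in any earlier sublist
new_data = [a for i, a in enumerate(data) if all(a[0] not in c for c in data[:i])]
def to_tree(d):
   # Nest each element under the one before it; the last element stays a plain string
   return d[0] if len(d) == 1 else {d[0]:to_tree(d[1:])}

# Merge the per-branch dictionaries into a single dictionary
tree = reduce(lambda x, y:{**x, **y}, [to_tree(i) for i in new_data])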

Now, to print the structure:

import re
def print_tree(d, c = 0):
   for a, b in d.items():
     yield f'{"|" if c else ""}{"   "*c}|__{a}'
     if not isinstance(b, dict):
        yield f'{"|" if (c+1) else ""}{"   "*(c+1)}|__{b}'
     else:
        yield from print_tree(b, c+1)

*r, _r = print_tree(tree)
print('l1\n{}\n{}'.format('\n'.join(r), re.sub(r"^\|", "", _r)))

Output:

l1
|__a
|  |__b
|     |__c
|__d
   |__e
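
Since the question asks about anytree, here is a minimal sketch of one way to bridge the nested dictionary to it: flatten the dictionary into (parent, child) name pairs. The helper name `edges` and the root label `'l1'` are assumptions for illustration, and the sketch relies on node names being unique.

```python
def edges(tree, parent="l1"):
    """Yield (parent, child) name pairs from the nested dict, depth-first."""
    if isinstance(tree, dict):
        for name, subtree in tree.items():
            yield parent, name
            yield from edges(subtree, name)
    else:
        # A bare string is a leaf hanging off `parent`.
        yield parent, tree


tree = {'a': {'b': 'c'}, 'd': 'e'}  # the value built by the code above
pairs = list(edges(tree))
print(pairs)
# [('l1', 'a'), ('a', 'b'), ('b', 'c'), ('l1', 'd'), ('d', 'e')]
```

Because every parent is yielded before its children, the pairs can be consumed in order: start from `nodes = {'l1': Node('l1')}` and create `Node(child, parent=nodes[parent])` for each pair, storing the result under `nodes[child]`.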

Edit: optional tree formation approach:

The current to_tree method assumes that each parent node's whole parent-child chain is contained in a single list, i.e. ['a', 'b', 'c'] is a complete path of the tree and ['d', 'e'] is a complete path as well. If that may not hold for future inputs, you can use the code below to build the dictionaries:

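def to_tree(d, s, seen = []):
   # Children of s: second elements of the sublists headed by s, skipping nodes already visited
   _l = [b for a, b, *_ in d if a == s and b not in seen]
   # No unvisited child left means s is a leaf; otherwise nest the first child under s
   return s if not _l else {s:to_tree(d, _l[0], seen+[s, _l[0]])}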

data = [['a', 'b', 'c'], ['b', 'c'], ['c'], ['d', 'e'], ['e']]
p = [a[0] for i, a in enumerate(data) if all(a[0] not in c for c in data[:i])]
c = [i for i in data if len(i) > 1]
tree = reduce(lambda x, y:{**x, **y}, [to_tree(c, i) for i in p])
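
Both versions above build a single chain per branch, so siblings (such as `e` and `g` under `d` in the test case raised in the comments) end up nested inside each other. A possible sketch for that case follows. It is not part of the approach above: it assumes unique node names, assumes each sublist is a head followed by all of its descendants, and represents leaves as empty dictionaries instead of plain strings. The names `to_nested` and `render` are hypothetical.

```python
def to_nested(branches):
    """Build {name: {child: {...}}} from [head, *descendants] sublists."""
    # Descendants listed for each head; names without their own sublist have none.
    tails = {head: rest for head, *rest in branches}

    def children_of(names):
        # A direct child is not listed as a descendant of any other candidate.
        direct = [n for n in names
                  if not any(n in tails.get(m, ()) for m in names)]
        return {n: children_of(tails.get(n, [])) for n in direct}

    all_names = list(dict.fromkeys(n for b in branches for n in b))
    return children_of(all_names)


def render(tree, indent=""):
    """Yield one text line per node, indenting by depth."""
    for name, subtree in tree.items():
        yield f"{indent}|__{name}"
        yield from render(subtree, indent + "|  ")


data = [['a', 'b', 'c'], ['b', 'c'], ['c'], ['d', 'e', 'f', 'g'], ['e', 'f'], ['g']]
nested = to_nested(data)
print(nested)
# {'a': {'b': {'c': {}}}, 'd': {'e': {'f': {}}, 'g': {}}}
print("\n".join(render(nested)))
```

Using empty dictionaries for leaves keeps every node the same shape, so a node can hold any number of children without switching between strings, lists, and dictionaries.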
Ajax1234
  • Thanks for the answer @Ajax1234. I tried out your solution, and it worked like a charm. However, when I gave it another test case `data = [['a', 'b', 'c'], ['b', 'c'], ['c'], ['d', 'e', 'f', 'g'], ['e', 'f'], ['g']]`, the tree was returned as follows: `{'a': {'b': 'c'}, 'd': {'e': {'f': 'g'}}}`, instead of `{'a': {'b': 'c'}, 'd': {'e': 'f', 'g'}}`. When represented as a tree, it'll be something like: l1 |__a | |__b | |__c |__d | |__e | |__f | |__g – Ali Abbas Sep 22 '19 at 15:56
  • @AliAbbas Thank you for your comment. `{'a': {'b': 'c'}, 'd': {'e': 'f', 'g'}}` is an invalid dictionary, as `'g'` needs to be paired as a key/value. Can you clarify how your structure should be created? Did you mean that the result should be `{'a': {'b': 'c'}, 'd': {'e':[ 'f', 'g']}}`? – Ajax1234 Sep 22 '19 at 15:59
  • So @Ajax1234, I was aware that it was invalid, but didn't have time to simplify: the result should be something like: `{'a': {'b': 'c'}, 'd': {'e':[ 'f'], 'g':''}}`. I.e: the element `g` should be a child under d, not e, being on the same level as e. Does that make sense? – Ali Abbas Sep 22 '19 at 16:04
  • @AliAbbas The rules for the tree structuring are rather unclear, for instance, you structure `['a', 'b', 'c']` as `{'a': {'b': 'c'}}` (sequence-nested), however, `['d', 'e', 'f', 'g']` becomes `{'d': {'e':[ 'f'], 'g':''}}`. What determines whether or not the last element comes a nested child or a parent with an empty list? – Ajax1234 Sep 22 '19 at 16:09
  • Yes, I agree that they're currently a bit unclear. I'll simplify them in the post too: So, the way the tree is represented is that I have a function that returns a list like `l1`, and has the following logic behind it: Each element in the list has 2 parts. The parent, and the children. The parent is the first element in the list, and everything else is its children, or its children's children and so on. So an element like: `[a, b, c]` and represents all – Ali Abbas Sep 22 '19 at 16:19
  • Yes, I agree that they're currently a bit unclear. I've simplified the post as it got too long to write here – Ali Abbas Sep 22 '19 at 16:38
  • @AliAbbas Thank you very much. I am rather pressed for time, however, I will have a complete solution posted shortly. Thank you again for your clarification. – Ajax1234 Sep 22 '19 at 16:39
  • No problem @Ajax1234, thanks again for everything. Feel free to ask any more questions – Ali Abbas Sep 22 '19 at 16:41