I'd like to build a pandas DataFrame (or a tuple) from an anytree tree in which each node carries a list attribute called members:
    from anytree import Node, RenderTree, find_by_attr
    from anytree.exporter import DictExporter
    from collections import OrderedDict
    import pandas as pd
    import numpy as np

    # Root node (tier 0) holds every member
    tree = Node('T0C0',
                n=1000,
                tier=0,
                members=['A', 'B', 'C', 'D'])

    # Tier 1: two children of the root
    Node('T0C0.T1C0',
         parent=find_by_attr(tree, 'T0C0'),
         n=400,
         tier=1,
         members=['B', 'C'])
    Node('T0C0.T1C1',
         parent=find_by_attr(tree, 'T0C0'),
         n=600,
         tier=1,
         members=['A', 'D'])

    # Tier 2: two children of T0C0.T1C1
    Node('T0C0.T1C1.T2C0',
         parent=find_by_attr(tree, 'T0C0.T1C1'),
         n=300,
         tier=2,
         members=['D'])
    Node('T0C0.T1C1.T2C1',
         parent=find_by_attr(tree, 'T0C0.T1C1'),
         n=300,
         tier=2,
         members=['A'])
My goal is to produce a DataFrame with one row per member that gives its end node or, even better, its node at each tier in a separate column, like the following:
    pd.DataFrame(
        data=np.array([
            ['T0C0.T1C1.T2C1', 'T0C0.T1C0', 'T0C0.T1C0', 'T0C0.T1C1.T2C0'],  # EndCluster
            ['T0C0', 'T0C0', 'T0C0', 'T0C0'],                                # tier0
            ['T0C0.T1C1', 'T0C0.T1C0', 'T0C0.T1C0', 'T0C0.T1C1'],            # tier1
            ['T0C0.T1C1.T2C1', None, None, 'T0C0.T1C1.T2C0'],                # tier2
        ]).T,  # transpose so that rows are members and columns are tiers
        index=['A', 'B', 'C', 'D'],
        columns=['EndCluster', 'tier0', 'tier1', 'tier2'])
I've tried exporting the tree to an OrderedDict and to JSON and building DataFrames directly from those, but "children" becomes a column in the resulting DataFrame, with nested OrderedDict entries in each cell, and I cannot find a way to unnest it. Thank you for any help!
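To make the unnesting problem concrete, here is a rough sketch of the direction I imagine could work. It is not something I have verified against the real export. It assumes the exported dict has node attributes plus an optional "children" list, and that every member ends up in exactly one leaf. The dict literal is typed by hand as a stand-in for the export, and the helper name member_rows is made up for illustration:

```python
# Hand-written stand-in for the nested dict that exporting the tree yields
# (assumed shape: node attributes plus an optional "children" list).
exported = {
    "name": "T0C0", "tier": 0, "members": ["A", "B", "C", "D"],
    "children": [
        {"name": "T0C0.T1C0", "tier": 1, "members": ["B", "C"]},
        {"name": "T0C0.T1C1", "tier": 1, "members": ["A", "D"],
         "children": [
             {"name": "T0C0.T1C1.T2C0", "tier": 2, "members": ["D"]},
             {"name": "T0C0.T1C1.T2C1", "tier": 2, "members": ["A"]},
         ]},
    ],
}


def member_rows(node, path=()):
    """Yield one flat row per member of every leaf node (hypothetical helper)."""
    path = path + (node["name"],)
    children = node.get("children", [])
    if not children:
        for member in node["members"]:
            row = {"member": member, "EndCluster": node["name"]}
            # Position in the root-to-leaf path doubles as the tier number.
            row.update({f"tier{i}": name for i, name in enumerate(path)})
            yield row
    for child in children:
        yield from member_rows(child, path)


rows = sorted(member_rows(exported), key=lambda r: r["member"])
```

If that is on the right track, I would expect pd.DataFrame(rows).set_index('member') to give something close to the layout above, with the missing tier2 entries for B and C showing up as NaN rather than None.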