
I want to analyze data from ArangoDB. The data is available as a tree structure, and I now want to analyze it with Pandas. I have used Pandas before, but those datasets were all flat, e.g. name, date, price, ... (one record per line, like a CSV). Below is an example of what my dataset looks like.

  • What is the best option for analyzing this data?
  • How can I break the dataset down into a 'normal' flat structure, e.g. CSV?

What my data looks like

└───dataset
    ├───createdAt
    ├───currency
    ├───date
    ├───lineItems
    │   ├───createdAt
    │   ├───customer
    │   │   ├───id
    │   │   └───plant
    │   ├───id
    │   ├───price
    │   └───unit
    ├───metaData
    │   └───originSystem
    ├───netPrice
    │   └───0
    │       └───netPrice
    └───payment
        ├───adress
        │   ├───name
        │   └───street
        └───number

I know that pandas.json_normalize exists, but unfortunately, the dataset is more complex and I have more than one dataset with a tree structure to analyze.

Example

import pandas as pd

# `result` holds the document returned from ArangoDB (a nested dict)
df = pd.json_normalize(result['dataset']['lineItems'])

# I could get the dataset as a dict
# dict_arangodb = ArangoDB(...)
# ...
# df = pd.json_normalize(dict_arangodb)
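
For reference, `pandas.json_normalize` can also flatten one branch while carrying selected parent fields along, via its `record_path`, `meta` and `record_prefix` parameters. The sketch below is only an illustration: the `result` dict and all of its values are invented to match the tree above, and it assumes that `lineItems` is a list of dicts.

```python
import pandas as pd

# Hypothetical sample document shaped like the tree in the question.
# All values are invented; `lineItems` is ASSUMED to be a list of dicts.
result = {
    "dataset": {
        "createdAt": "2022-04-01T10:00:00",
        "currency": "EUR",
        "date": "2022-04-01",
        "lineItems": [
            {"createdAt": "2022-04-01T10:00:00", "id": "li1", "price": 10.0,
             "unit": "kg", "customer": {"id": "c1", "plant": "p1"}},
            {"createdAt": "2022-04-01T10:05:00", "id": "li2", "price": 20.0,
             "unit": "kg", "customer": {"id": "c2", "plant": "p2"}},
        ],
        "metaData": {"originSystem": "erp"},
        "netPrice": [{"netPrice": 30.0}],
        "payment": {"adress": {"name": "ACME", "street": "Main St 1"},
                    "number": "42"},
    }
}

# One row per line item; nested dicts inside each item become dotted columns.
# `meta` repeats the chosen parent fields on every row, and `record_prefix`
# keeps item columns apart from parent columns with the same name.
line_items = pd.json_normalize(
    result["dataset"],
    record_path="lineItems",
    meta=["currency", "date", ["payment", "number"]],
    record_prefix="lineItem.",
)

print(line_items.columns.tolist())
```

The resulting frame is flat, so it can be written out with `line_items.to_csv(...)` like any other DataFrame. Which parent fields belong in `meta` depends on the analysis and has to be chosen per dataset.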
  • Analyze how exactly? Simply keeping it as a tree is probably a good solution for most scenarios. – tripleee Apr 11 '22 at 09:07
  • Guessing a bit as to what those labels mean, perhaps break out `lineItems` and maybe `payment` into separate data frames. – tripleee Apr 11 '22 at 09:08
  • @tripleee For example, by plotting some different charts. And is there already a solution for breaking out such a tree structure automatically? – Test Apr 11 '22 at 09:29
  • I'm not sure what you mean by "automatically". I would not expect there to be a standard mechanism in Python or Pandas to figure out the structure of arbitrary data without any guidance from you. – tripleee Apr 11 '22 at 09:44
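
The suggestion in the comments to break branches out into separate data frames could be sketched roughly as follows. This is not a built-in Pandas feature: `split_tree` is a hypothetical helper, the sample `doc` is invented, and the rule it applies (every top-level list of dicts becomes its own table, everything else is flattened into a single row) is just one possible convention.

```python
import pandas as pd


def split_tree(doc, root_name="root"):
    """Split one nested document into several flat DataFrames.

    Hypothetical helper: each top-level key holding a non-empty list of
    dicts becomes its own table; all remaining fields (scalars and nested
    dicts) are flattened into a single-row table stored under `root_name`.
    """
    scalars = {}
    tables = {}
    for key, value in doc.items():
        is_record_list = (
            isinstance(value, list)
            and len(value) > 0
            and all(isinstance(item, dict) for item in value)
        )
        if is_record_list:
            tables[key] = pd.json_normalize(value)
        else:
            scalars[key] = value
    # Note: a document key equal to `root_name` would be overwritten here.
    tables[root_name] = pd.json_normalize(scalars)
    return tables


# Invented sample data, shaped loosely like the tree in the question.
doc = {
    "currency": "EUR",
    "payment": {"adress": {"name": "ACME"}, "number": "42"},
    "lineItems": [
        {"id": "a", "customer": {"id": "c1"}},
        {"id": "b", "customer": {"id": "c2"}},
    ],
    "netPrice": [{"netPrice": 30.0}],
}

tables = split_tree(doc)
for name, frame in tables.items():
    print(name, frame.columns.tolist())
```

Because the helper only inspects the top level, lists nested deeper in the tree stay inside a single cell; handling those would need recursion plus a key to link child rows back to their parent.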
