0

I have a problem with a very nested dictionary in Python. It is of the following example:

mydict = {'Carcompany': {'Brand': 'Ford',
  'Factory': {'Country': 'Germany',
   'Workers': {'Engineers': '150', 'Mechanics': '200'},
   'FactoryDetails': [{'FactoryCode': '1',
     'FactoryCity': 'Cologne',
     'FactoryCityDetail': [{'Shifts': 'Day',
       'Days': '7'},
      {'Shifts': 'Night', 'Days': '5'}]},
    {'FactoryCode': '2',
     'FactoryCity': 'Berlin',
     'FactoryCityDetail': {'Shifts': 'Night',
      'Days': '5'}}]}}}

So there are values containing list of dicts containing values with lists of dicts again. The depth varies and my goal is an excel spreadsheet or a relational database table like this:

enter image description here

I have tried to flatten the dictionary by pd.io.json.json_normalize() to get a pandas dataframe, find the lists, explode the columns and flatten them again. This solution is really slow and fails with deeper and more nested situations.

My idea is to use a recursive approach, but everything I have found on stackoverflow so far fails when it comes to lists of dictionaries (because of updating dict values) or return enumerated single lines for each value.

pscl
  • 15
  • 1
  • 4

1 Answers1

1

Not sure if you've looked into flatten_json Flatten JSON. I used it to write a recursive function for my complex JSON - flatten_json recursive flattening function for lists

NYC Coder
  • 7,424
  • 2
  • 11
  • 24