0

I have a complex list (json file) and I am trying to flatten it in order to export it to a CSV file.

Some notes:

  • The number of the objects in some nests may vary
  • Some values can be null (empty)
  • The data is a key, therefore I need to manipulate it just like I did a while ago in a previous question but in Javascript: LINK

My JSON is given below:

[
    {
        "masterName": "FirstOne",
        "mainNames": [
            {
                "numbers": {},
                "Name": "PlacedValue1",
                "Type": "zzz"
            },
            {
                "numbers": {
                    "2019-05-17T00:00:00Z": {
                        "NumberOne": 2.0,
                        "NumberTwo": 0.0
                    },
                    "2019-05-29T00:00:00Z": {
                        "NumberOne": 89153.0,
                        "NumberTwo": 18.0
                    },
                    "2019-05-30T00:00:00Z": {
                        "NumberOne": 14.0,
                        "NumberTwo": 0.0
                    }
                },
                "Name": "PlacedValue2",
                "Type": "zzz"
            },
            {
                "numbers": {
                    "2019-05-29T00:00:00Z": {
                        "NumberOne": 219737.0,
                        "NumberTwo": 85.0
                    },
                    "2019-05-30T00:00:00Z": {
                        "NumberOne": 261415.0,
                        "NumberTwo": 116.0
                    }
                },
                "Name": "PlacedValue3",
                "Type": "zzz"
            }
        ]
    },
    {
        "masterName": "SecondOne",
        "mainNames": [
            {
                "numbers": {
                    "2019-05-17T00:00:00Z": {
                        "NumberOne": 2.0,
                        "NumberTwo": 0.0
                    },
                    "2019-05-29T00:00:00Z": {
                        "NumberOne": 89153.0,
                        "NumberTwo": 18.0
                    }
                },
                "Name": "PlacedValue3",
                "Type": "zzz"
            },
            {
                "numbers": {
                    "2019-05-29T00:00:00Z": {
                        "NumberOne": 219737.0,
                        "NumberTwo": 85.0
                    }
                },
                "Name": "PlacedValue4",
                "Type": "zzz"
            }
        ]
    }
]

I am trying to now focus on passing the dictionary key (date in this case) as an element in the equivalent dictionary. For instance, I want this:

                "2019-05-17T00:00:00Z": {
                    "NumberOne": 2.0,
                    "NumberTwo": 0.0
                }

to be:

                {
                    "date" : "2019-05-17T00:00:00Z"
                    "NumberOne": 2.0,
                    "NumberTwo": 0.0
                }

but that fails so far (comments in the code):

json_array2 = { "2019-05-19T00:00:00Z": { "one": 185, }, "2019-04-25T00:00:00Z": { "two": 207, } }

#The idea is to add the date as a new item in the dictionary as a start
for v in json_array2:
    key = v
    json_array['date'] = v

print(json_array2)
Datacrawler
  • 2,780
  • 8
  • 46
  • 100
  • So you want all the `"numbers"` to have multiple dates as keys? That's not possible. The keys all need to be unique. Basically, what would the new key be? – Akaisteph7 Jul 23 '19 at 21:01
  • @Akaisteph7 The ["mainNames"][0] have no keys either. I want to have an object with the arrays that will include the NumberOne, NumberTwo and the newly created field Date – Datacrawler Jul 23 '19 at 21:14

2 Answers2

1

For each individual one of those number dicts, this will change the nested dict structure into a list of dicts:

numbers_dict = {
    "2019-05-17T00:00:00Z": {
        "NumberOne": 2.0,
        "NumberTwo": 0.0
    },
    "2019-05-29T00:00:00Z": {
        "NumberOne": 89153.0,
        "NumberTwo": 18.0
    },
    "2019-05-30T00:00:00Z": {
        "NumberOne": 14.0,
        "NumberTwo": 0.0
    }
}

numbers_dict = [{"date":key, **value} for key, value in numbers_dict.items()]

After which, you can apply this to whichever fields in your JSON you want, in the usual way. Example:

for item in my_json:
    for name in item['mainNames']:
        name['numbers'] = [{"date":key, **value} for key, value in name['numbers'].items()]
Green Cloak Guy
  • 23,793
  • 4
  • 33
  • 53
  • That didn't work. No matter what value I put in the **value, it returns an error. – Datacrawler Jul 23 '19 at 21:10
  • What error is it returning, and what version of python are you running? This should work on python 3.5+ - the `**` is the dict unpacking operator, and is referring to the `value` named in the list comprehension. – Green Cloak Guy Jul 23 '19 at 21:16
  • Surprisingly I was using version 2.7.15. I was sure I had 3+. – Datacrawler Jul 23 '19 at 21:32
0

This script will change all keys numbers from dict to list of dicts and put the dates as keys in those dicts:

data = [
    {
        "masterName": "FirstOne",
        "mainNames": [
            {
                "numbers": {},
                "Name": "PlacedValue1",
                "Type": "zzz"
            },
            {
                "numbers": {
                    "2019-05-17T00:00:00Z": {
                        "NumberOne": 2.0,
                        "NumberTwo": 0.0
                    },
                    "2019-05-29T00:00:00Z": {
                        "NumberOne": 89153.0,
                        "NumberTwo": 18.0
                    },
                    "2019-05-30T00:00:00Z": {
                        "NumberOne": 14.0,
                        "NumberTwo": 0.0
                    }
                },
                "Name": "PlacedValue2",
                "Type": "zzz"
            },
            {
                "numbers": {
                    "2019-05-29T00:00:00Z": {
                        "NumberOne": 219737.0,
                        "NumberTwo": 85.0
                    },
                    "2019-05-30T00:00:00Z": {
                        "NumberOne": 261415.0,
                        "NumberTwo": 116.0
                    }
                },
                "Name": "PlacedValue3",
                "Type": "zzz"
            }
        ]
    },
    {
        "masterName": "SecondOne",
        "mainNames": [
            {
                "numbers": {
                    "2019-05-17T00:00:00Z": {
                        "NumberOne": 2.0,
                        "NumberTwo": 0.0
                    },
                    "2019-05-29T00:00:00Z": {
                        "NumberOne": 89153.0,
                        "NumberTwo": 18.0
                    }
                },
                "Name": "PlacedValue3",
                "Type": "zzz"
            },
            {
                "numbers": {
                    "2019-05-29T00:00:00Z": {
                        "NumberOne": 219737.0,
                        "NumberTwo": 85.0
                    }
                },
                "Name": "PlacedValue4",
                "Type": "zzz"
            }
        ]
    }
]

import json

def change_keys(d):
    for k, v in d.items():
        if k=='numbers':
            new_v = []
            for kk, vv in v.items():
                new_v.append({'date': kk})
                new_v[-1].update(**vv)
            new_d = {'numbers':new_v}
            d2 = d.copy()
            del d2['numbers']
            new_d.update(**d2)
            return new_d
    return d

new_d = json.loads(json.dumps(data), object_hook=change_keys)

print(json.dumps(new_d, indent=4))

Prints:

[
    {
        "masterName": "FirstOne",
        "mainNames": [
            {
                "numbers": [],
                "Name": "PlacedValue1",
                "Type": "zzz"
            },
            {
                "numbers": [
                    {
                        "date": "2019-05-17T00:00:00Z",
                        "NumberOne": 2.0,
                        "NumberTwo": 0.0
                    },
                    {
                        "date": "2019-05-29T00:00:00Z",
                        "NumberOne": 89153.0,
                        "NumberTwo": 18.0
                    },
                    {
                        "date": "2019-05-30T00:00:00Z",
                        "NumberOne": 14.0,
                        "NumberTwo": 0.0
                    }
                ],
                "Name": "PlacedValue2",
                "Type": "zzz"
            },
            {
                "numbers": [
                    {
                        "date": "2019-05-29T00:00:00Z",
                        "NumberOne": 219737.0,
                        "NumberTwo": 85.0
                    },
                    {
                        "date": "2019-05-30T00:00:00Z",
                        "NumberOne": 261415.0,
                        "NumberTwo": 116.0
                    }
                ],
                "Name": "PlacedValue3",
                "Type": "zzz"
            }
        ]
    },
    {
        "masterName": "SecondOne",
        "mainNames": [
            {
                "numbers": [
                    {
                        "date": "2019-05-17T00:00:00Z",
                        "NumberOne": 2.0,
                        "NumberTwo": 0.0
                    },
                    {
                        "date": "2019-05-29T00:00:00Z",
                        "NumberOne": 89153.0,
                        "NumberTwo": 18.0
                    }
                ],
                "Name": "PlacedValue3",
                "Type": "zzz"
            },
            {
                "numbers": [
                    {
                        "date": "2019-05-29T00:00:00Z",
                        "NumberOne": 219737.0,
                        "NumberTwo": 85.0
                    }
                ],
                "Name": "PlacedValue4",
                "Type": "zzz"
            }
        ]
    }
]

EDIT: Preserve "Name" and "Type" keys also.

Andrej Kesely
  • 168,389
  • 15
  • 48
  • 91
  • I just noticed that it doesn't work (without getting any error) when there numbers are empty (e.g. `"numbers": {}`). Should I exclude all those parent branches from the string or is it a quick fix in your code? – Datacrawler Jul 24 '19 at 09:15
  • 1
    @ApoloRadomer When `"numbers": {}` the script converts it to empty list `"numbers": []`. If you don't want empty lists, you could exclude it -> don't call `return new_d` in `change_keys()` in this case – Andrej Kesely Jul 24 '19 at 09:20