Your expected output is not valid. I would probably go for a nested dictionary like this instead:
{
    "Germany": {
        "cases": {
            "last_cases_value": 6025,
            "updated_cases_value": 6026,
            "change": 1
        },
        "active": {
            "last_active_value": 100027,
            "updated_active_value": 100026,
            "change": -1
        },
        "deaths": {
            "last_deaths_value": 1704,
            "updated_deaths_value": 1706,
            "change": 2
        }
    },
    "Australia": {
        "cases": {
            "last_cases_value": 3045,
            "updated_cases_value": 3046,
            "change": 1
        },
        "active": {
            "last_active_value": 100027,
            "updated_active_value": 100028,
            "change": 1
        }
    }
}
To get the above, I would first convert your list of dictionaries into nested dictionaries where 'country' is the key:
last_data = [{'country': 'USA', 'cases': 10425, 'deaths': 1704, 'recovered': 2525, 'active': 100027},
             {'country': 'Australia', 'cases': 3045, 'deaths': 1704, 'recovered': 2525, 'active': 100027},
             {'country': 'Germany', 'cases': 6025, 'deaths': 1704, 'recovered': 2525, 'active': 100027}]

current_data = [{'country': 'USA', 'cases': 10425, 'deaths': 1704, 'recovered': 2525, 'active': 100027},
                {'country': 'Australia', 'cases': 3046, 'deaths': 1704, 'recovered': 2525, 'active': 100028},
                {'country': 'Germany', 'cases': 6026, 'deaths': 1706, 'recovered': 2525, 'active': 100026}]

def list_dicts_to_nested_dict(key, lst):
    # Use the value stored under `key` as the outer key and keep the remaining fields
    return {dic[key]: {k: v for k, v in dic.items() if k != key} for dic in lst}

last_data_dict = list_dicts_to_nested_dict('country', last_data)
# {'USA': {'cases': 10425, 'deaths': 1704, 'recovered': 2525, 'active': 100027}, 'Australia': {'cases': 3045, 'deaths': 1704, 'recovered': 2525, 'active': 100027}, 'Germany': {'cases': 6025, 'deaths': 1704, 'recovered': 2525, 'active': 100027}}
current_data_dict = list_dicts_to_nested_dict('country', current_data)
# {'USA': {'cases': 10425, 'deaths': 1704, 'recovered': 2525, 'active': 100027}, 'Australia': {'cases': 3046, 'deaths': 1704, 'recovered': 2525, 'active': 100028}, 'Germany': {'cases': 6026, 'deaths': 1706, 'recovered': 2525, 'active': 100026}}
This is also a good idea because looking up a specific country's data becomes O(1) on average instead of O(N) from scanning the whole list. It also makes it easier to intersect the countries later, which I will show below.
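To make the lookup difference concrete, here is a small sketch comparing the two shapes. The helper name `find_in_list` and the two-country sample are made up for illustration and are not part of the code in this answer.

```python
# Sample data in the original list-of-dicts shape (illustrative values only).
records = [
    {'country': 'USA', 'cases': 10425},
    {'country': 'Germany', 'cases': 6025},
]

def find_in_list(lst, country):
    # O(N): walks the list until a matching country is found.
    for dic in lst:
        if dic['country'] == country:
            return dic
    return None

# The same data keyed by country, built the same way as list_dicts_to_nested_dict.
by_country = {dic['country']: {k: v for k, v in dic.items() if k != 'country'}
              for dic in records}

print(find_in_list(records, 'Germany')['cases'])  # found by scanning the list
print(by_country['Germany']['cases'])             # found by a single hash lookup
```

Both lines print the same value; the difference is only in how much work is done to find it.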
Then add the changed data to a two-level nested collections.defaultdict of dict, since it handles initializing new keys for you. You can have a look at this Nested defaultdict of defaultdict answer for more information and other ways of doing this.
from collections import defaultdict

result = defaultdict(lambda: defaultdict(dict))

# Get the intersecting keys.
# Avoids a KeyError if a country is missing from one of the dictionaries
for country in last_data_dict.keys() & current_data_dict.keys():

    # Only deal with dictionaries that have changed
    if last_data_dict[country] != current_data_dict[country]:

        # Get intersecting keys between both dictionaries
        for key in last_data_dict[country].keys() & current_data_dict[country].keys():

            # Calculate the change between updated and previous data
            change = current_data_dict[country][key] - last_data_dict[country][key]

            # We only care about data that has changed
            # Insert data into dictionary
            if change != 0:
                result[country][key][f"last_{key}_value"] = last_data_dict[country][key]
                result[country][key][f"updated_{key}_value"] = current_data_dict[country][key]
                result[country][key]["change"] = change
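If you would rather not use defaultdict at all, the same loop can be written against plain dicts with `setdefault`. This is only a sketch of that alternative: the function name `diff_nested` and the small sample snapshots are hypothetical, and it assumes every shared field holds a number.

```python
def diff_nested(last, current):
    """Return {country: {field: {...}}} for fields whose value changed.

    Hypothetical helper: assumes both arguments are dicts keyed by country
    whose values are dicts of numeric fields.
    """
    result = {}
    # Only countries present in both snapshots can be compared.
    for country in last.keys() & current.keys():
        old, new = last[country], current[country]
        # Only fields present in both snapshots can be compared.
        for key in old.keys() & new.keys():
            change = new[key] - old[key]
            if change != 0:
                # setdefault creates the per-country dict on first use.
                result.setdefault(country, {})[key] = {
                    f"last_{key}_value": old[key],
                    f"updated_{key}_value": new[key],
                    "change": change,
                }
    return result

last = {'Germany': {'cases': 6025, 'deaths': 1704}, 'USA': {'cases': 10425}}
current = {'Germany': {'cases': 6026, 'deaths': 1704}, 'USA': {'cases': 10425}}
print(diff_nested(last, current))
# {'Germany': {'cases': {'last_cases_value': 6025, 'updated_cases_value': 6026, 'change': 1}}}
```

The result here is already a plain dict, so it can be printed or dumped without any conversion step.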
Then you can serialize and output the above data as a JSON-formatted string with json.dumps, since it's easier to output a nested defaultdict this way than to convert the whole data structure to dict recursively or by some other method. defaultdict is a subclass of dict anyway, so it can be treated like a normal dictionary.
from json import dumps

print(dumps(result, indent=4))
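As a quick check of the claim that json.dumps accepts a defaultdict directly, the sketch below builds a tiny nested one (the name `nested` and its single sample value are illustrative only) and round-trips it through JSON.

```python
from collections import defaultdict
from json import dumps, loads

nested = defaultdict(lambda: defaultdict(dict))
nested['Germany']['cases']['change'] = 1  # intermediate levels are created on access

text = dumps(nested, indent=4)  # accepted because defaultdict is a subclass of dict
print(text)

# Parsing the string back yields plain dicts, with no defaultdict left over.
plain = loads(text)
print(type(plain['Germany']))
```

Note that a JSON round trip turns every key into a string, which is harmless here because the country and field names already are strings.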
Additionally, if you don't care about the output format, then printing the defaultdict directly is an easy option as well:
print(result)
# defaultdict(<function <lambda> at 0x000002355BC3AA60>, {'Australia': defaultdict(<class 'dict'>, {'cases': {'last_cases_value': 3045, 'updated_cases_value': 3046, 'change': 1}, 'active': {'last_active_value': 100027, 'updated_active_value': 100028, 'change': 1}}), 'Germany': defaultdict(<class 'dict'>, {'deaths': {'last_deaths_value': 1704, 'updated_deaths_value': 1706, 'change': 2}, 'cases': {'last_cases_value': 6025, 'updated_cases_value': 6026, 'change': 1}, 'active': {'last_active_value': 100027, 'updated_active_value': 100026, 'change': -1}})})
As an optional extra step, as mentioned above, we could write a recursive function that converts the nested defaultdict into a normal dictionary whose sub-levels are of type dict:
from pprint import pprint

def defaultdict_to_dict(dd):
    result = {}
    for k, v in dd.items():
        # Recurse into nested defaultdicts; copy every other value as-is
        if isinstance(v, defaultdict):
            result[k] = defaultdict_to_dict(v)
        else:
            result[k] = v
    return result

pprint(defaultdict_to_dict(result))
Which works as intended:
{'Australia': {'active': {'change': 1,
                          'last_active_value': 100027,
                          'updated_active_value': 100028},
               'cases': {'change': 1,
                         'last_cases_value': 3045,
                         'updated_cases_value': 3046}},
 'Germany': {'active': {'change': -1,
                        'last_active_value': 100027,
                        'updated_active_value': 100026},
             'cases': {'change': 1,
                       'last_cases_value': 6025,
                       'updated_cases_value': 6026},
             'deaths': {'change': 2,
                        'last_deaths_value': 1704,
                        'updated_deaths_value': 1706}}}
You can have a look at the full implementation on ideone.com.