I have a file containing multiple API responses in JSON format. Each one looks like this:
{
"address": "0x1j2jfgn1o2n3b1o3jbo12",
"risk": "Low",
"cluster": {
"name": "foobar",
"category": "foo"
},
"addressIdentifications": []
}
Sometimes, that addressIdentifications list is populated with one or more dicts:
"addressIdentifications": [
{
"name": "foobar",
"category": "scam",
"description": "description_goes_here"
}
]
When calling the API, I load all of the JSON responses into a list called "data":
data = []
data.append(json.loads(response.text))
And then I try to parse and flatten the list into a pandas DataFrame using json_normalize:
df_out = pd.DataFrame(
pd.json_normalize(
data,
meta=['address','risk',['cluster','name'],['cluster','category']],
record_path='addressIdentifications',
record_prefix='addressIdentification_'))
This works fine for the responses where addressIdentifications is populated. However, it does not work for those where addressIdentifications is just an empty list: it returns an empty DataFrame, without even populating the other columns. In that case, a plain pd.json_normalize(data) works fine, but I can't seem to have it both ways.
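Here is a minimal sketch that reproduces what I'm seeing. The two responses are made up for illustration (the addresses "0xaaa" and "0xbbb" are placeholders, not real API output); the json_normalize call is the same as above, minus the outer pd.DataFrame wrapper.

```python
import pandas as pd

# Two hypothetical responses: the first has an empty list, the second is populated.
data = [
    {
        "address": "0xaaa",
        "risk": "Low",
        "cluster": {"name": "foobar", "category": "foo"},
        "addressIdentifications": [],
    },
    {
        "address": "0xbbb",
        "risk": "Low",
        "cluster": {"name": "foobar", "category": "foo"},
        "addressIdentifications": [
            {
                "name": "foobar",
                "category": "scam",
                "description": "description_goes_here",
            }
        ],
    },
]

meta = ["address", "risk", ["cluster", "name"], ["cluster", "category"]]

df_out = pd.json_normalize(
    data,
    record_path="addressIdentifications",
    meta=meta,
    record_prefix="addressIdentification_",
)

# Only the populated response produces a row; the response whose
# addressIdentifications list is empty contributes no rows at all.
print(len(df_out))
print(df_out["address"].tolist())
```

So the response with the empty list silently disappears from the result, which is the part I want to avoid.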
How can I go through a list of JSON responses and parse them properly depending on whether addressIdentifications is populated or not?
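One workaround I have been considering, sketched here under assumptions, is to swap every empty addressIdentifications list for a single placeholder record before normalizing, so that each response keeps at least one row. The helper name normalize_with_placeholder and the PLACEHOLDER dict are my own hypothetical names, and the sample responses are made up; I am not sure this is the idiomatic way to do it.

```python
import pandas as pd

# Hypothetical placeholder record with the same keys as a real identification.
PLACEHOLDER = {"name": None, "category": None, "description": None}

META = ["address", "risk", ["cluster", "name"], ["cluster", "category"]]


def normalize_with_placeholder(responses):
    """Flatten responses, keeping one row for those with no identifications."""
    patched = []
    for resp in responses:
        resp = dict(resp)  # shallow copy so the caller's dicts are not modified
        if not resp.get("addressIdentifications"):
            resp["addressIdentifications"] = [dict(PLACEHOLDER)]
        patched.append(resp)
    return pd.json_normalize(
        patched,
        record_path="addressIdentifications",
        meta=META,
        record_prefix="addressIdentification_",
    )


# Made-up sample responses, one empty and one populated.
sample = [
    {
        "address": "0xaaa",
        "risk": "Low",
        "cluster": {"name": "foobar", "category": "foo"},
        "addressIdentifications": [],
    },
    {
        "address": "0xbbb",
        "risk": "Low",
        "cluster": {"name": "foobar", "category": "foo"},
        "addressIdentifications": [
            {"name": "foobar", "category": "scam", "description": "description_goes_here"}
        ],
    },
]

df_all = normalize_with_placeholder(sample)
print(df_all[["address", "addressIdentification_category"]])
```

With this, the response with the empty list would still get a row, with missing values in the addressIdentification_ columns, but it relies on knowing the record keys up front.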