I have a file containing multiple API responses in JSON format. Each one looks like this:

{
    "address": "0x1j2jfgn1o2n3b1o3jbo12",
    "risk": "Low",
    "cluster": {
        "name": "foobar",
        "category": "foo"
    },
    "addressIdentifications": []
}

Sometimes the addressIdentifications list is populated with one or more dicts:

"addressIdentifications": [
    {
        "name": "foobar",
        "category": "scam",
        "description": "description_goes_here"
    }
]

When calling the API, I load each parsed JSON response into a list called "data":

import json

data = []
data.append(json.loads(response.text))  # repeated once per API response

Then I try to parse and flatten the list into a pandas DataFrame using json_normalize:

# json_normalize already returns a DataFrame, so no extra pd.DataFrame() wrapper is needed
df_out = pd.json_normalize(
    data,
    meta=['address', 'risk', ['cluster', 'name'], ['cluster', 'category']],
    record_path='addressIdentifications',
    record_prefix='addressIdentification_')

This works fine for responses where addressIdentifications is populated. However, it does not work for responses where addressIdentifications is an empty list: the result is an empty DataFrame, and even the meta columns (address, risk, cluster name and category) are not populated. For those responses, a plain pd.json_normalize(data) works fine, but I can't seem to have it both ways.

How can I go through a list of JSON responses and parse each one properly, depending on whether addressIdentifications is populated or not?
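
For reference, here is a minimal sketch of one possible workaround, not necessarily the best one. It assumes it is acceptable to get a single row with NaN identification fields for responses whose addressIdentifications list is empty. The sample records and the name "patched" are hypothetical and only stand in for the real API responses.

```python
import pandas as pd

# Hypothetical sample data mirroring the two response shapes described above.
data = [
    {
        "address": "0xaaa",
        "risk": "Low",
        "cluster": {"name": "foobar", "category": "foo"},
        "addressIdentifications": [],
    },
    {
        "address": "0xbbb",
        "risk": "High",
        "cluster": {"name": "baz", "category": "bar"},
        "addressIdentifications": [
            {
                "name": "foobar",
                "category": "scam",
                "description": "description_goes_here",
            }
        ],
    },
]

# Swap each empty (or missing) list for a single empty record, so that
# json_normalize still emits one row carrying the meta fields.
patched = [
    {**item, "addressIdentifications": item.get("addressIdentifications") or [{}]}
    for item in data
]

df_out = pd.json_normalize(
    patched,
    record_path="addressIdentifications",
    meta=["address", "risk", ["cluster", "name"], ["cluster", "category"]],
    record_prefix="addressIdentification_",
)
```

With this approach, every response contributes at least one row, and responses with several identifications still contribute one row per identification. The original "data" list is left unmodified because the replacement is done on shallow copies.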

tbw875
