I'm struggling at the moment to work out how to parse some JSON where the nested keys are dynamic rather than fixed, so I'm not sure how to do this with the normal PySpark SQL functions. Is it possible to do the below without a UDF?
```json
{
  "key 1": "value 1",
  "key 2": {
    "key 3": {
      "key 4": {
        "key 5": "value 2",
        "key 6": "value 3"
      },
      "key 7": {
        "key 5": "value 4",
        "key 6": "value 5"
      },
      "key 8": {
        "key 5": "value 6",
        "key 6": "value 7"
      }
    }
  }
}
```
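In case it helps to reproduce: the JSON arrives as a single string column, roughly like this (the `json_str` column name is just a placeholder I'm using here):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# The raw JSON from above, held in one string column.
raw_json = """{
  "key 1": "value 1",
  "key 2": {
    "key 3": {
      "key 4": {"key 5": "value 2", "key 6": "value 3"},
      "key 7": {"key 5": "value 4", "key 6": "value 5"},
      "key 8": {"key 5": "value 6", "key 6": "value 7"}
    }
  }
}"""

df = spark.createDataFrame([(raw_json,)], ["json_str"])
```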
My goal is basically to turn it into the following table:
| A | B | C | D | E | F |
|---------|-------|-------|-------|---------|---------|
| value 1 | key 2 | key 3 | key 4 | value 2 | value 3 |
| value 1 | key 2 | key 3 | key 7 | value 4 | value 5 |
| value 1 | key 2 | key 3 | key 8 | value 6 | value 7 |
If key 4, key 7 and key 8 were all the same key I'd be able to parse this out easily. However, I can't find any examples covering the case where keys 4, 7, 8 and so on are unknown in advance and therefore can't be hard-coded into the path passed to the `get_json_object` function.
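For context, here's roughly what works when the keys *are* fixed; a minimal sketch assuming the `df`/`json_str` setup above, with every key hard-coded into the JSON path (so it only ever reaches the `key 4` branch):

```python
from pyspark.sql import functions as F

fixed = df.select(
    F.get_json_object("json_str", "$['key 1']").alias("A"),
    # With fixed keys, columns B-D are just literals.
    F.lit("key 2").alias("B"),
    F.lit("key 3").alias("C"),
    F.lit("key 4").alias("D"),
    # Works only because "key 4" is hard-coded; I can't see a way to
    # wildcard the "key 4"/"key 7"/"key 8" level with get_json_object.
    F.get_json_object("json_str", "$['key 2']['key 3']['key 4']['key 5']").alias("E"),
    F.get_json_object("json_str", "$['key 2']['key 3']['key 4']['key 6']").alias("F"),
)
```

This gives me the first row of the target table, but I'd need one row per unknown inner key, not just `key 4`.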
Thanks