I'm currently trying to explode a struct whose keys repeat in pairs (AFA/AFB).

{
    "A": [{
        "AA": {
            "AB": "21",
            "AC": "R",
            "AD": "20222832522117601",
            "AE": "2",
            "AF": {
                "AFA": "3",
                "AFB": "3",
                "AFA_1": "2",
                "AFB_1": "2",
                "AFA_2": "4",
                "AFB_2": "4",
                "AFA_3": "6",
                "AFB_3": "6",
                "AFA_4": "1",
                "AFB_4": "1",
                 .
                 .
                 .
                "AFA_99": "111",
                "AFB_99": "111"
            },
            "AG": "3",
            "AH": "03"
        },
        "AAA": {
            "AAB": "22",
            "AAC": "1",
            "AAD": ""
        }
    }]
}

Expected output:

AF
(3,3),(2,2)

What I tried:

val test = sid28Cols_Struct.select(col("AF.*"))

val test_new = test.withColumn("a", explode(from_json(toJsonArray($"AF.AFA"), schema)))

I used the explode function, but it only produced:

AFA AFB
,3 ,4

What is the best way to avoid the ambiguous-column error while exploding the struct?

Update: I was able to rename the columns to unique names with Python before loading the data into a dataframe, and I can now get each column's value, but I still can't get them as tuples:

AF
(3,3),(2,2),(4,4),(6,6),(1,1)....(111,111)
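
One way to get from the renamed keys to tuples is to pair each AFA key with the AFB key that carries the same suffix. This is only a plain-Python sketch on a dict, not a Spark solution. The helper name `pair_by_suffix` is made up for illustration, and it assumes the unsuffixed AFA/AFB pair comes first and the rest are numbered _1, _2, and so on, as in the sample above.

```python
import re

def pair_by_suffix(af, left="AFA", right="AFB"):
    """Return [(AFA, AFB), (AFA_1, AFB_1), ...] ordered by numeric suffix."""
    # Matches "AFA" or "AFA_<digits>" and captures the digits if present.
    pattern = re.compile(rf"^{re.escape(left)}(?:_(\d+))?$")
    indexes = []
    for key in af:
        match = pattern.match(key)
        if match:
            # The unsuffixed key is treated as index 0 (assumption).
            indexes.append(int(match.group(1) or 0))
    pairs = []
    for i in sorted(indexes):
        suffix = "" if i == 0 else f"_{i}"
        # .get yields None if the partner key is missing.
        pairs.append((af[left + suffix], af.get(right + suffix)))
    return pairs

# Values taken from the first three pairs of the sample "AF" object.
af = {"AFA": "3", "AFB": "3", "AFA_1": "2", "AFB_1": "2", "AFA_2": "4", "AFB_2": "4"}
print(pair_by_suffix(af))  # [('3', '3'), ('2', '2'), ('4', '4')]
```

Sorting by the numeric suffix rather than by the key string keeps _10 after _9, which matters once the suffixes run up to _99.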

1 Answer

That seems to be invalid JSON. Perhaps rename the duplicate keys in the object before attempting to process?
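
A rough sketch of that renaming step in Python, assuming the raw text really contains repeated keys inside one object. Python's `json.loads` keeps only the last value for a repeated key by default, so the sketch uses its `object_pairs_hook` parameter to see every pair. The helper name `rename_duplicates` and the sample string are made up for illustration; the _1, _2 suffix scheme mirrors the key names in the question.

```python
import json
from collections import Counter

def rename_duplicates(pairs):
    """Build a dict from (key, value) pairs, suffixing repeated keys with _1, _2, ..."""
    seen = Counter()
    result = {}
    for key, value in pairs:
        count = seen[key]
        # First occurrence keeps its name; later ones get a numeric suffix.
        new_key = key if count == 0 else f"{key}_{count}"
        result[new_key] = value
        seen[key] += 1
    return result

# Hypothetical input with repeated keys, modelled on the "AF" object.
raw = '{"AF": {"AFA": "3", "AFB": "3", "AFA": "2", "AFB": "2", "AFA": "4", "AFB": "4"}}'

# The hook is called once per JSON object with its key/value pairs in document order.
parsed = json.loads(raw, object_pairs_hook=rename_duplicates)
print(parsed["AF"])
# {'AFA': '3', 'AFB': '3', 'AFA_1': '2', 'AFB_1': '2', 'AFA_2': '4', 'AFB_2': '4'}
```

This does not guard against a generated name such as AFA_1 colliding with a key that already exists in the input; that would need an extra check.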

Rimer