
I have a Glue column whose data type in Glue is struct<quantity:bigint,unit:bigint>

However, when Spark infers this schema, it converts the Glue type to Spark metadata and saves it in the Glue table properties as follows:

{
    "name": "columnName",
    "type": {
        "type": "struct",
        "fields": [
            {
                "name": "quantity",
                "type": "long",
                "nullable": true,
                "metadata": {}
            },
            {
                "name": "unit",
                "type": "long",
                "nullable": true,
                "metadata": {}
            }
        ]
    },
    "nullable": true,
    "metadata": {}
}

Is there a library or built-in function in Glue or Spark that can convert a Glue column type to Spark metadata in Java? I need to convert these Glue data types to Spark metadata.

The incoming Glue column data types can also be nested structures of maps, arrays, and structs.

Another example of a Glue data type:

struct<column1:string,averageHeight:double,employeeName:string,firstName:string,secondName:string,listOfBooks:bigint,price:bigint,studentId:bigint,offerPrice:struct<quantityOfBooks:bigint,class:bigint>,bookStore:string,reviewCount:bigint,author:string,title:string,studentRollNo:string>

Spark conversion:

{
    "name": "studentData",
    "type": {
        "type": "struct",
        "fields": [
            {
                "name": "column1",
                "type": "string",
                "nullable": true,
                "metadata": {}
            },
            {
                "name": "averageHeight",
                "type": "double",
                "nullable": true,
                "metadata": {}
            },
            {
                "name": "employeeName",
                "type": "string",
                "nullable": true,
                "metadata": {}
            },
            {
                "name": "firstName",
                "type": "string",
                "nullable": true,
                "metadata": {}
            },
            {
                "name": "secondName",
                "type": "string",
                "nullable": true,
                "metadata": {}
            },
            {
                "name": "listOfBooks",
                "type": "long",
                "nullable": true,
                "metadata": {}
            },
            {
                "name": "price",
                "type": "long",
                "nullable": true,
                "metadata": {}
            },
            {
                "name": "studentId",
                "type": "long",
                "nullable": true,
                "metadata": {}
            },
            {
                "name": "offerPrice",
                "type": {
                    "type": "struct",
                    "fields": [
                        {
                            "name": "quantityOfBooks",
                            "type": "long",
                            "nullable": true,
                            "metadata": {}
                        },
                        {
                            "name": "class",
                            "type": "long",
                            "nullable": true,
                            "metadata": {}
                        }
                    ]
                },
                "nullable": true,
                "metadata": {}
            },
            {
                "name": "bookStore",
                "type": "string",
                "nullable": true,
                "metadata": {}
            },
            {
                "name": "reviewCount",
                "type": "long",
                "nullable": true,
                "metadata": {}
            },
            {
                "name": "author",
                "type": "string",
                "nullable": true,
                "metadata": {}
            },
            {
                "name": "title",
                "type": "string",
                "nullable": true,
                "metadata": {}
            },
            {
                "name": "studentRollNo",
                "type": "string",
                "nullable": true,
                "metadata": {}
            }
        ]
    },
    "nullable": true,
    "metadata": {}
}

Note: I need to do this in Java. I'm aware that in Spark I can load a DataFrame and call prettyJson on its schema to get the Spark metadata for the Glue type. However, I need to do this conversion directly in Java code, without going through a DataFrame. What is the best approach for this conversion?
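To make the desired conversion concrete, here is a minimal sketch in plain Java of a recursive parser that turns a Glue type string into the Spark JSON shape shown above. The class and method names (GlueTypeJson, toSparkJson, toSparkField) are hypothetical and not part of Glue or Spark. The sketch assumes every field is nullable with empty metadata, maps only a few Glue primitive names (bigint, int, smallint, tinyint) to their Spark names and passes all others through unchanged, and does not escape special characters in field names. If Spark is already on the classpath, DataType.fromDDL followed by json() may be an alternative, but that has not been verified against Glue type strings here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: converts a Glue/Hive-style type string to Spark schema JSON.
final class GlueTypeJson {
    // Glue primitive names whose Spark JSON names differ; others pass through as-is.
    private static final Map<String, String> PRIMITIVES = Map.of(
            "bigint", "long", "int", "integer", "smallint", "short", "tinyint", "byte");

    private final String src;
    private int pos = 0;

    private GlueTypeJson(String src) { this.src = src; }

    // Returns the Spark JSON for a type, e.g. "bigint" becomes "\"long\"".
    static String toSparkJson(String glueType) {
        GlueTypeJson parser = new GlueTypeJson(glueType.trim());
        String json = parser.parseType();
        if (parser.pos != parser.src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
        }
        return json;
    }

    // Wraps a type in a field entry; assumes nullable=true and empty metadata.
    static String toSparkField(String name, String glueType) {
        return field(name, toSparkJson(glueType));
    }

    private static String field(String name, String typeJson) {
        return "{\"name\":\"" + name + "\",\"type\":" + typeJson
                + ",\"nullable\":true,\"metadata\":{}}";
    }

    private String parseType() {
        String name = readUntil("<>,:(").toLowerCase();
        switch (name) {
            case "struct": {
                expect('<');
                List<String> fields = new ArrayList<>();
                do {
                    String fieldName = readUntil(":");
                    expect(':');
                    fields.add(field(fieldName, parseType()));
                } while (tryConsume(','));
                expect('>');
                return "{\"type\":\"struct\",\"fields\":[" + String.join(",", fields) + "]}";
            }
            case "array": {
                expect('<');
                String element = parseType();
                expect('>');
                return "{\"type\":\"array\",\"elementType\":" + element
                        + ",\"containsNull\":true}";
            }
            case "map": {
                expect('<');
                String key = parseType();
                expect(',');
                String value = parseType();
                expect('>');
                return "{\"type\":\"map\",\"keyType\":" + key + ",\"valueType\":" + value
                        + ",\"valueContainsNull\":true}";
            }
            default:
                String args = "";
                if (peek() == '(') { // keep parameters such as decimal(10,2) verbatim
                    int close = src.indexOf(')', pos);
                    if (close < 0) throw new IllegalArgumentException("Missing ')' in: " + src);
                    args = src.substring(pos, close + 1);
                    pos = close + 1;
                }
                return "\"" + PRIMITIVES.getOrDefault(name, name) + args + "\"";
        }
    }

    // Reads characters up to (not including) the next stop character.
    private String readUntil(String stops) {
        int start = pos;
        while (pos < src.length() && stops.indexOf(src.charAt(pos)) < 0) pos++;
        return src.substring(start, pos).trim();
    }

    private char peek() { return pos < src.length() ? src.charAt(pos) : '\0'; }

    private boolean tryConsume(char c) {
        if (peek() != c) return false;
        pos++;
        return true;
    }

    private void expect(char c) {
        if (!tryConsume(c)) {
            throw new IllegalArgumentException("Expected '" + c + "' at position " + pos);
        }
    }
}
```

For example, GlueTypeJson.toSparkField("columnName", "struct<quantity:bigint,unit:bigint>") would produce a compact (not pretty-printed) version of the first JSON document above. Nested maps, arrays, and structs are handled by the recursion in parseType.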
