Below is a sample of my JSON schema. I want my PySpark DataFrame to have netWorthOfTheCompany as a column with float as its data type. Currently, when I read the JSON schema file into a DataFrame and call print(df.dtypes), the type is reported as string, because the value "float" in the file is itself treated as a string. I don't want to write a custom schema by hand with every StructType and StructField, because the JSON schema is too long.

{
    "turnover": {
        "netWorthOfTheCompany": "float", 
        "totalTurnover": "float"
    }
}

This is the line of code where I read the JSON schema and save it in a DataFrame:

df = spark.read.option("multiline", "true").json(filepath)

I want the value of each key in the JSON schema to be interpreted as a data type, not as a string, and mapped to the corresponding PySpark data type. For example, netWorthOfTheCompany should end up with type float.
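One possible direction, sketched here under assumptions rather than as a confirmed solution: load the schema file with Python's json module and turn it into a DDL-formatted schema string, which spark.read.schema(...) accepts in place of a hand-written StructType. The names TYPE_MAP, field_type, schema_to_ddl and data_path are hypothetical, the type-name mapping is a guess at what the schema file might contain, and field names are assumed to be plain identifiers that need no quoting.

```python
import json

# Assumed mapping from the type names in the schema file to Spark SQL DDL types.
# Extend it with whatever names the real schema file uses.
TYPE_MAP = {
    "float": "FLOAT",
    "double": "DOUBLE",
    "int": "INT",
    "long": "BIGINT",
    "string": "STRING",
    "boolean": "BOOLEAN",
}


def field_type(value):
    """Return the DDL type for one schema value: a type name or a nested dict."""
    if isinstance(value, dict):
        # A nested object becomes a STRUCT with one entry per key.
        inner = ", ".join(f"{name}: {field_type(v)}" for name, v in value.items())
        return f"STRUCT<{inner}>"
    return TYPE_MAP[value.lower()]


def schema_to_ddl(schema):
    """Build a DDL schema string from the top-level schema dict."""
    return ", ".join(f"{name} {field_type(v)}" for name, v in schema.items())


# The sample schema from the question, inlined so the sketch is self-contained.
# With a real file: schema = json.load(open(filepath))
schema = json.loads("""
{
    "turnover": {
        "netWorthOfTheCompany": "float",
        "totalTurnover": "float"
    }
}
""")

ddl = schema_to_ddl(schema)
print(ddl)

# With an active SparkSession, the string could then be applied to the data file
# (data_path is a hypothetical path to the actual data, not the schema file):
# df = spark.read.schema(ddl).option("multiline", "true").json(data_path)
```

The schema file is parsed on the driver with plain Python here, so Spark never sees the literal "float" values as data; only the derived schema is handed to the reader.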
