I would like to read events from Event Hub using Databricks. The events are in JSON format, but they can have different schemas. This matters because the solutions I have found pass a schema to the from_json(jsonStr, schema) function, which I cannot do in my use case. When I use
.withColumn('Value', col('value').cast(StringType()))
the resulting dataframe returns JSON output with backslashes: "{\"time\": 1432826855000,\"host\":......
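A minimal sketch in plain Python (not Spark) of one way such backslashes can arise, assuming the cause is that an already-serialized JSON string gets serialized a second time as a string value. The payload in `raw_event` and its `host` value are hypothetical and only stand in for one event body:

```python
import json

# Hypothetical event payload, standing in for one message body (assumption).
raw_event = '{"time": 1432826855000, "host": "example-host"}'

# Serializing the already-serialized string again escapes every inner quote,
# which is where the backslashes come from in this sketch.
double_encoded = json.dumps(raw_event)

# Decoding once recovers the original string; decoding twice yields a dict.
decoded_once = json.loads(double_encoded)
event = json.loads(decoded_once)

print(double_encoded)  # the escaped form, with \" around each key and string value
print(event["host"])
```

If this assumption holds, the string stored in the column is unchanged and the escaping only appears when the value is rendered or written out as JSON again.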
I found a solution in "How to prevent spark sql with kafka from adding backslash to JSON string in dataframe", but in the Delta Live Tables framework streaming tables are created by returning a dataframe, so I can't use that solution.
Should I use non-PySpark functions in the ETL process, such as those in "How to remove backslash from decoded JSON string?"? Would that be efficient when streaming from Event Hub to bronze?