The question is:

Complete the writeToBronze function to perform the following tasks:
- Write the stream from gamingEventDF (the stream defined above) to a bronze Delta table in the path defined by outputPathBronze.
- Convert the (nested) input column client_event_time to a date format and rename the column to eventDate.
- Filter out records with a null value in the eventDate column.
- Make sure you provide a checkpoint directory that is unique to this stream.

Code:
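from pyspark.sql.functions import col, to_date

def writeToBronze(sourceDataframe, bronzePath, streamName):
    # Derive eventDate from the nested field, drop rows where it is null,
    # and append the result to the bronze Delta table.
    (sourceDataframe
        .withColumn("eventDate",
                    to_date(col("eventParams.client_event_time"), "yyyy-MM-dd"))
        .filter(col("eventDate").isNotNull())
        .writeStream
        .format("delta")
        .option("checkpointLocation", f"{bronzePath}_checkpoint")
        .queryName(streamName)
        .outputMode("append")
        .start(bronzePath)  # write to the path passed in, not the global outputPathBronze
    )

writeToBronze(gamingEventDF, outputPathBronze, "bronze_stream")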
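
On the checkpoint requirement: the path `f"{bronzePath}_checkpoint"` is unique only as long as a single stream writes to that bronze path. One possible way to make it unique per stream is to include the stream name as well. The sketch below is plain Python with no Spark dependency; `checkpoint_path` is a hypothetical helper of my own, not part of the exercise or of any Spark API, and the `/tmp/gaming/bronze` path is only an example value.

```python
def checkpoint_path(bronze_path: str, stream_name: str) -> str:
    """Build a checkpoint directory keyed on both the table path and the stream name.

    Hypothetical helper for illustration only; the directory layout is an assumption.
    """
    if not stream_name:
        raise ValueError("stream_name must be non-empty")
    # Drop any trailing slash so the suffix attaches to the directory name itself.
    base = bronze_path.rstrip("/")
    return f"{base}_checkpoint/{stream_name}"


# Two streams writing to the same table path get different checkpoint directories.
print(checkpoint_path("/tmp/gaming/bronze", "bronze_stream"))
print(checkpoint_path("/tmp/gaming/bronze", "bronze_stream_backfill"))
```

If you went this way, the result would be passed to `.option("checkpointLocation", ...)` in place of the f-string, so that restarting a query with the same name resumes from its own checkpoint.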