I have a streaming DataFrame generated in Spark that I want to write out with writeStream
and also save to a database.
I have the following code:
output = (
    spark_event_df
    .writeStream
    .outputMode('update')
    .foreach(writerClass(**job_config_data))
    .trigger(processingTime="2 seconds")
    .start()
)
output.awaitTermination()
Because I am using foreach(), writerClass receives a Row, and I cannot convert it into a dictionary in Python.
How can I get a Python data type (preferably a dictionary) from the Row in my writerClass, so that I can manipulate it as needed and save it to the database?
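
For illustration, here is a minimal sketch of the kind of writer described above. It assumes the object passed to foreach() exposes open/process/close methods and that each record is a PySpark Row, which provides asDict(). The class name DictWriter, its config argument, and the in-memory saved list are hypothetical placeholders, not part of the original code.

```python
class DictWriter:
    """Hypothetical foreach writer that turns each Row into a dict before saving."""

    def __init__(self, **config):
        self.config = config  # hypothetical settings, e.g. database credentials
        self.saved = []       # placeholder for a real database table

    def open(self, partition_id, epoch_id):
        # Open the database connection here.
        # Returning True tells Spark to process the rows of this partition.
        return True

    def process(self, row):
        # Row.asDict() returns a plain dict keyed by column name;
        # recursive=True also converts nested Row values to dicts.
        record = row.asDict(recursive=True)
        # Placeholder: in a real job this runs on an executor, so replace
        # the append with an INSERT/UPSERT against the database.
        self.saved.append(record)

    def close(self, error):
        # Close the connection here; `error` is None when no exception occurred.
        pass
```

With something like this, the call in the query would become .foreach(DictWriter(**job_config_data)). Because Spark serializes the writer and runs it on executors, the saved list is only a stand-in to show where the dictionary becomes available.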