df.withColumn("event", ..)
How can I add a new column, event, to a DataFrame, where the value is the result of generate_header? How can a Row be used as the column value?
Maybe the function needs to be converted to a UDF?
def generate_header(df_row):
    header = {
        "id": 1,
        ...
    }
    return EntityEvent(header, df_row)

class EntityEvent:
    def __init__(self, _header, _payload):
        self.header = _header
        self.payload = _payload
Suppose we have something like this:
+---------------+--------------------+
|book_id        |Author              |
+---------------+--------------------+
|865731         |{name: 'A', }       |
+---------------+--------------------+
and we want to get this:
+---------------+--------------------+------------------------------------------------+
|book_id        |Author              |event                                           |
+---------------+--------------------+------------------------------------------------+
|865731         |{name: 'A', }       |{header: { id: '865731'}, payload: {name: 'A'}} |
+---------------+--------------------+------------------------------------------------+
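
One possible way to get there, as a minimal sketch rather than a definitive answer: Spark cannot store an arbitrary Python object such as EntityEvent in a column, so the event has to be expressed as a struct. The sketch assumes book_id and Author are the column names from the tables above, that Author is a struct with a single name field, and that the header only carries an id (the rest of the header is elided in the question, so it is left out here). The helper names build_event, add_event_column and add_event_column_udf are hypothetical. The first function uses only built-in column functions; the second wraps plain Python logic in a UDF with an explicit struct return type, which is usually slower and only worth it when the header logic cannot be written with column expressions.

```python
def build_event(book_id, name):
    """Plain-Python event logic (hypothetical stand-in for generate_header).

    Returns a nested dict whose keys match the struct fields declared in
    add_event_column_udf below.
    """
    return {"header": {"id": str(book_id)}, "payload": {"name": name}}


def add_event_column(df):
    """Add a struct column "event" without a UDF (requires pyspark)."""
    # Imported lazily so build_event can be used and tested without Spark.
    from pyspark.sql import functions as F

    return df.withColumn(
        "event",
        F.struct(
            # header: {id: <book_id as string>}
            F.struct(F.col("book_id").cast("string").alias("id")).alias("header"),
            # payload: the existing Author struct, reused as-is
            F.col("Author").alias("payload"),
        ),
    )


def add_event_column_udf(df):
    """Same result via a UDF, for header logic that needs plain Python."""
    from pyspark.sql import functions as F
    from pyspark.sql import types as T

    # The UDF's return type must be declared; it mirrors build_event's dict.
    # Assumption: Author has exactly one string field called "name".
    event_type = T.StructType([
        T.StructField("header", T.StructType([
            T.StructField("id", T.StringType()),
        ])),
        T.StructField("payload", T.StructType([
            T.StructField("name", T.StringType()),
        ])),
    ])
    event_udf = F.udf(build_event, event_type)

    # Pass only the values the function needs, not the whole row.
    return df.withColumn(
        "event", event_udf(F.col("book_id"), F.col("Author.name"))
    )
```

With a DataFrame df shaped like the first table, add_event_column(df).show(truncate=False) should display the extra event column, and add_event_column(df).printSchema() shows the nested header and payload fields.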