I am reading a DynamoDB table with Glue. Because the schema is dynamic, some columns may be missing. Adding them works fine with the following code, but I am not sure how to make the function dynamic if I need to add multiple columns.
# add missing columns if not available
def AddCustRegName(r):
    r["customerRegistrationName"] = ""  # add column with empty string
    return r

if addCustRegName:
    case_df_final = Map.apply(frame=case_df_final, f=AddCustRegName)
Any suggestions?
The following attempt fails with the error below:
# add missing columns if not available
def AddColumn(r, col):
    r[col] = ""  # add column with empty string
    return r

case_df_final = Map.apply(frame=case_df_final, f=AddColumn(case_df_final, 'accessoryTaxIncluded'))
case_df_final.toDF().printSchema()
Fail to execute line 6: case_df_final = Map.apply(frame=case_df_final, f=AddColumn(case_df_final ,'accessoryTaxIncluded'))
Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-4928209310219195923.py", line 375, in
    exec(code, _zcUserQueryNameSpace)
  File "", line 6, in
  File "", line 3, in AddColumn
TypeError: 'DynamicFrame' object does not support item assignment
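The traceback suggests that `AddColumn(case_df_final, 'accessoryTaxIncluded')` is evaluated immediately, with the whole DynamicFrame passed in as `r`, rather than being handed to `Map.apply` as a function to call per record. One possible way around this is a small factory that returns a one-argument function. The following is only a minimal sketch: `make_add_columns` and the `caseId` field are hypothetical names, it assumes the records that `Map.apply` passes to `f` behave like dicts, and plain dicts stand in for Glue records so it runs without `awsglue`.

```python
def make_add_columns(columns, default=""):
    """Return a per-record function that fills in any missing columns.

    Hypothetical helper, not part of the Glue API.
    """
    def add_columns(record):
        for col in columns:
            # Only add the column when the record does not already have it.
            if col not in record:
                record[col] = default
        return record
    return add_columns


# Plain dicts stand in for Glue records here (assumption: records are dict-like).
add_missing = make_add_columns(["customerRegistrationName", "accessoryTaxIncluded"])
sample = add_missing({"caseId": "1", "accessoryTaxIncluded": "Y"})
print(sample)

# In the Glue job, the callable itself would be passed, not the result of calling it:
# case_df_final = Map.apply(frame=case_df_final, f=add_missing)
```

Because the column list is captured in the closure, the same helper could cover one column or many without writing a separate function per column.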