I use Azure Synapse notebooks (PySpark) to process data from Blob Storage and write it into a SQL Server database. I'm using the snippet below to write to the SQL table:
df.write \
    .format("jdbc") \
    .option("url", <...>) \
    .option("dbtable", <...>) \
    .option("user", <...>) \
    .option("password", <...>) \
    .mode("append") \
    .save()
This works fine, but the dataframe's columns have to match the SQL table's columns exactly before the write succeeds. I'm trying (and failing) to find a way to pass in a mapping dictionary instead of changing all the column names/ordering/etc. to match the SQL table.
For example, let's say we have a dataframe and sql table as below:
dataframe: name | job | birthday
SQL table: Full Name | Occupation | Date of Birth
I would like to do something like the following:
df_sql_mapping = {
    'name': 'Full Name',
    'job': 'Occupation',
    'birthday': 'Date of Birth'
}
df.write \
    ... \
    .mapping(df_sql_mapping) \
    .save()
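The closest workaround I can come up with is renaming the columns from the dict just before the write (a sketch using select with aliases, which should work on any Spark version; I believe Spark 3.4+ also has df.withColumnsRenamed(df_sql_mapping) as a one-liner):

from pyspark.sql import functions as F

# Rename the dataframe columns to the SQL table's names via the dict.
# Note: this select also drops any columns not listed in the mapping.
df_renamed = df.select(
    [F.col(src).alias(dst) for src, dst in df_sql_mapping.items()]
)

df_renamed.write \
    ... \
    .mode("append") \
    .save()

But I'd prefer something built into the writer itself, if it exists.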
Any ideas? There's very little documentation available on the JDBC writer's options.