I'm working on ingesting raw .csv files stored in an ADLS storage account into Bronze layer Delta live tables in Databricks. The challenge arises from the column names in the CSV files, which include spaces and backticks ( ` ). Unfortunately, since Databricks uses backticks as the native escape identifier, I have been unable to use the usual methods to escape from them.
Column Names:
date
Reporting Status
`MSA
`BU ID
Here's a snippet of code I'm using to ingest the columns from the raw file
CREATE OR REFRESH STREAMING LIVE TABLE t1
LOCATION "/mnt/adls/bronze/"
AS
SELECT
date
,`Reporting Status` AS Reporting_Status
,"\`MSA" AS MSA
,"\`BU ID" AS BU_ID
FROM cloud_files("/mnt/adls/raw/", "csv" ,
map("header", "true"
, "escape", "\""
, "quote", "\""
));
The end result shows the raw column name as strings in every row of the columns: Results
When using the backticks instead of the double quotes, I am unable to escape from the backtick, and my workflow will error out.
Ex
`\`MSA`
Is there a way in databricks SQL to accommodate for backticks in the column names?