
I have a SQL view stored in Databricks as a table and all of the columns are capitalised. When I load the table in a Databricks job using spark.table(<<table_name>>), all of the columns are converted to lowercase which causes my code to crash. However, when I load the table the same way in a simple notebook, the column names remain capitalised and are NOT turned to lowercase.
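For reference, a minimal check that surfaces the difference is printing the column list in both environments (the table name here is a placeholder):

df = spark.table("my_schema.my_view")  # placeholder table name
print(df.columns)  # notebook: e.g. ['Name', 'Age']; job: e.g. ['name', 'age']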

Has anyone encountered this issue before? It is strange because it is only happening in the job.

– stosxri

2 Answers


Solved this by changing the Runtime Version of the cluster used in the Databricks Job. Seems like that specific Runtime Version was automatically converting all column names to lowercase.
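If changing the runtime is not an option, a rough workaround is to re-apply the expected casing after loading, since PySpark's toDF() can rename all columns at once. A minimal sketch (the table and expected column names below are hypothetical):

df = spark.table("my_schema.my_view")  # placeholder table name
expected = ["Name", "Age"]  # hypothetical expected capitalised names
by_lower = {c.lower(): c for c in expected}
df = df.toDF(*[by_lower.get(c.lower(), c) for c in df.columns])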

– stosxri

Make sure to check the whole process once again. I reproduced this in our environment and did not get any lowercase columns in my result.

I created a SQL view named names_view in Databricks for my repro, and this is my notebook run, named forview.

[Screenshot: notebook run forview, column names remain capitalised]

Databricks Job run:

[Screenshot: Databricks job run, column names remain capitalised]

I suggest loading the SQL view with spark.sql() and checking the result, like below.

view_df = spark.sql("select * from names_view")

[Screenshot: result of the spark.sql() query]
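It may also be worth printing the schema of the returned DataFrame and comparing the spark.sql.caseSensitive setting between the notebook cluster and the job cluster, for example:

view_df.printSchema()
print(spark.conf.get("spark.sql.caseSensitive"))  # 'false' by default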

If that doesn't work, try another cluster or, if possible, another Databricks workspace and check again.

If the issue still persists, please reach out to Azure Support or raise a GitHub issue.

– Rakesh Govindula
  • Thank you Rakesh! We actually managed to find the issue and it was the Cluster Runtime we were using. For some reason, the Cluster Runtime used on the job was automatically converting all column names to lowercase. We changed the runtime and problem solved! – stosxri Jul 09 '22 at 10:26