
I am using Databricks Unity Catalog, and I have a requirement to upload a CSV file, process it, and load it into a final table. However, when uploading the file in Databricks, it converts NULL data to the string 'NULL', which is causing an issue. Do you have any ideas on how I can resolve this problem?

1 Answer


The CSV format by definition has no way to represent null values; everything is treated as a string. If your CSV uses some placeholder value, you can pass the nullValue parameter when reading the data to specify which string should be treated as null (see doc):

df = spark.read.csv(path, nullValue="null")

or specify it as option:

df = spark.read.format("csv") \
  .option("nullValue", "null") \
  .load(path)

Alex Ott
  • Thanks @Alex, but I am uploading the data and creating a Delta table out of it; I am not reading directly from the file. – SK ASIF ALI Jul 03 '23 at 15:34