When trying to navigate in Databricks to a specific folder/file combination using os, two identical CSVs are randomly recognized as either a file or a directory. Checking with os.path.isfile() returns the following:

[screenshot of the os.path.isfile() output]

So far I have tried reloading the data and checking whether there is a problem with the specific CSVs, but I can't find a pattern as to which file the error occurs with. I tried following the advice listed here, but the error seems to be Databricks-specific.
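To make the inconsistency easier to pin down, a small helper can report how the local filesystem classifies each path at a given moment. This is only a minimal sketch: `describe_path` is a hypothetical name, not part of Databricks or of the original post, and it uses nothing beyond the standard `os.path` functions.

```python
import os


def describe_path(path):
    """Classify a path as the local filesystem currently reports it."""
    if os.path.isfile(path):
        return "file"
    if os.path.isdir(path):
        return "directory"
    if os.path.lexists(path):
        # Exists, but is neither a regular file nor a directory
        # (for example a broken symlink or a special file).
        return "other"
    return "missing"
```

Calling it on both CSV paths several times in a row would show whether the classification changes between calls or depends on how the path is written.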

David Buck
heck1

1 Answer

One solution I found was to read the data using Spark:

  df = sqlContext.read.format('com.databricks.spark.csv') \
          .options(header='true', inferSchema='true', sep=';')\
          .load("/mnt/.../.../.../data_2.csv").toPandas()

This reads the data from the CSV fine, yet os.path.isfile() still doesn't recognize the path as a file.
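One thing that may be worth ruling out is the path form. Spark resolves a path like "/mnt/..." against DBFS, while Python's os module sees DBFS only through the local /dbfs mount. Whether this explains the random behaviour above is an assumption, not something confirmed in the post. The sketch below translates a Spark-style path into the form os expects; `to_local_dbfs_path` is a hypothetical helper name.

```python
import os


def to_local_dbfs_path(spark_path):
    """Map a Spark-style DBFS path to its local /dbfs equivalent.

    Assumes DBFS is exposed to local file APIs under /dbfs.
    """
    if spark_path.startswith("dbfs:"):
        # Drop the URI scheme, e.g. "dbfs:/mnt/x" -> "/mnt/x".
        spark_path = spark_path[len("dbfs:"):]
    if spark_path.startswith("/dbfs/"):
        # Already in local form; leave it unchanged.
        return spark_path
    return "/dbfs/" + spark_path.lstrip("/")
```

With a hypothetical path, `os.path.isfile(to_local_dbfs_path("/mnt/example/data_2.csv"))` would then check "/dbfs/mnt/example/data_2.csv" instead of a path on the driver's own disk.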

heck1