Recently, Databricks launched Databricks Connect, which allows you to write jobs using Spark native APIs and have them execute remotely on an Azure Databricks cluster instead of in the local Spark session.
It works fine except when I try to access files in Azure Data Lake Storage Gen2. When I execute this:
spark.read.json("abfss://...").count()
I get this error:
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.SecureAzureBlobFileSystem not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
Does anybody know how to fix this?
Further information:
- databricks-connect version: 5.3.1