
We have a Maven test framework project, written with ScalaTest, in IntelliJ.

A test case uses Databricks Connect to read from and write to DBFS.

If we right-click the test case in IntelliJ and run it, it succeeds.

However, if we run the same test case via 'mvn test', it fails with:

org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "dbfs"
  at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3390)
  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3411)
  at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:158)
  at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3474)
  at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3442)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:524)
  at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
  at org.apache.spark.sql.execution.streaming.FileStreamSink$.hasMetadata(FileStreamSink.scala:46)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:366)
  at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:297)
  ...

How can we successfully run a test that depends on Databricks Connect via Maven?
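
To narrow this down, here is a minimal sketch that could be run in both setups to compare what is on the classpath. It assumes (not confirmed) that the failure comes from the Databricks Connect jars being visible to the IntelliJ run but not to the Maven run. `ClasspathCheck` and `entriesContaining` are hypothetical names, not part of our project, and the "pyspark" fragment is only a guess based on the jar directory mentioned in the comments.

```scala
import java.io.File

// Hypothetical helper: lists classpath entries whose path contains a given
// fragment, so the output of the IntelliJ run and the Maven run can be compared.
object ClasspathCheck {
  def entriesContaining(classpath: String, fragment: String): Seq[String] =
    classpath.split(File.pathSeparator).toSeq.filter(_.contains(fragment))

  def main(args: Array[String]): Unit = {
    val classpath = System.getProperty("java.class.path", "")
    // "pyspark" is an assumption taken from the jar directory we add in IntelliJ
    val hits = entriesContaining(classpath, "pyspark")
    if (hits.isEmpty) println("No classpath entries containing 'pyspark'")
    else hits.foreach(println)
  }
}
```

Depending on how the test plugin launches the JVM, `java.class.path` may not list every jar under Maven; `mvn dependency:build-classpath` prints the dependency classpath that Maven itself resolves, which can be compared against the IntelliJ output.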

  • How do you specify the db-connect jars in Maven? – Alex Ott Nov 02 '22 at 18:49
  • @AlexOtt under project structure, we add Jar directory: Python37\Lib\site-packages\pyspark\jars (which contains many jars we probably don't use) –  Nov 03 '22 at 15:43
