
How to get the creation date of a file in Scala is explained here (the same link is repeated in the code comment below).

My question is how I can get this information into a variable that can be returned as a column of a DataFrame. None of the conversions I would expect to be allowed actually are.

My code (using Scala 2.11):

import org.apache.spark.sql.functions._
import java.nio.file.{Files, Paths} // Needed for file time
import java.nio.file.attribute.BasicFileAttributes
import java.util.Date

def GetFileTimeFunc(pathStr: String): String = {
  // From: https://stackoverflow.com/questions/47453193/how-to-get-creation-date-of-a-file-using-scala
  val FileTime = Files.readAttributes(Paths.get(pathStr), classOf[BasicFileAttributes]).creationTime
  val JavaDate = Date.from(FileTime.toInstant)
  JavaDate.toString
}
@transient val GetFileTime = udf(GetFileTimeFunc _)

val filePath = "dbfs:/mnt/myData/" // location of data
val file_df = dbutils.fs.ls(filePath).toDF // Output columns are $"path", $"name", and $"size"
  .withColumn("FileTimeCreated", GetFileTime($"path"))
display(file_df)//.select("name", "size"))

Output:

SparkException: Failed to execute user defined function($anonfun$2: (string) => string)

For some reason, Instant is not allowed as a column type, so I cannot use it as a return type. The same holds for FileTime, JavaDate, etc.
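
For reference, one return type that Spark UDFs do accept is java.sql.Timestamp, which maps to TimestampType. Below is a minimal sketch of the UDF rewritten that way (the helper names are placeholders, and this alone does not address the dbfs:/ path problem):

import java.nio.file.{Files, Paths}
import java.nio.file.attribute.BasicFileAttributes
import java.sql.Timestamp
import org.apache.spark.sql.functions.udf

// Return the creation time as a java.sql.Timestamp, which Spark maps to TimestampType.
def getFileTimestampFunc(pathStr: String): Timestamp = {
  val fileTime = Files.readAttributes(Paths.get(pathStr), classOf[BasicFileAttributes]).creationTime
  new Timestamp(fileTime.toMillis)
}
val getFileTimestamp = udf(getFileTimestampFunc _)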

M.S.Visser
  • Do you use Java NIO? – Duelist May 18 '20 at 13:46
  • Hi Duelist, yes I do. I updated the code sample. – M.S.Visser May 19 '20 at 15:08
  • I think you get the exception because NIO doesn't know how to work with the Databricks File System – Duelist May 19 '20 at 15:21
  • Odd. NIO is allowed to read /mnt/, but not dbfs:/mnt/, nor /mnt/myData. Probably it reads a different /mnt/ than the one on DBFS. So I think you are right about NIO not being allowed to read from DBFS. Is there any way to read file attributes from the Databricks filesystem? – M.S.Visser May 19 '20 at 23:17
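
One workaround that may apply here, assuming the cluster exposes DBFS through the local FUSE mount at /dbfs (and that this mount is visible to the workers running the UDF): strip the dbfs: scheme from the path and read the attributes through /dbfs/... instead. A sketch under those assumptions, with placeholder names:

import java.nio.file.{Files, Paths}
import java.nio.file.attribute.BasicFileAttributes
import java.sql.Timestamp
import org.apache.spark.sql.functions.udf

// Assumes "dbfs:/mnt/myData/x" is also readable as "/dbfs/mnt/myData/x" via the FUSE mount.
def getDbfsFileTime(pathStr: String): Timestamp = {
  val localPath = pathStr.replaceFirst("^dbfs:", "/dbfs")
  val fileTime = Files.readAttributes(Paths.get(localPath), classOf[BasicFileAttributes]).creationTime
  new Timestamp(fileTime.toMillis)
}
val getDbfsFileTimeUdf = udf(getDbfsFileTime _)

val file_df = dbutils.fs.ls("dbfs:/mnt/myData/").toDF
  .withColumn("FileTimeCreated", getDbfsFileTimeUdf($"path"))

Whether creationTime is meaningful for an object-store-backed mount is another assumption; the modification time may be the only reliable attribute there.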

0 Answers