
I tried the solution described in "Rename written CSV file Spark", but I get the following error: "java.lang.IllegalArgumentException: Path must be absolute". How can I fix it? The answer can be in Scala or Python. Thanks :)

import org.apache.hadoop.fs._
val fs = FileSystem.get(sc.hadoopConfiguration)

var table_name = dbutils.widgets.get("table_name")

val filePath = "mnt/datalake/" + table_name + "/"

print("file path: " + filePath)

val fileName = fs.globStatus(new Path(filePath+"part*"))(0).getPath.getName
print("file name: " + fileName)

fs.rename(new Path(filePath+fileName), new Path(filePath+"file.csv"))

Outputs:

file path: mnt/datalake/MyTable/
file name: part-00000-tid-9118XXX-c000.csv

Error:

java.lang.IllegalArgumentException: Path must be absolute: mnt/datalake/MyTable/part-00000-tid-9118XXXXc000.csv
user12525899

2 Answers


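Try this. The key change is the fully qualified path that starts with the dbfs:/ scheme:

import org.apache.hadoop.fs.{FileSystem, Path}

// Get the Hadoop FileSystem configured for the Spark context
val fs = FileSystem.get(sc.hadoopConfiguration)
// Absolute path: note the dbfs:/ prefix
val filePath = "dbfs:/FileStore/tables/part_00000-6a99e/"
// Match the part file inside the directory and take the first hit
val fileName = fs.globStatus(new Path(filePath + "part*"))(0).getPath.getName
fs.rename(new Path(filePath + fileName), new Path(filePath + "file.csv"))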
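
Since the question also allows Python, the same rename can be sketched with dbutils.fs in a Python notebook. This is a minimal sketch, not a definitive implementation: the function name rename_part_file and its parameters are hypothetical, and dbutils is passed in as an argument so the logic can also be exercised outside Databricks with a stand-in object. It relies only on dbutils.fs.ls (whose entries expose name and path) and dbutils.fs.mv.

```python
import fnmatch


def rename_part_file(dbutils, dir_path, new_name, pattern="part*"):
    """Rename the first file in dir_path matching pattern to new_name.

    Hypothetical helper: dir_path is assumed to be a fully qualified
    path such as "dbfs:/mnt/datalake/MyTable/".
    """
    # Make sure the directory path ends with a slash before joining names
    if not dir_path.endswith("/"):
        dir_path += "/"

    # List the directory and keep only entries whose name matches the pattern
    matches = [
        entry for entry in dbutils.fs.ls(dir_path)
        if fnmatch.fnmatchcase(entry.name, pattern)
    ]
    if not matches:
        raise FileNotFoundError(f"no file matching {pattern!r} in {dir_path}")

    # Move (rename) the first match to the requested file name
    target = dir_path + new_name
    dbutils.fs.mv(matches[0].path, target)
    return target
```

In a notebook, the call would look like rename_part_file(dbutils, "dbfs:/mnt/datalake/MyTable/", "file.csv"). Taking only the first match mirrors the (0) index in the Scala code, so it assumes the output directory holds a single part file.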


Mahesh Gupta
  • Wow! It worked perfectly! Thank you :D This is how I customized it:

    import org.apache.hadoop.fs._
    import org.apache.hadoop.fs.{FileSystem, Path}
    val fs = FileSystem.get(sc.hadoopConfiguration)
    val filePath = "dbfs:/mnt/datalake/MyTable/"
    val fileName = fs.globStatus(new Path(filePath+"part*"))(0).getPath.getName
    print("file path: " + filePath)
    print("file name: " + fileName)
    fs.rename(new Path(filePath+fileName), new Path(filePath+"file.csv"))

    – user12525899 Dec 12 '19 at 16:45

In Databricks, DBFS is the Databricks File System, and an absolute path (that is, the full path) must begin with dbfs:/. The path in the question, mnt/datalake/MyTable/, has no such prefix, so it is treated as relative and rejected. DBFS can be compared to the root directory in a Linux file system, where every full path starts from the root directory /.
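
To make the prefix rule concrete, here is a small Python sketch that turns a path like the one in the question into a fully qualified one. The helper name to_dbfs_uri is hypothetical and covers only the two cases discussed here: a path that already carries the dbfs:/ prefix, and one that lacks it.

```python
def to_dbfs_uri(path):
    """Return path as a fully qualified dbfs:/ path (hypothetical helper)."""
    # Already fully qualified: leave it untouched
    if path.startswith("dbfs:/"):
        return path
    # Otherwise drop any leading slashes and prepend the dbfs:/ scheme
    return "dbfs:/" + path.lstrip("/")


print(to_dbfs_uri("mnt/datalake/MyTable/"))  # dbfs:/mnt/datalake/MyTable/
```

Applied to the question's filePath, this yields the same dbfs:/mnt/datalake/MyTable/ value that made the rename succeed.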