I had a CSV file stored in an Azure Data Lake Storage account, which I imported into Databricks by mounting the Data Lake account on my Databricks cluster. After preprocessing, I want to store the CSV back in the same Data Lake Gen2 (Blob Storage) account. Any leads and help on the issue are appreciated. Thanks.
2 Answers
Just write the file to the same mounted location. See the example notebook here: https://docs.databricks.com/spark/latest/data-sources/azure/azure-datalake-gen2.html#example-notebook
df.write.json("abfss://<file_system>@<storage-account-name>.dfs.core.windows.net/iot_devices.json")
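Since the question mounts the account, the same write also works through the mount point rather than the abfss URI; a minimal PySpark sketch, assuming a hypothetical mount at /mnt/datalake and a DataFrame df:

# Write the preprocessed DataFrame as CSV through the mount point.
# /mnt/datalake is a hypothetical mount name; substitute your own.
df.write.option("header", "true").mode("overwrite").csv("/mnt/datalake/preprocessed_data")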

– silent
- Thanks a lot, got it! But it is saving the CSV file under a random name; any help to save it under my own defined filename? Thanks for the help! – inr Sep 27 '19 at 11:03
- We can't write a file with a specific name when writing into HDFS. Use coalesce(1) to generate a single file, then rename the file once it is generated using the command hadoop fs -mv *.something desired_name (see the sketch below). – Pabbati Sep 27 '19 at 13:48
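A minimal PySpark sketch of that coalesce-and-rename approach, using dbutils.fs.mv (available by default in Databricks notebooks) in place of the hadoop fs shell command; /mnt/datalake and preprocessed.csv are hypothetical names:

# coalesce(1) forces Spark to write a single part file inside the output directory.
out_dir = "/mnt/datalake/preprocessed_tmp"
df.coalesce(1).write.option("header", "true").mode("overwrite").csv(out_dir)

# Spark names the file part-00000-....csv; locate it and move it to the desired name.
part_file = [f.path for f in dbutils.fs.ls(out_dir) if f.name.startswith("part-")][0]
dbutils.fs.mv(part_file, "/mnt/datalake/preprocessed.csv")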
Just save it directly to Blob storage.
df.write.format("com.databricks.spark.csv").option("header", "true").save("myfile.csv")
There is no point in saving the file locally and then pushing it into the Blob.
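On Spark 2.x and later the CSV source is built in, so the external com.databricks.spark.csv package is not needed; a sketch using the built-in writer and pointing the output at the storage account directly, with the same placeholders as the first answer:

# Built-in CSV writer; <file_system> and <storage-account-name> are placeholders.
df.write.option("header", "true").csv("abfss://<file_system>@<storage-account-name>.dfs.core.windows.net/output/myfile.csv")

Note that Spark still writes a directory of part files at that path, which is why the comment above resorts to coalesce(1) plus a rename to get a single file with a chosen name.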

– ASH