4

I need to write a data frame to a single CSV file, and found that I can use sdf_coalesce() to collapse the data frame into a single partition. Is there any way to change the name of the CSV file generated by spark_write_csv()?

Thanks in advance.

  • Can you not do it in the `path` argument, as you would with `readr::write_csv`? – Relasta Apr 04 '18 at 13:45
  • Because Spark is built for distributed computing, it tends to write one file per partition of the data. Coalescing moves all the data into a single partition, so you can run out of memory. – Jader Martins Apr 13 '18 at 01:55

1 Answer

5

No. The file name is generated automatically so that it is unique across tasks, and it is not configurable. If you want a specific name, you have to rename the output afterwards using utilities specific to the file system or storage solution in use.
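
As a rough illustration of the rename step, here is a minimal sketch in base R. It assumes the output directory is on the local file system (not HDFS or S3), that the data was coalesced to one partition so there is exactly one uncompressed `part-*.csv` file, and that the target file lives outside the output directory. `rename_spark_csv`, `out_dir`, and `result.csv` are hypothetical names chosen for this example; they are not part of sparklyr.

```r
# Hypothetical helper (not part of sparklyr): move the single part file that
# Spark wrote into `output_dir` to `target_file` on the local file system.
rename_spark_csv <- function(output_dir, target_file, remove_dir = TRUE) {
  # Spark names its data files part-<...>.csv; marker files such as _SUCCESS
  # and hidden .crc files do not match this pattern and are ignored.
  part_files <- list.files(output_dir, pattern = "^part-.*\\.csv$",
                           full.names = TRUE)
  if (length(part_files) != 1) {
    stop("Expected exactly one part file, found ", length(part_files))
  }
  if (!file.rename(part_files, target_file)) {
    stop("Could not move ", part_files, " to ", target_file)
  }
  # Optionally delete the now-redundant output directory and its marker files.
  if (remove_dir) unlink(output_dir, recursive = TRUE)
  invisible(target_file)
}

# Usage, assuming `sdf` is a Spark data frame and "out_dir" is a local path:
# sdf %>% sdf_coalesce(1) %>% spark_write_csv("out_dir")
# rename_spark_csv("out_dir", "result.csv")
```

On other storage back ends the same idea applies, but the move has to be done with that system's own tools rather than `file.rename()`.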