I have some calculated values and I want to save them in SparkR.
If I save them as a CSV file with
write.csv(data, file="/.../data.csv", row.names=FALSE)
it takes a very long time for some reason. Is there a better way to do this?
You can save the CSV file in /tmp/ for temporary use, but the file will be removed when the cluster restarts. Specify the file name as file = "/tmp/filename.csv".
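A minimal sketch of that approach, assuming data is an ordinary R data.frame and filename.csv is just a placeholder name:

# write to local /tmp/ storage; note the file does not survive a cluster restart
write.csv(data, file = "/tmp/filename.csv", row.names = FALSE)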
The other option is to register your data as a table in SparkR; see https://spark.apache.org/docs/latest/sparkr.html
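A rough sketch of the table-registration route, assuming data is a local R data.frame, a Spark 2.x-style SparkR session, and a placeholder view name calculated_values:

library(SparkR)
sparkR.session()  # start or attach to a Spark session

# convert the local data.frame into a distributed SparkR DataFrame
sdf <- createDataFrame(data)

# register it so it can be queried by name from SparkR or Spark SQL
createOrReplaceTempView(sdf, "calculated_values")

# the values can then be read back with SQL instead of a local CSV file
result <- collect(sql("SELECT * FROM calculated_values"))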