I would like to write a Spark DataFrame to a StringIO buffer as a single-partition CSV. This single-partition CSV is then supposed to be sent to another server over FTP.
The following line does not seem to work:
df.repartition(1).write.csv(file_buffer,mode="overwrite", header=True)
The output is the following error:
py4j.protocol.Py4JError: An error occurred while calling o148.csv. Trace:
py4j.Py4JException: Method csv([class java.util.ArrayList]) does not exist
This is the full code:
from ftplib import FTP
import StringIO
file_buffer = StringIO.StringIO()
df.repartition(1).write.csv(file_buffer,mode="overwrite", header=True)
ftp = FTP()
ftp.connect(host, 21)
ftp.login(user=user, passwd=pw)
ftp.storbinary('STOR test.csv', file_buffer)  # storbinary expects a full STOR command
ftp.quit()
I've also tried df.coalesce(1).write.csv(file_buffer,mode="overwrite", header=True). However, that returns the same error.
By the way, I can in principle write to S3 with the above-mentioned method.
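One possible workaround, sketched here as an assumption rather than a confirmed fix: since DataFrameWriter.csv takes a path string rather than a file object, the rows could be collected on the driver and serialized with Python's csv module into an in-memory buffer. The helper names rows_to_csv_buffer and upload_buffer are hypothetical, the code is written for Python 3 (io instead of the StringIO module), and it assumes the DataFrame is small enough to fit in driver memory.

```python
import csv
import io
from ftplib import FTP


def rows_to_csv_buffer(header, rows):
    """Serialize a header and an iterable of rows into an in-memory CSV buffer."""
    text = io.StringIO()
    writer = csv.writer(text)
    writer.writerow(header)
    writer.writerows(rows)
    # storbinary() reads bytes, so encode the text; a new BytesIO starts at position 0.
    return io.BytesIO(text.getvalue().encode("utf-8"))


def upload_buffer(buffer, host, user, pw, remote_name):
    """Send the buffer to the FTP server; host, user and pw are placeholders."""
    ftp = FTP()
    ftp.connect(host, 21)
    ftp.login(user=user, passwd=pw)
    # storbinary() expects a full FTP command, i.e. "STOR <filename>".
    ftp.storbinary("STOR " + remote_name, buffer)
    ftp.quit()


# With a real Spark DataFrame df (assumed to fit in driver memory):
# buffer = rows_to_csv_buffer(df.columns, df.collect())
# upload_buffer(buffer, host, user, pw, "test.csv")
```

This bypasses df.write.csv entirely, so no repartition(1) or coalesce(1) is needed; the trade-off is that all rows pass through the driver.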
Many thanks in advance!