I want to upload a DataFrame to a server as a gzip-encoded CSV file, without saving it to disk first.
It is easy to write a gzip-encoded CSV file using the spark-csv library:
df.write
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .option("codec", "org.apache.hadoop.io.compress.GzipCodec")
  .save("result.csv.gz")
But I have no idea how to get an Array[Byte] representing my DataFrame, which I could then upload via HTTP.
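One possible approach, sketched here as an assumption rather than a confirmed solution: if the DataFrame is small enough to collect onto the driver, you can build the CSV text in memory and gzip it with the standard java.util.zip classes. The helper name toGzipCsvBytes is hypothetical, and the naive mkString(",") join assumes values contain no commas or quotes (real CSV escaping would need more care):

```scala
import java.io.ByteArrayOutputStream
import java.nio.charset.StandardCharsets
import java.util.zip.GZIPOutputStream

import org.apache.spark.sql.DataFrame

// Hypothetical helper: collects the DataFrame to the driver and returns
// it as gzip-compressed CSV bytes. Assumes the data fits in driver memory
// and that no field needs CSV quoting/escaping.
def toGzipCsvBytes(df: DataFrame): Array[Byte] = {
  val header = df.columns.mkString(",")
  val rows   = df.collect().map(_.mkString(","))   // naive CSV, no escaping
  val csv    = (header +: rows).mkString("\n")

  val bos  = new ByteArrayOutputStream()
  val gzip = new GZIPOutputStream(bos)
  gzip.write(csv.getBytes(StandardCharsets.UTF_8))
  gzip.close()        // must close to flush the gzip trailer
  bos.toByteArray     // ready to send, e.g. with Content-Encoding: gzip
}
```

The resulting Array[Byte] could then be posted with whatever HTTP client you use, setting the Content-Encoding: gzip header. For data too large to collect on the driver this would not work, and writing to a distributed store or streaming per-partition would be needed instead.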