
Currently I am using this to write the output to a single partition:

 df.coalesce(1).write
   .format("json")
   .mode("overwrite")
   .option("path", writePath)
   .save()

The output file is currently in this format:

{obj1}
{obj2}

I want it as an array of JSON objects: [{obj1}, {obj2}]

1 Answer


Spark writes JSON in line-delimited (JSON Lines) format, where each line is a separate, self-contained, valid JSON object: https://spark.apache.org/docs/latest/sql-data-sources-json.html

However for your desired output,

df.toJSON.collect().mkString("[", ",", "]")

Note that collect on large DataFrames is not recommended, since it pulls the entire dataset into the driver's memory.

Aditya
  • thanks, but I am not able to use the df.write method after using the above function – user17773575 Mar 22 '22 at 14:29
  • df.write won't work directly, because the result is a String, not a DataFrame. As per Spark, each row is an independent JSON object. If you want to use df.write, then collect the df -> build the String -> convert it to a DataFrame with a single row and single column -> use df.write – Aditya Apr 01 '22 at 14:49
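The collect -> string -> single-row Dataset -> write pipeline described in the comment above might look like the following sketch; `spark`, `df`, and `writePath` are assumed to exist as in the question, and this is only one way to do it, not a definitive implementation:

```scala
// Assumes an existing SparkSession `spark`, a DataFrame `df`,
// and an output path `writePath`, as in the question.
import spark.implicits._

// 1. Collect every row as a JSON string and join into one array literal.
val jsonArray: String = df.toJSON.collect().mkString("[", ",", "]")

// 2. Wrap the single string in a one-row, one-column Dataset.
val singleRowDs = Seq(jsonArray).toDS()

// 3. Write it out as plain text so the array string is emitted verbatim
//    (the "json" writer would re-escape it as a quoted string value).
singleRowDs.coalesce(1)
  .write
  .mode("overwrite")
  .text(writePath)
```

The same collect-on-the-driver caveat applies: this only works when the whole result fits in driver memory.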