
I'm using Java-Spark to load JSON into a Dataset as follows:

 Dataset<Row> df = spark.read().json(jsonFile);

Let's say that my JSON looks like:

{
    "field1":
    {
        "key1":"value1"
    }
 }

Now I want to add a new field so that my JSON looks like:

{
    "field1":
    {
        "key1":"value1",
        "key2":"value2"
    }
 }

So what I did is:

df = df.withColumn("field1.key2", functions.lit("value2"));

But the resulting JSON looks like:

{
    "field1":
    {
        "key1":"value1"
    },
     "field1.key2":"value2"
}

So how can I fix it?

Thanks.

Ya Ko
  • Possible duplicate of [How to add a constant column in a Spark DataFrame?](https://stackoverflow.com/questions/32788322/how-to-add-a-constant-column-in-a-spark-dataframe) – philantrovert Sep 04 '18 at 11:59

1 Answer


One option is to read the file as a text file and, within a map operation, create a JSON object and make the necessary modifications to each record, something like below:

import org.json.JSONObject

// Read the raw JSON as plain text (one JSON record per line)
val input = sparkSession.sparkContext.textFile("<input_file_path>")

// Parse each line, add the new key inside "field1", and serialize back to a string
val resultRDD = input.map(row => {
    val json = new JSONObject(row)
    json.getJSONObject("field1").put("key2", "value2")
    json.toString
})

// Re-read the modified JSON strings so the DataFrame picks up the new schema
val resultDF = sparkSession.read.json(resultRDD)
resultDF.printSchema()
Prasad Khode
  • Can I do it with Spark Dataset? – Ya Ko Sep 04 '18 at 17:13
  • I don't have a solution for now, to achieve this in a Spark Dataset way; there is an open ticket regarding the same, in case you want to [check](https://stackoverflow.com/q/49278366/1025328) – Prasad Khode Sep 05 '18 at 06:48
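As a sketch of an alternative that stays in the DataFrame API (not from the original answer): `withColumn("field1.key2", ...)` creates a new top-level column whose name happens to contain a dot, so one common workaround is to rebuild the `field1` struct with `functions.struct`, carrying over the existing nested field and adding the new one. The class name and the inline sample data below are illustrative; in the question's setting you would read from `jsonFile` instead.

```java
import java.util.Arrays;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.*;

public class AddNestedKey {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .master("local[1]")
                .appName("add-nested-key")
                .getOrCreate();

        // Inline sample matching the question's JSON
        // (in the real code this would be spark.read().json(jsonFile))
        Dataset<String> raw = spark.createDataset(
                Arrays.asList("{\"field1\":{\"key1\":\"value1\"}}"),
                Encoders.STRING());
        Dataset<Row> df = spark.read().json(raw);

        // Replace "field1" with a rebuilt struct: keep key1, add key2
        Dataset<Row> updated = df.withColumn("field1",
                struct(col("field1.key1").alias("key1"),
                       lit("value2").alias("key2")));

        updated.toJSON().show(false);
        spark.stop();
    }
}
```

The drawback is that every existing nested field must be listed explicitly when the struct is rebuilt. If a newer Spark version is available, `Column.withField` (added in Spark 3.1) makes this less verbose.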