
I have a sample JSON like the one below, which contains duplicate keys for the field "context":

 {
    "Production": {
        "meta_id": "1239826",
        "endeca_id": "EN29826",
        "Title": "Use The Google Home ™ To Choose The Right CCCM Solution For Your Needs",
        "Subtitle": null,
        "context": {
            "researchID": "22",
            "researchtitle": " The Google Home ™: Cross-Channel , Q4 2019",
            "telconfdoclinkid": null
        },
        "context": {
            "researchID": "281",
            "researchtitle": " The Google Home ™: Cross-Channel  Q3 2019",
            "telconfdoclinkid": null
        },
        "context": {
            "researchID": "154655",
            "researchtitle": " Now Tech: Cross-Channel Campaign Management, Q2 2019",
            "telconfdoclinkid": null
        },
        "uri": "/doc/uri",
        "ssd": "ihdfiuhdl",
        "id": "dsadfsd221e"
     }
 }

When I parse the JSON for the field "context" in Scala, the parser rejects it with the error below.

Exception in thread "main" org.json.JSONException: Duplicate key "context".

Could you suggest the best approach to parse JSON in the above format using Scala?

Andriy Plokhotnyuk
Raghavan
  • Is that a valid JSON? I'm not sure a library will accept it. – Gaël J Nov 28 '21 at 15:51
  • No, it's not valid JSON (Error: Duplicate key 'context'). But this is the response we get after calling a web service, and we cannot change the response in this case. – Raghavan Nov 28 '21 at 16:13
  • You could roll your own JSON parser (describing how to do that is out of the scope of a StackOverflow answer) which uses a model which allows multiple values per key. You're not going to get much help from existing libraries in doing that, but if the webservice isn't going to be fixed, that's basically what you'd have to do. – Levi Ramsey Nov 28 '21 at 19:23
  • First off, this is not valid JSON. After parsing the key-value pairs, your parser tries to build something like `YourClass(field1 = value1, ..., context = valueN, context = valueM)`, so it gets confused, and I don't think any library would support this. You can write your own parser (as Levi Ramsey said), or try something like this: find the context values with a regex such as `"\"context\":\{[^{}]*\}"` (I'm not sure about the regex), put the values in an array, then parse the rest of the JSON and append the context values to it as a list. – AminMal Nov 28 '21 at 20:24
  • According to [the latest JSON specification it is a valid JSON](https://datatracker.ietf.org/doc/html/rfc8259#section-4). JSON parsers that use a map for in-memory representation of JSON objects ignore key duplicates or return errors. Others can be tweaked to accept duplicates. – Andriy Plokhotnyuk Nov 29 '21 at 07:37
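
The regex idea from the comments could be sketched roughly as follows. This is only an illustration, not a robust parser: the names `ContextRegexSketch` and `extractContexts` are made up for this example, and the pattern assumes each context object is flat (no nested objects, and no `{` or `}` inside string values).

```scala
import scala.util.matching.Regex

object ContextRegexSketch {
  // Matches `"context": { ... }` where the object body contains no braces.
  // Assumption: context objects are flat and no string value contains '{' or '}'.
  private val ContextPattern: Regex =
    "\"context\"\\s*:\\s*(\\{[^{}]*\\})".r

  /** Returns the raw JSON text of every "context" object, in document order. */
  def extractContexts(json: String): List[String] =
    ContextPattern.findAllMatchIn(json).map(_.group(1)).toList

  def main(args: Array[String]): Unit = {
    val json =
      """{ "Production": { "meta_id": "1239826",
        |  "context": { "researchID": "22" },
        |  "context": { "researchID": "281" },
        |  "uri": "/doc/uri" } }""".stripMargin
    // Each extracted fragment is itself a JSON object without duplicate keys,
    // so it can be handed to any ordinary JSON parser.
    extractContexts(json).foreach(println)
  }
}
```

Each extracted fragment can then be parsed on its own, while the remaining keys are handled separately.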

2 Answers


Some JSON parsers for Scala that parse from JSON bytes to your data structures can parse duplicated keys using custom codecs.

Below is an example of how this can be done with jsoniter-scala:

Add dependencies to your build.sbt:

libraryDependencies ++= Seq(
  // Use the %%% operator instead of %% for Scala.js  
  "com.github.plokhotnyuk.jsoniter-scala" %% "jsoniter-scala-core"   % "2.12.0",
  // Use the "provided" scope instead when the "compile-internal" scope is not supported  
  "com.github.plokhotnyuk.jsoniter-scala" %% "jsoniter-scala-macros" % "2.12.0" % "compile-internal"
)

Use data structures and the custom codec from the following snippet:

import com.github.plokhotnyuk.jsoniter_scala.macros._
import com.github.plokhotnyuk.jsoniter_scala.core._

object Example01 {
  case class Context(researchID: String, researchtitle: String, telconfdoclinkid: Option[String])

  sealed trait Issue

  case class Production(
    meta_id: String,
    endeca_id: String,
    Title: String,
    Subtitle: Option[String],
    contexts: List[Context],
    uri: String,
    ssd: String,
    id: String) extends Issue

  implicit val contextCodec: JsonValueCodec[Context] = JsonCodecMaker.make
  implicit val productionCodec: JsonValueCodec[Production] =
    new JsonValueCodec[Production] {
      def nullValue: Production = null

      def decodeValue(in: JsonReader, default: Production): Production = if (in.isNextToken('{')) {
        var _meta_id: String = null
        var _endeca_id: String = null
        var _Title: String = null
        var _Subtitle: Option[String] = None
        val _contexts = List.newBuilder[Context]
        var _uri: String = null
        var _ssd: String = null
        var _id: String = null
        // Bit mask of keys not yet seen: one bit per field, cleared when the key is read
        var p0 = 255
        if (!in.isNextToken('}')) {
          in.rollbackToken()
          var l = -1
          while (l < 0 || in.isNextToken(',')) {
            l = in.readKeyAsCharBuf()
            if (in.isCharBufEqualsTo(l, "meta_id")) {
              if ((p0 & 1) != 0 ) p0 ^= 1
              else in.duplicatedKeyError(l)
              _meta_id = in.readString(_meta_id)
            } else if (in.isCharBufEqualsTo(l, "endeca_id")) {
              if ((p0 & 2) != 0) p0 ^= 2
              else in.duplicatedKeyError(l)
              _endeca_id = in.readString(_endeca_id)
            } else if (in.isCharBufEqualsTo(l, "Title")) {
              if ((p0 & 4) != 0) p0 ^= 4
              else in.duplicatedKeyError(l)
              _Title = in.readString(_Title)
            } else if (in.isCharBufEqualsTo(l, "Subtitle")) {
              if ((p0 & 8) != 0) p0 ^= 8
              else in.duplicatedKeyError(l)
              _Subtitle =
                if (in.isNextToken('n')) in.readNullOrError(_Subtitle, "expected value or null")
                else {
                  in.rollbackToken()
                  new Some(in.readString(null))
                }
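            } else if (in.isCharBufEqualsTo(l, "context")) {
              // Unlike the other keys, "context" may repeat: clear its bit without
              // a duplicate check and accumulate every value
              p0 &= ~16
              _contexts += contextCodec.decodeValue(in, contextCodec.nullValue)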
            } else if (in.isCharBufEqualsTo(l, "uri")) {
              if ((p0 & 32) != 0) p0 ^= 32
              else in.duplicatedKeyError(l)
              _uri = in.readString(_uri)
            } else if (in.isCharBufEqualsTo(l, "ssd")) {
              if ((p0 & 64) != 0) p0 ^= 64
              else in.duplicatedKeyError(l)
              _ssd = in.readString(_ssd)
            } else if (in.isCharBufEqualsTo(l, "id")) {
              if ((p0 & 128) != 0) p0 ^= 128
              else in.duplicatedKeyError(l)
              _id = in.readString(_id)
            } else in.skip()
          }
          if (!in.isCurrentToken('}')) in.objectEndOrCommaError()
        }
        // 247 = 255 - 8: every key except the optional "Subtitle" is required
        if ((p0 & 247) != 0) in.requiredFieldError(f0(java.lang.Integer.numberOfTrailingZeros(p0 & 247)))
        new Production(meta_id = _meta_id, endeca_id = _endeca_id, Title = _Title, Subtitle = _Subtitle, contexts = _contexts.result(), uri = _uri, ssd = _ssd, id = _id)
      } else in.readNullOrTokenError(default, '{')

      def encodeValue(x: Production, out: JsonWriter): Unit = {
        out.writeObjectStart()
        out.writeNonEscapedAsciiKey("meta_id")
        out.writeVal(x.meta_id)
        out.writeNonEscapedAsciiKey("endeca_id")
        out.writeVal(x.endeca_id)
        out.writeNonEscapedAsciiKey("Title")
        out.writeVal(x.Title)
        x.Subtitle match {
          case Some(s) =>
            out.writeNonEscapedAsciiKey("Subtitle")
            out.writeVal(s)
          case None => // omit the key; without this case a missing subtitle throws a MatchError
        }
        x.contexts.foreach { c =>
          out.writeNonEscapedAsciiKey("context")
          contextCodec.encodeValue(c, out)
        }
        out.writeNonEscapedAsciiKey("uri")
        out.writeVal(x.uri)
        out.writeNonEscapedAsciiKey("ssd")
        out.writeVal(x.ssd)
        out.writeNonEscapedAsciiKey("id")
        out.writeVal(x.id)
        out.writeObjectEnd()
      }

      private[this] def f0(i: Int): String = ((i: @annotation.switch): @unchecked) match {
        case 0 => "meta_id"
        case 1 => "endeca_id"
        case 2 => "Title"
        case 3 => "Subtitle"
        case 4 => "context"
        case 5 => "uri"
        case 6 => "ssd"
        case 7 => "id"
      }
    }
  implicit val issueCodec: JsonValueCodec[Issue] = JsonCodecMaker.make(CodecMakerConfig.withDiscriminatorFieldName(None))

  def main(args: Array[String]): Unit = {
    val issue = readFromArray[Issue](
      """
        | {
        |    "Production": {
        |        "meta_id": "1239826",
        |        "endeca_id": "EN29826",
        |        "Title": "Use The Google Home &trade To Choose The Right CCCM Solution For Your Needs",
        |        "Subtitle": null,
        |        "context": {
        |            "researchID": "22",
        |            "researchtitle": " The Google Home ™: Cross-Channel , Q4 2019",
        |            "telconfdoclinkid": null
        |        },
        |        "context": {
        |            "researchID": "281",
        |            "researchtitle": " The Google Home ™: Cross-Channel  Q3 2019",
        |            "telconfdoclinkid": null
        |        },
        |        "context": {
        |            "researchID": "154655",
        |            "researchtitle": " Now Tech: Cross-Channel Campaign Management, Q2 2019",
        |            "telconfdoclinkid": null
        |        },
        |        "uri": "/doc/uri",
        |        "ssd": "ihdfiuhdl",
        |        "id": "dsadfsd221e"
        |     }
        | }
        |""".stripMargin.getBytes("UTF-8"))
    println(issue)
  }
}

Expected output:

Production(1239826,EN29826,Use The Google Home &trade To Choose The Right CCCM Solution For Your Needs,None,List(Context(22, The Google Home ™: Cross-Channel , Q4 2019,None), Context(281, The Google Home ™: Cross-Channel  Q3 2019,None), Context(154655, Now Tech: Cross-Channel Campaign Management, Q2 2019,None)),/doc/uri,ihdfiuhdl,dsadfsd221e)
Andriy Plokhotnyuk

Json4s can parse duplicate keys:

scala> import org.json4s.native.JsonMethods._
import org.json4s.native.JsonMethods._

scala> parse("""{ "hello": true, "context": { "value": "A"}, "context": { "value": "B" }}""")
res2: org.json4s.JValue = JObject(List((hello,JBool(true)), (context,JObject(List((value,JString(A))))), (context,JObject(List((value,JString(B)))))))

Here's the documentation for json4s
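
Once parsed, the duplicated values still need to be pulled out of the `JObject`. A minimal sketch, assuming the json4s-native dependency is on the classpath; `Json4sDuplicates` and `valuesFor` are hypothetical names for this example:

```scala
import org.json4s._
import org.json4s.native.JsonMethods._

object Json4sDuplicates {
  /** Collects the value of every field named `key` directly under `obj`, in order. */
  def valuesFor(obj: JValue, key: String): List[JValue] = obj match {
    // A JObject wraps a List of (name, value) pairs, so repeated names are preserved
    case JObject(fields) => fields.collect { case (`key`, value) => value }
    case _               => Nil
  }

  def main(args: Array[String]): Unit = {
    val json = parse("""{ "hello": true, "context": { "value": "A"}, "context": { "value": "B" }}""")
    // Prints both "context" objects rather than only one of them
    println(valuesFor(json, "context"))
  }
}
```

Matching on the field list directly avoids relying on any lookup helper that might collapse or reorder duplicates.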

Philluminati
  • BEWARE: [json4s is vulnerable to DoS/DoW attacks!](https://github.com/json4s/json4s/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+denial) – Andriy Plokhotnyuk Jul 04 '22 at 08:40