I am trying to pretty-write a JString containing a € character with json4s as follows:

import org.joda.time.format.ISODateTimeFormat
import org.joda.time.{DateTime, DateTimeZone}
import org.json4s.native.Serialization.writePretty
import org.json4s.{DateFormat, DefaultFormats, Formats, JString}

import java.util.{Date, TimeZone}

object Json4sEncodingTest {

  val formats = new Formats {

    val dateFormat: DateFormat = new DateFormat {
      override def parse(s: String): Option[Date] =
        try {
          Option(
            DateTime
              .parse(s, ISODateTimeFormat.dateTimeParser().withZoneUTC())
              .withZone(DateTimeZone.forID(timezone.getID))
              .toDate
          )
        } catch {
          case _: IllegalArgumentException => None
        }
      override def format(d: Date): String = DefaultFormats.lossless.dateFormat.format(d)
      override def timezone: TimeZone = DefaultFormats.lossless.dateFormat.timezone
    }

    override def alwaysEscapeUnicode: Boolean = false
  }

  def main(args: Array[String]): Unit = {
    println(writePretty(JString("2€"))(formats))
  }

}

This results in:

"2\u20ac"

My expected result would be:

"2€"

I found that in org.json4s.ParserUtil.quote characters between \u2000 and \u2100 are always escaped.

Question: Why is this the case?

  • json4s version: 3.7.0-M7
  • scala version: 2.12.11
Tomer Shetah
llehmann

1 Answer

As elaborated in this GitHub issue, this is currently not possible with json4s native. The check that decides whether to escape a character is:

(c >= '\u0000' && c <= '\u001f') || (c >= '\u0080' && c < '\u00a0') || (c >= '\u2000' && c < '\u2100')

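The € character (U+20AC) falls in the last range, from \u2000 up to but not including \u2100, so it satisfies this condition and is escaped even when alwaysEscapeUnicode is false. One possible workaround is to use the jackson backend instead of native. Then this works: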

import org.json4s.jackson.JsonMethods._
import org.json4s.JsonAST.JString

println(pretty(render(JString("2€"))))

Code run at Scastie.
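
To see why € is affected by the native backend, the escape condition quoted above can be checked in isolation. This is only a minimal sketch: the object name EscapeCheck and the helper shouldEscape are hypothetical and not part of json4s, and the predicate is copied from the condition shown in this answer rather than taken from the library at any particular version.

```scala
object EscapeCheck {

  // Hypothetical helper: reproduces the condition quoted in the answer.
  // It is not a json4s API, only a copy of that boolean expression.
  def shouldEscape(c: Char): Boolean =
    (c >= '\u0000' && c <= '\u001f') ||
      (c >= '\u0080' && c < '\u00a0') ||
      (c >= '\u2000' && c < '\u2100')

  // Code point of a character as four lowercase hex digits.
  def hex(c: Char): String = "%04x".format(c.toInt)

  def main(args: Array[String]): Unit = {
    val euro = '\u20ac'   // the euro sign
    val eAcute = '\u00e9' // e with acute accent, outside all three ranges

    println(hex(euro))            // 20ac
    println(shouldEscape(euro))   // true: inside the third range
    println(shouldEscape(eAcute)) // false: left unescaped
  }
}
```

Under this condition, any character in the third range is escaped by the native renderer, while accented Latin letters such as é pass through unchanged.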

Tomer Shetah