0

I am getting \u1F44A\u1F44A where I am expecting \ud83d\udc4d\ud83d\udc4d.

import org.apache.commons.lang3.StringEscapeUtils

val data = "👍👍"

println(StringEscapeUtils.escapeJava(data))//\u1F44A\u1F44A

println(StringEscapeUtils.unescapeJava("\u1F44A\u1F44A"))//ὄAὄA

println(StringEscapeUtils.unescapeJava("\ud83d\udc4d\ud83d\udc4d"))//👍👍

How can I get \ud83d\udc4d\ud83d\udc4d instead?

Govind Singh

3 Answers

1

Unicode: U+1F44D

UTF-16BE: D8 3D DC 4D

You can look up U+1F44D in a Unicode table to confirm this.

So

println(StringEscapeUtils.escapeJava(data))//\u1F44A\u1F44A
println(StringEscapeUtils.unescapeJava("\ud83d\udc4d\ud83d\udc4d"))//👍👍
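
The mapping from U+1F44D to the UTF-16 code units D83D DC4D can be checked with the JDK alone. The following is only a minimal sketch: the names `SurrogatePairDemo`, `escapeUnit` and `escapeCodePoint` are made up for illustration and are not part of Commons Lang.

```scala
object SurrogatePairDemo {
  // Formats one UTF-16 code unit as a Java-style unicode escape (four hex digits).
  def escapeUnit(c: Char): String = "\\u%04x".format(c.toInt)

  // Character.toChars returns one Char for BMP code points and a
  // high/low surrogate pair for supplementary code points such as U+1F44D.
  def escapeCodePoint(codePoint: Int): String =
    Character.toChars(codePoint).map(escapeUnit).mkString

  def main(args: Array[String]): Unit = {
    println(escapeCodePoint(0x1F44D)) // prints \ud83d\udc4d
  }
}
```

Because the escape is built per UTF-16 code unit, a code point above U+FFFF always yields two `\uXXXX` groups, which is the form a Java or Scala string literal expects.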

Maybe the IDE console window is using UTF-16BE? In Eclipse, the console encoding can be set to UTF-16BE or to another encoding.


Suraj Rao
xwd
0

It's a bug in Apache Commons-Lang 3.0 and 3.1. I think it's fixed in 3.2.0, so upgrade to 3.2.x or 3.3.x.

Karol S
0

I don't think that we need the Apache Commons Library for this. We can easily achieve this in Scala using the standard libraries available.

val data: String = "👍👍"

println(System.getProperty("file.encoding", "No encoding"))
// prints UTF-8

println(data.map(x => "\\u%04x".format(x.toInt)).mkString)
// prints \ud83d\udc4d\ud83d\udc4d

You can change the encoding by setting the file.encoding system property in the JVM options, for example -Dfile.encoding=UTF-8.

Tested on Scastie for Scala version 2.13.3.
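
The one-liner above escapes every character, including plain ASCII. If only the non-ASCII characters should be escaped, a variant along these lines might work. It is a sketch only: `EscapeNonAscii` and `escape` are hypothetical names, and unlike `escapeJava` it does not escape backslashes or quotes, so the output is not guaranteed to round-trip.

```scala
object EscapeNonAscii {
  // Leaves printable ASCII untouched and escapes every other UTF-16 code unit.
  def escape(s: String): String =
    s.flatMap { c =>
      if (c >= ' ' && c <= '~') c.toString
      else "\\u%04x".format(c.toInt)
    }

  def main(args: Array[String]): Unit = {
    // Build the emoji from its code point so the example does not depend on source-file encoding.
    val thumbsUp = new String(Character.toChars(0x1F44D))
    println(escape("ok " + thumbsUp)) // prints ok \ud83d\udc4d
  }
}
```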

Amit Singh