The Unicode code point for lowercase s is U+0073, which this website says is written \u0073 in C and Java.
Given a file a.txt containing:
http://www.example.com/\u0073
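For contrast, here is a minimal sketch of how the same escape behaves inside Java source code, where the compiler decodes it before the program runs. Text read from a file at runtime gets no such treatment. The class and field names are hypothetical, chosen only for illustration.

```java
final class EscapeLiteralSketch {
    // The compiler decodes this escape, so the string is the single character s.
    static final String ESCAPED = "\u0073";

    // Doubling the backslash keeps six literal characters: backslash, u, 0, 0, 7, 3.
    // This is what the contents of a.txt look like once read into a String.
    static final String RAW = "\\u0073";

    public static void main(String[] args) {
        System.out.println(ESCAPED.equals("s")); // prints true
        System.out.println(ESCAPED.length());    // prints 1
        System.out.println(RAW.length());        // prints 6
    }
}
```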
Let's read this with Java, unescape the backslash sequence, and see what we get:
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.apache.commons.lang3.StringEscapeUtils;

public class Main {
    public static void main(String[] args) throws IOException {
        // Read the raw file contents; the escape is still six literal characters here.
        String s2 = new String(Files.readAllBytes(Paths.get("a.txt")));
        System.out.println(s2); // prints http://www.example.com/\u0073
        // Decode Java-style escapes in the runtime string.
        String s3 = StringEscapeUtils.unescapeJava(s2);
        System.out.println(s3); // prints http://www.example.com/s
    }
}
The output is:
$ java -cp ./commons-lang3-3.4.jar:. Main
http://www.example.com/\u0073
http://www.example.com/s
The unescapeJava(s2) method call takes the six characters \u0073 read from the file (which would be written \\u0073 as a Java string literal) and unescapes them to the single character U+0073, which prints as "s".
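To make that step concrete, here is a rough sketch of the decoding involved, written without Commons Lang. It is not the library's implementation: it handles only a backslash followed by u and four hex digits, and it ignores every other escape that unescapeJava understands (including escaped backslashes). The class and method names are hypothetical.

```java
final class UnicodeUnescapeSketch {

    // Replaces each backslash + 'u' + four hex digits with the UTF-16 code unit it names.
    // Anything else, including malformed or truncated escapes, is copied through unchanged.
    static String unescapeUnicode(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            if (isUnicodeEscapeAt(input, i)) {
                // Parse the four hex digits that follow the backslash and the 'u'.
                int codeUnit = Integer.parseInt(input.substring(i + 2, i + 6), 16);
                out.append((char) codeUnit);
                i += 6;
            } else {
                out.append(input.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    // True when position i starts a backslash, a 'u', and exactly four hex digits.
    static boolean isUnicodeEscapeAt(String s, int i) {
        if (i + 6 > s.length() || s.charAt(i) != '\\' || s.charAt(i + 1) != 'u') {
            return false;
        }
        for (int j = i + 2; j < i + 6; j++) {
            if (Character.digit(s.charAt(j), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The doubled backslash gives the same six characters that a.txt contains.
        String fromFile = "http://www.example.com/\\u0073";
        System.out.println(fromFile);
        System.out.println(unescapeUnicode(fromFile));
    }
}
```

Running main prints the URL first with the escape intact and then with a trailing s, matching the output above. Any port to another language would need the same two pieces: recognising the six-character sequence and converting the hex digits to a character.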
Can we do the same in Haskell?
Let's read the same file with the text library:
Prelude> a <- Data.Text.IO.readFile "a.txt"
Prelude> a
"http://www.example.com/\\u0073\n"
Anyone expecting Haskell to translate \u0073 to s automatically may be thrown off by the fact that Haskell string literals use the \x prefix rather than \u for hexadecimal escapes:
Prelude> "\x0073"
"s"
So how do I replicate the functionality of the unescapeJava(..) method from Apache Commons Lang in Haskell, to go from \\u0073 to \u0073 and print this as "s"?