Questions tagged [unicode-escapes]

Use this tag for questions related to Unicode escapes. A Unicode escape sequence represents a single Unicode character.

Quoting the MSDN page:

A Unicode escape sequence represents the single Unicode character formed by the hexadecimal number following the "\u" or "\U" characters. Since C# uses a 16-bit encoding of Unicode code points in characters and string values, a Unicode character in the range U+10000 to U+10FFFF is not permitted in a character literal and is represented using a Unicode surrogate pair in a string literal. Unicode characters with code points above 0x10FFFF are not supported.

Note that this tag is used in its general meaning, so you are encouraged to tag your question with the corresponding programming environment as well.
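As a rough illustration of the surrogate-pair rule quoted above, the following Python sketch computes the `\u` escape or escapes that a 16-bit (UTF-16 based) string representation would need for a given code point. The function name `utf16_escape` is hypothetical and chosen only for this example; it is not part of any library.

```python
def utf16_escape(code_point):
    """Return the \\uXXXX escape(s) for a code point in a UTF-16 based language."""
    if code_point < 0 or code_point > 0x10FFFF:
        # Code points above 0x10FFFF are not valid Unicode.
        raise ValueError("code point outside the Unicode range")
    if code_point < 0x10000:
        # Fits in a single 16-bit code unit.
        return "\\u%04X" % code_point
    # U+10000..U+10FFFF: split the 20-bit offset into a surrogate pair.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return "\\u%04X\\u%04X" % (high, low)


print(utf16_escape(0x00A5))   # \u00A5
print(utf16_escape(0x1F600))  # \uD83D\uDE00
```

The sketch does not reject code points in the surrogate range U+D800 to U+DFFF, which are not characters on their own.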

318 questions
3
votes
1 answer

Convert special Characters to Unicode Escape Characters Scala

Hello, is it possible to convert special characters like Ä, Ö, Ü, ... to Unicode escape characters? For example, Ä would be \u00C4. I need this to convert a file for translation purposes. I have key-value pairs of translations on the server side, e.g.: hometown…
B. Kemmer
  • 1,517
  • 1
  • 14
  • 32
3
votes
3 answers

Print unicode literal string as Unicode character

I need to print a Unicode literal string as the equivalent Unicode character. System.out.println("\u00A5"); // prints ¥ System.out.println("\\u"+"00A5"); // prints \u00A5 I need to print it as ¥. How can I evaluate this string as a Unicode character?
Pradeep
  • 97
  • 1
  • 8
3
votes
1 answer

Catalan characters à and è don't work with php imagestringup - how to decode them?

When I call the code below with Spanish text in $text, I get the correct text in the image, but when I call the same code with Catalan text in $text, the text in the image is wrong. I understand that the Spanish special characters á and é work, but Catalan…
App tester
  • 139
  • 1
  • 11
2
votes
2 answers

Parsing JSON with escaped unicode characters displays incorrectly

I have downloaded JSON data from Instagram that I'm parsing in NodeJS and storing in MongoDB. I'm having an issue where escaped unicode characters are not displaying the correct emoji symbols when displayed on the client side. For instance, here's a…
bflemi3
  • 6,698
  • 20
  • 88
  • 155
2
votes
1 answer

R Package cmd check - unable to identify non-ascii character

I've written a small function that returns the categories of the ICD-10, since I use them frequently. The function works as expected; however, when I want to integrate it into my package, it gives me the following error message. I replaced the German…
Björn
  • 1,610
  • 2
  • 17
  • 37
2
votes
1 answer

How to handle escape characters in pyspark. Trying to replace escape character with NULL

I'm trying to replace an escape character with NULL in a pyspark dataframe. The data in the dataframe looks like this: Col1|Col2|Col3 1|\026\026|026|abcd026efg. Col2 is garbage data that I'm trying to replace with NULL. Tried replace and regex_replace…
EVR
  • 31
  • 4
2
votes
1 answer

random text from /dev/random raising an error in lxml: All strings must be XML compatible: Unicode or ASCII, no NULL bytes

I am, for the sake of testing my web app, pasting some random characters from /dev/random into my web frontend. This line throws an error: print repr(comment) import html5lib print html5lib.parse(comment,…
Abhishek
  • 21
  • 1
  • 3
2
votes
0 answers

Is there any way to write mix directional text with number without any Unicode Directional Formatting code

There is a requirement to write many Arabic words and English words along with numerical digits without using any special character in between. Initially I used U+202C (pop directional formatting code) whenever I detected Arabic text. However that…
2
votes
2 answers

How to escape unicode special chars in string and write it to UTF encoded file

What I aim to achieve is to take a string like: Bitte überprüfen Sie, ob die Dokumente erfolgreich in System eingereicht wurden, und löschen Sie dann die tatsächlichen Dokumente. and convert it to: 'Bitte \u00FCberpr\u00FCfen Sie, ob die Dokumente erfolgreich…
PiWo
  • 590
  • 2
  • 8
  • 17
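The excerpt above does not show which language the asker uses, so the following is only a Python sketch of the transformation it describes: every non-ASCII character is replaced by its `\uXXXX` escape, and ASCII text is left untouched. The helper name `escape_non_ascii` is an assumption made for this example.

```python
def escape_non_ascii(text):
    """Replace each non-ASCII character with \\uXXXX escape(s); keep ASCII as-is."""
    out = []
    for ch in text:
        if ord(ch) < 0x80:
            out.append(ch)
            continue
        # Encode to UTF-16 so characters above U+FFFF become two code units.
        units = ch.encode("utf-16-be", "surrogatepass")
        for i in range(0, len(units), 2):
            out.append("\\u%04X" % int.from_bytes(units[i:i + 2], "big"))
    return "".join(out)


escaped = escape_non_ascii("Bitte überprüfen Sie")
print(escaped)  # Bitte \u00FCberpr\u00FCfen Sie

# The result is pure ASCII, so it can be written with any ASCII-compatible encoding.
data = escaped.encode("utf-8")
```

Python's built-in `unicode_escape` codec is not used here because it emits `\xfc` rather than `\u00FC` for characters below U+0100.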
2
votes
1 answer

Portable way to change Prolog atom JSON escaping

Is there a portable way to change the Prolog escaping? I have the following in mind: usually an atom is escaped as follows, for example using octal escaping: /* SWI-Prolog 8.3.23 */ ?- X = 'abc\x0001\def'. X = 'abc\001\def'. But what I want to…
user502187
2
votes
1 answer

Convert unicode codepoint to utf-16

In C++ on Windows how do you convert an xml character reference of the form &#xhhhh; to a utf-16 little endian string? I'm thinking if the hhhh part is 4 characters or less, then it's 2 bytes, which fit into one utf-16 character. But, this wiki page…
Scott Langham
  • 58,735
  • 39
  • 131
  • 204
2
votes
1 answer

AWS CLI returns JSON with control codes making JQ fail

I've used jq many times to parse, pick values etc from JSON returned by AWS CLI, e.g. for ec2 describe-instances etc. Now I'm using the dockerized version of AWS CLI v2 to get a list of CloudWatch log groups: $ alias aws='docker run --rm -it -v…
JHH
  • 8,567
  • 8
  • 47
  • 91
2
votes
2 answers

How to decode partially escaped unicode string in python (mixed unicode and escaped unicode)?

Given the following string: str = "\\u20ac €" How to decode it into € €? Using str.encode("utf-8").decode("unicode-escape") returns € â\x82¬ (To clarify, I am looking for a general solution how to decode any mix of unicode and escaped characters)
serg
  • 109,619
  • 77
  • 317
  • 330
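For the mixed string in the question above, one possible approach (a sketch, not necessarily the accepted answer) is to replace only the `\uXXXX` sequences with a regular expression and leave every other character alone, which avoids the mojibake produced by the `unicode-escape` codec. The name `decode_mixed` is hypothetical.

```python
import re

# Matches a literal backslash, "u", and exactly four hex digits.
_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def decode_mixed(text):
    """Decode \\uXXXX escapes in text that may also contain real non-ASCII characters."""
    decoded = _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
    # Escaped surrogate pairs become two separate surrogates; a UTF-16
    # round trip with "surrogatepass" merges each pair into one character.
    return decoded.encode("utf-16", "surrogatepass").decode("utf-16")


print(decode_mixed("\\u20ac €"))  # € €
```

This sketch assumes only four-digit `\u` escapes occur. It does not handle `\U`, `\x`, or escaped backslashes, and an unpaired surrogate escape would make the final decode step raise `UnicodeDecodeError`.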
2
votes
5 answers

UNIX/Linux shell script: Removing variant form emoji from a text

Consider you are using a Linux/UNIX shell whose default character set is UTF-8: $ echo $LANG en_US.UTF-8 You have a text file, emoji.txt, which is coded in UTF-8: $ file -i ./emoji.txt ./emoji.txt: text/plain; charset=utf-8 This text file contains…
Culip
  • 559
  • 8
  • 24
2
votes
0 answers

Python 3 converting escaped unicode characters in strings into the characters themselves

I've got some data stored as strings that contains both unicode characters (e.g., ñ) and unicode escape sequences (e.g., \u00F1). I would like to do a string-to-string transformation that converts the escape sequences into the corresponding unicode…
Jolyon
  • 165
  • 1
  • 7