Questions tagged [utf-16]

UTF-16 is a variable-width character encoding that represents each Unicode code point as one or two 16-bit code units, that is, as a byte sequence of either two or four bytes.

The algorithm for encoding code points as UTF-16 is described in RFC 2781.
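
As a rough illustration of that algorithm, here is a minimal Python sketch of the encoding step for a single code point. The function name `encode_utf16_units` is hypothetical, and rejecting surrogate code points is an assumption of this sketch rather than something taken from the tag description.

```python
from typing import List


def encode_utf16_units(code_point: int) -> List[int]:
    """Return the 16-bit code units for one code point (hypothetical helper)."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside the range UTF-16 can encode")
    if 0xD800 <= code_point <= 0xDFFF:
        # Assumption of this sketch: lone surrogates are treated as an error.
        raise ValueError("surrogate code points are reserved")
    if code_point < 0x10000:
        # Code points below U+10000 fit in a single 16-bit unit (two bytes).
        return [code_point]
    # Otherwise split the 20-bit offset into a high and a low surrogate.
    offset = code_point - 0x10000
    high = 0xD800 | (offset >> 10)    # top 10 bits
    low = 0xDC00 | (offset & 0x3FF)   # bottom 10 bits
    return [high, low]


# One unit for U+5B57, two units (a surrogate pair) for U+1F600.
print([hex(u) for u in encode_utf16_units(0x5B57)])
print([hex(u) for u in encode_utf16_units(0x1F600)])
```

The result can be cross-checked against Python's built-in codec, for example by comparing with `chr(0x1F600).encode("utf-16-be")`.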

There are three flavors of UTF-16: little-endian, big-endian, and with a BOM (byte order mark).
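
The difference between the flavors can be seen with Python's standard codecs. This is only a sketch: the variable names are illustrative, and the BOM is prepended explicitly here so that the result does not depend on the byte order of the machine running it.

```python
import codecs

text = "ba"

# Explicit byte order, no BOM is written.
little = text.encode("utf-16-le")
big = text.encode("utf-16-be")

# With a BOM: the byte order mark tells the reader which order follows.
little_with_bom = codecs.BOM_UTF16_LE + little
big_with_bom = codecs.BOM_UTF16_BE + big

print(little)           # b'b\x00a\x00'
print(big)              # b'\x00b\x00a'
print(little_with_bom)  # b'\xff\xfeb\x00a\x00'

# The generic "utf-16" codec reads the BOM, picks the byte order, and strips it.
assert little_with_bom.decode("utf-16") == text
assert big_with_bom.decode("utf-16") == text
```

When encoding with the generic `"utf-16"` codec, Python writes a BOM followed by the platform's native byte order, which is why output such as `b'\xff\xfeb\x00a\x00'` starts with two extra bytes.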

1193 questions
-1
votes
2 answers

Python: Determining if I have a 16-bit encoded string

I have a UTF-16-BE encoded string: utf16be = '\x0623\x0631\x0646\x0628' print repr(utf16be) > '\x0623\x0631\x0646\x0628' I need to know if it's a 1-byte or 2-byte encoding. I have tried the snippet below: for c in utf16be: c_ord = ord(c) …
zfou
  • 891
  • 1
  • 10
  • 33
-1
votes
1 answer

How to convert the string "email addresses always use @ sign" into UTF-16 code

I can't find a proper method to convert the string "Email addresses always uses @ sign" to UTF-16 code. I also want to know: is it the same as the ASCII code?
-1
votes
1 answer

UTF16/32 Test Case (Need Negative Test Case)

I want/need a test case for testing/breaking conversions between UTF-32 and UTF-16. For UTF-8 and UTF-16, I generally use the 'Chinese Bone' test: 0xE9 0xAA 0xA8 (UTF8) and 0x9AA8 (UTF16). Does anyone have a negative test case that should break a…
jww
  • 97,681
  • 90
  • 411
  • 885
-1
votes
1 answer

How to create UTF-16 animation in Twitter?

I use a UTF-16 character picker to create ASCII art in a textbox in HTML, and UTF-16 characters are supported and visible "as is". Now I need to process such ASCII art and save it into an array as UTF-16 characters, process with JavaScript as strings to…
-1
votes
1 answer

Convert UTF-8 stream to UTF-16 encoding

I get a stream in UTF-8 format, but I would like to get it in UTF-16 format because I get some unsupported international characters in C#. How do I achieve this?
user183781
  • 31
  • 5
-2
votes
1 answer

Issue reading a UTF-16 text file using PySpark

I am trying to read a UTF-16 file using a PySpark dataframe. If there is a space in the file, it shows as a box when displayed using df.display(). How do I read this properly? df = spark.read.option("delimiter","|") \ …
Rathesh
  • 1
  • 3
-2
votes
1 answer

Changing UTF encoding in Visual Studio (C++) to UTF-16

I am making a C++ console program in Visual Studio and I want to output to the console, via std::cout or another method, some characters that exist in the UTF-16 encoding, such as ą, ė, ų, ž, etc. Is it possible to change the UTF-8 encoding into…
user19964717
-2
votes
3 answers

What is this hexadecimal in the utf16 format?

print(bytes('ba', 'utf-16')) Result: b'\xff\xfeb\x00a\x00' I understand that UTF-16 means every character takes 16 bits, i.e. 00000000 00000000 in binary, and I understand there are 16 bits here: x00a means x00 = 00000000 and a = 01000001, so both…
user20144486
-2
votes
1 answer

How is UTF-16 converting string?

b'\x14\xfeh\x00e\x00l\x00l\x00o\x00 \x00w\x00o\x00r\x00l\x00d\x00' I understand that UTF-16 uses 16 bits but what confuses me the most is that 16 bits is two characters, so why do I see a long line of hexadecimal characters? It should be like for…
user20144486
-2
votes
2 answers

How does UTF-16 encoding work?

Today I was learning about Character Encoding and Unicode but there is one thing I'm not sure about. I used this website to change 字 to Unicode 101101101010111 (which from my understanding is a character set) and same symbol to UTF-16 (a Character…
user18618593
-2
votes
1 answer

16-bit encoding that has all bits mapped to some value

UTF-32 has its last bits zeroed. As I understand it UTF-16 doesn't use all its bits either. Is there a 16-bit encoding that has all bit combinations mapped to some value, preferably a subset of UTF, like ASCII for 7-bit?
J Alan
  • 77
  • 1
  • 11
-2
votes
1 answer

Manipulating a single wide char

Quick question: I'm porting my 15k-line framework to UTF-16 :) Do I manipulate single wchar_t's like this? wchar_t Help[128]; Help[0] = '?' Help[1] = '/0' or wchar_t Help[128]; Help[0] = L'?' Help[1] = L'/0'
BingBang32
  • 557
  • 1
  • 4
  • 9
-2
votes
2 answers

String.startswith fails when comparing UTF-16 string to literal

I have a Unicode ("Windows Notepad Unicode" or UTF-16LE) text file from which I read lines like this: FileInputStream is = new FileInputStream(cmdFile); BufferedReader reader = new BufferedReader(new InputStreamReader(is, "UTF-16LE")); …
Janeks Bergs
  • 224
  • 3
  • 13
-2
votes
1 answer

Libreoffice launches CSV window set to UTF-16 on UTF-8 file

Just installed Fedora 24 (Mate Spin). When LibreOffice 5 is directed to open a CSV, the dropdown is set to utf16. How can I set the dropdown to default as UTF-8?
Ed Greenberg
  • 209
  • 3
  • 12
-2
votes
1 answer

Converting UTF-16 (Windows wchar_t) to UTF-8 in C++: non-English letters corrupted (Korean)

I'm trying to make a multiplatform app. On the Windows Store App (WinRT) side, I open a file and read its path in Platform::String format, which is wchar_t, UTF-16 on Windows. Since my core logic is platform independent and only uses standard C++ data…
legokangpalla
  • 495
  • 5
  • 20