Questions tagged [utf-16]

UTF-16 is a character encoding that represents each Unicode code point using either 2 or 4 bytes.

UTF-16 is a character encoding that represents Unicode code points as sequences of one or two 16-bit code units, that is, two or four bytes. It is therefore a variable-width character encoding.

The algorithm for encoding code points as UTF-16 is described in RFC 2781.
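
As a rough illustration of that algorithm, here is a minimal Python sketch that turns one code point into its UTF-16 code units. The function name `utf16_code_units` is made up for this example, and rejecting surrogate code points is a choice made here rather than something quoted from the RFC text.

```python
from typing import List


def utf16_code_units(code_point: int) -> List[int]:
    """Return the 16-bit code units that represent one code point (hypothetical helper)."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point is outside the Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        # Assumption of this sketch: surrogate values are reserved, so refuse them.
        raise ValueError("surrogate code points cannot be encoded")
    if code_point < 0x10000:
        # Basic Multilingual Plane: a single 16-bit code unit (2 bytes).
        return [code_point]
    # Supplementary planes: split the 20-bit offset into a surrogate pair (4 bytes).
    offset = code_point - 0x10000
    high = 0xD800 | (offset >> 10)    # top 10 bits
    low = 0xDC00 | (offset & 0x3FF)   # bottom 10 bits
    return [high, low]
```

For instance, `utf16_code_units(0x24)` gives a single unit, while `utf16_code_units(0x1F600)` gives a surrogate pair.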

There are three flavors of UTF-16: little-endian, big-endian, and with a BOM (byte order mark) that signals the byte order (see endianness).
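
The three flavors can be seen with Python's built-in codecs. This is only a sketch: `sniff_utf16_bom` is a hypothetical helper written for this example, and it deliberately ignores the fact that a UTF-32 little-endian BOM starts with the same two bytes.

```python
import codecs

text = "a"

little = text.encode("utf-16-le")   # b'a\x00', no BOM
big = text.encode("utf-16-be")      # b'\x00a', no BOM
with_bom = text.encode("utf-16")    # BOM first, then the platform's native byte order


def sniff_utf16_bom(data: bytes) -> str:
    """Guess the byte order of UTF-16 data from its BOM (hypothetical helper)."""
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return "unknown"  # no BOM: the byte order has to be known some other way
```

Decoding with the plain `"utf-16"` codec consumes the BOM and picks the byte order from it, whereas `"utf-16-le"` and `"utf-16-be"` expect the caller to already know the order.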

Related tags

The unicode character set it serializes
Other UTFs: utf-8, utf-32; rarely used: utf-7, utf-1, utf-18, utf-36

1193 questions

4 answers

How can I convert UTF-8 to UTF-16 in Excel VBA?

As far as I know, Excel uses UTF-16 to represent string literals. I read from a console (Mac) / file (Windows), and in both cases the character encoding is messed up. I have to find a solution which works on both platforms, so ADO stream is not an…

excel vba utf-8 utf-16

asked Oct 28 '20 at 20:33

Attila

3 answers

Is it possible to set a text file to UTF-16?

My code for writing text works for ANSI characters, but when I try to write Japanese characters they do not appear. Do I need to use UTF-16 encoding? If so, how would I do it in code? std::wstring filename; std::wstring text; filename =…

c++ windows unicode text-files utf-16

asked Sep 21 '20 at 19:54

Guilherme Galdino

0 answers

How to make char16_t acceptable as a template parameter to basic_ifstream?

I am using C++17 on macOS and char16_t is not acceptable as a template parameter, as follows: basic_ifstream<char16_t> file("c:\\file.txt", ios_base::ate); streamsize size = file.tellg(); file.seekg(0, ios_base::beg); u16string str(size/2,…

c++ file utf-16

asked Aug 09 '20 at 00:02

Lion King

2 answers

Boost libraries for UTF-16 strings?

Are there any boost libraries to help with UTF-16 (or higher) strings?

c++ boost utf-16 utf

asked Jun 05 '11 at 10:40

Paul Manta

1 answer

Is UTF-16 a superset of ASCII? If yes, why is UTF-16 incompatible with ASCII according to the HTML Standard?

According to the Wikipedia article on UTF-16, "...[UTF-16] is also the only web-encoding incompatible with ASCII." (at the end of the abstract.) This statement refers to the HTML Standard. Is this a wrong statement? I'm mainly a C# / .NET dev, and…

c# html .net ascii utf-16

asked May 17 '20 at 07:05

feO2x

3 answers

How can I get the hex value of an input string using C++?

I just started working with C++, and after a few weeks I figured out that C++ doesn't provide a method or library to convert a string to its hexadecimal value. Currently, I'm working on a method that will return the hexadecimal value of an input string encoded in…

c++ utf-16

asked Nov 08 '19 at 09:38

Nguyễn Đức Tâm

1 answer

Why are Unicode code points always written with at least 2 bytes?

Why are Unicode code points always written with 2 bytes (4 digits) even when that's not necessary? From the Wikipedia page about UTF-8: $ -> U+0024 ¢ -> U+00A2

unicode encoding utf-8 utf-16

asked Aug 27 '18 at 14:59

Radioreve

3 answers

Split UTF-16 String into single chars/strings

I have a string that looks like this abc and I want to split it into single chars/strings. static List split(String text ) { List list = new ArrayList<>(text.length()); for(int i = 0; i < text.length() ; i++) { …

java utf-16

asked Jul 05 '18 at 08:46

MAGx2

2 answers

Reading a UTF-16 file in C++

I'm trying to read a file which has UTF-16LE encoding with a BOM. I tried this code: #include #include #include #include int main() { std::wifstream fin("/home/asutp/test"); …

c++ utf-16

asked Jun 05 '18 at 09:37

Kot Shrodingera

2 answers

How can I use Mac OS X (and UNIX) command line tools like grep with UTF-16 files?

I have a bunch of text files I want to use with grep. They are all from an external source and are UTF-16 encoded and begin with a byte order mark. Unix tools like grep don't work on them for me. What work-around is there for this?

macos unix unicode grep utf-16

asked Jan 29 '11 at 09:38

Steve McLeod

1 answer

Why does Windows use ANSI Code page instead of UNICODE?

When I run the command chcp in a cmd.exe window, it represents the code page used in Windows. I think Windows uses the UNICODE character set. So, my questions are: Why does Windows use ANSI codepages instead of Unicode? Windows uses UTF-16 or…

windows unicode encoding utf-16 ucs2

asked Oct 11 '17 at 00:20

JaeHyeok Kim

3 answers

Length of a string in Python 3.5 with different encodings

I tried this in Python to get the length of a string in bytes. >>> s = 'a' >>> s.encode('utf-8') b'a' >>> s.encode('utf-16') b'\xff\xfea\x00' >>> s.encode('utf-32') b'\xff\xfe\x00\x00a\x00\x00\x00' >>> len(s.encode('utf-8')) 1 >>>…

python unicode utf-8 utf-16 byte-order-mark

asked Aug 09 '17 at 00:53

Z-Jiang

0 answers

"UnicodeError: UTF-16 stream does not start with BOM" when opening file that apparently has a BOM

I have a project in which most of the files are UTF-16 but one is UTF-8. Having put the correct encoding ("utf_8" or "utf_16") into strOpenEncoding, I tried this: for strInput in open(strInputFileName, "r", newline="\n",…

python encoding utf-16 byte-order-mark

asked Mar 27 '17 at 15:16

Stephen

2 answers

What is a safe length of JavaScript strings?

Considering charAt(), charCodeAt(), and codePointAt(), I find a discrepancy in what the parameter means. Before I really thought about it, I thought you would always be safe to access the character at length-1. But I read the difference between…

javascript arrays utf-8 utf-16

asked Mar 10 '17 at 02:02

Clive

2 answers

UCS2 vs UTF. What languages cannot be displayed in the UCS2 encoding?

UCS2 is easier to use in Visual C++ than a UTF encoding. What languages can I not support in the UCS2 encoding?

visual-c++ unicode utf-16 ucs2

asked Nov 24 '10 at 13:32

KindDragon
