Questions tagged [utf-16]

UTF-16 is a character encoding that represents each Unicode code point using either 2 or 4 bytes.

UTF-16 is a character encoding that represents each Unicode code point as one or two 16-bit code units, that is, as a sequence of either two or four bytes. It is therefore a variable-width character encoding.

The algorithm for encoding code points as UTF-16 is described in RFC 2781.
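As a rough illustration of that algorithm, the following Python sketch maps a single code point to its UTF-16 code units. The function name `utf16_code_units` is made up for this example and is not part of any library; in practice `str.encode` does this work.

```python
def utf16_code_units(code_point: int) -> list[int]:
    """Return the 16-bit UTF-16 code units for one Unicode code point.

    Illustrative helper only (hypothetical name, not a library function).
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point is outside the Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded")

    # Code points up to U+FFFF fit in a single 16-bit code unit.
    if code_point < 0x10000:
        return [code_point]

    # Everything else is split into a 20-bit offset: the top 10 bits go
    # into the high surrogate, the bottom 10 bits into the low surrogate.
    offset = code_point - 0x10000
    high_surrogate = 0xD800 | (offset >> 10)
    low_surrogate = 0xDC00 | (offset & 0x3FF)
    return [high_surrogate, low_surrogate]


print([hex(unit) for unit in utf16_code_units(0x41)])     # ['0x41']
print([hex(unit) for unit in utf16_code_units(0x1F600)])  # ['0xd83d', '0xde00']
```

U+0041 stays a single code unit (two bytes), while U+1F600 lies above U+FFFF and becomes a surrogate pair (four bytes).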

There are three flavors of UTF-16: little-endian (UTF-16LE), big-endian (UTF-16BE), and UTF-16 with a BOM (byte order mark) that signals the byte order (see endianness).
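The difference between the flavors can be seen with Python's built-in codecs. This is a small sketch rather than a reference. It assumes CPython's behavior, where the plain `utf-16` codec writes a BOM followed by the platform's native byte order.

```python
import codecs

text = "A\U0001F600"  # one BMP character plus one character outside the BMP

big_endian = text.encode("utf-16-be")     # no BOM, most significant byte first
little_endian = text.encode("utf-16-le")  # no BOM, least significant byte first
with_bom = text.encode("utf-16")          # BOM first, then native byte order

print(big_endian.hex())     # 0041d83dde00
print(little_endian.hex())  # 41003dd800de

# The BOM is the code point U+FEFF written in the chosen byte order.
assert with_bom[:2] in (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)

# The plain "utf-16" decoder reads the BOM, picks the byte order, and drops the BOM.
assert (codecs.BOM_UTF16_BE + big_endian).decode("utf-16") == text
assert (codecs.BOM_UTF16_LE + little_endian).decode("utf-16") == text
```

When the byte order is known in advance, the explicit `utf-16-le` and `utf-16-be` codecs avoid emitting a BOM at all.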

Related tags

The unicode character set it serializes
Other UTFs: utf-8, utf-32; rarely used: utf-7, utf-1, utf-18, utf-36

1193 questions

votes

3 answers

How to convert Rust strings to UTF-16?

Editor's note: This code example is from a version of Rust prior to 1.0 and is not valid Rust 1.0 code, but the answers still contain valuable information. I want to pass a string literal to a Windows API. Many Windows functions use UTF-16 as the…

string rust utf-16

asked Aug 08 '14 at 06:36

Gigih Aji Ibrahim

votes

2 answers

java string.getBytes("UTF-8") javascript equivalent

I have this string in java: "test.message" byte[] bytes = plaintext.getBytes("UTF-8"); //result: [116, 101, 115, 116, 46, 109, 101, 115, 115, 97, 103, 101] If I do the same thing in javascript: stringToByteArray: function (str) { …

java javascript utf-8 byte utf-16

asked Apr 04 '14 at 11:40

user429620

votes

5 answers

Why doesn't Git natively support UTF-16?

Git supports several different encoding schemes, UTF-7, UTF-8, and UTF-32, as well as non-UTF ones. Given this, why doesn't it support UTF-16? There's a lot of questions that ask how to get Git to support UTF-16, but I don't think that this has been…

git utf-16

asked Sep 24 '18 at 03:47

Zac Faragher

votes

1 answer

Why can I not read a UTF-16 file longer than 4094 characters?

Some information: I've only tried this on Linux. I've tried both with GCC (7.2.0) and Clang (3.8.1). It requires C++11 or higher, to my understanding. What happens when I run it: I get the expected string "abcd" repeated until it hits the position of…

c++ linux utf-16 wstring wifstream

asked Aug 24 '17 at 20:59

Joakim L. Christiansen

votes

7 answers

findstr or grep that autodetects character encoding (UTF-16)

I want to do this: findstr /s /c:some-symbol * or the grep equivalent grep -R some-symbol * but I need the utility to autodetect files encoded in UTF-16 (and friends) and search them appropriately. My files even have the byte-ordering mark…

unicode windows-xp windows-vista utf-16 findstr

asked Jan 02 '09 at 21:28

David Martin

votes

2 answers

Should I change from UTF-8 to UTF-16 to accommodate Chinese characters in my HTML?

I am using ASP.NET MVC, MS SQL and IIS. I have a few users that have used Chinese characters in their profile info. However, when I display this information it shows up as æŽå¼·è¯ but they are correct in my database. …

html utf-8 utf-16

asked Oct 05 '10 at 14:50

Aaron Salazar

votes

4 answers

Using unicode characters bigger than 2 bytes with .Net

I'm using this code to generate U+10FFFC var s = Encoding.UTF8.GetString(new byte[] {0xF4,0x8F,0xBF,0xBC}); I know it's for private-use and such, but it does display a single character as I'd expect when displaying it. The problems come when…

c# .net unicode char utf-16

asked May 29 '13 at 14:24

Earlz

votes

3 answers

Pandas read_csv and UTF-16

I have a CSV text file encoded in UTF-16 (so as to preserve Unicode characters when others use Excel) but when doing a read_csv with Pandas 0.9.0, I get this cryptic error: df =…

csv python-2.7 pandas utf-16

asked Dec 03 '12 at 19:18

Brian Keegan

votes

1 answer

Using iconv to convert from UTF-16BE to UTF-8 without BOM

I'm trying to convert a UTF-16BE encoded file (byte order mark: 0xFE 0xFF) to UTF-8 using iconv like so: iconv -f UTF-16BE -t UTF-8 myfile.txt The resulting output, however, has the UTF-8 byte order mark (0xEF 0xBB 0xBF) and that is not what I…

text utf-8 utf-16 iconv

asked Jul 20 '12 at 01:31

Edward Samson

votes

2 answers

R write.csv with UTF-16 encoding

I'm having trouble outputting a data.frame using write.csv using UTF-16 character encoding. Background: I am trying to write out a CSV file from a data.frame for use in Excel. Excel Mac 2011 seems to dislike UTF-8 (if I specify UTF-8 during text…

r unicode csv character-encoding utf-16

asked Mar 10 '11 at 23:15

Daniel Dickison

votes

3 answers

What is the difference between "UTF-16" and "std::wstring"?

Is there any difference between these two string storage formats?

c++ unicode stl utf-16

asked Nov 22 '10 at 15:46

hkBattousai

votes

3 answers

Why were the code points in the range of U+D800 to U+DFFF removed from the Unicode character set?

I am learning about UTF-16 encoding, and I have read that if you want to represent code points in the range of U+10000 to U+10FFFF, then you have to use surrogate pairs, which are in the range of U+D800 to U+DFFF. So let's say I want to encode the…

unicode encoding character-encoding utf-16

asked Oct 21 '16 at 20:22

paul

votes

3 answers

In UTF-16, UTF-16BE, UTF-16LE, is the endian of UTF-16 the computer's endianness?

UTF-16 is a two-byte character encoding. Exchanging the two bytes' addresses will produce UTF-16BE and UTF-16LE. But I find the name UTF-16 encoding exists in the Ubuntu gedit text editor, as well as UTF-16BE and UTF-16LE. With a C test program I…

c unicode endianness utf-16

asked Apr 11 '16 at 13:24

hao.zhou

votes

3 answers

dos2unix: Binary symbol 0x04 found at line 1703

I download a file from the OECD http://stats.oecd.org/Index.aspx?datasetcode=CRS1 ('CRS 2013 data.txt') by selecting Export-> Related files. I want to work with this file in Ubuntu (14.04 LTS). When I run: dos2unix CRS\ 2013\ data.txt I…

utf-16 byte-order-mark dos2unix

asked Apr 28 '15 at 15:11

dw8547

votes

1 answer

JSON.stringify() to UTF-8

Javascript uses as far as I know UTF-16 fundamentally as a standard for strings. With JSON.stringify() I can create a JSON string from an object. Is that JSON string UTF-16 encoded? Can I convert (hopefully fast) that string to UTF-8 to save…

javascript json utf-8 utf-16

asked Dec 02 '14 at 15:34

Sebastian Barth
