Questions tagged [utf-16]

UTF-16 is a character encoding that represents Unicode code points using either 2 or 4 bytes per character.

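UTF-16 encodes each Unicode code point as one or two 16-bit code units, that is, as a byte sequence of either two or four bytes. Code points up to U+FFFF take a single code unit; code points above U+FFFF take a pair of code units called a surrogate pair. It is therefore a variable-width character encoding.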

The algorithm for encoding code points as UTF-16 is described in RFC 2781.
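As a rough illustration of that algorithm, here is a minimal Python sketch that turns one code point into its UTF-16 code units. The function name `encode_utf16_code_units` is made up for this example and is not part of any library; the sketch only produces 16-bit code units and leaves byte order aside.

```python
from typing import List


def encode_utf16_code_units(code_point: int) -> List[int]:
    """Return the UTF-16 code units (16-bit integers) for one code point.

    Hypothetical helper written for illustration only.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point out of Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        # Surrogate values are reserved and cannot be encoded on their own.
        raise ValueError("surrogate code points cannot be encoded")
    if code_point < 0x10000:
        # Basic Multilingual Plane: a single code unit equal to the code point.
        return [code_point]
    # Supplementary planes: subtract 0x10000 and split the remaining 20 bits.
    offset = code_point - 0x10000
    high = 0xD800 | (offset >> 10)    # top 10 bits
    low = 0xDC00 | (offset & 0x3FF)   # bottom 10 bits
    return [high, low]


units = encode_utf16_code_units(0x1F600)
print([hex(u) for u in units])
```

For a quick sanity check, the result can be compared with Python's built-in codec, for example `"\U0001F600".encode("utf-16-be")`.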

There are three flavors of UTF-16: little-endian (UTF-16LE), big-endian (UTF-16BE), and UTF-16 with a byte order mark (BOM).
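The difference between the flavors is easiest to see in the bytes. The following sketch uses Python's standard codecs `utf-16-le`, `utf-16-be`, and `utf-16`; the sample string is an arbitrary choice for illustration. Note that Python's plain `utf-16` codec writes a BOM followed by the platform's native byte order, so its output can differ between machines.

```python
import codecs

text = "A\U0001F600"  # one BMP character plus one supplementary character

little = text.encode("utf-16-le")   # no BOM, least significant byte first
big = text.encode("utf-16-be")      # no BOM, most significant byte first
with_bom = text.encode("utf-16")    # BOM first, then native byte order

print(little.hex(" "))
print(big.hex(" "))
print(with_bom.hex(" "))

# The BOM tells a decoder which byte order follows.
if with_bom.startswith(codecs.BOM_UTF16_LE):
    print("BOM indicates little-endian")
elif with_bom.startswith(codecs.BOM_UTF16_BE):
    print("BOM indicates big-endian")

# Decoding with "utf-16" reads the BOM and strips it from the result.
assert with_bom.decode("utf-16") == text
```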

1193 questions
4
votes
1 answer

Is the Unicode code point value equal to the UTF-16BE representation for every character?

I saved some strings in Microsoft Agenda in Unicode big endian format (UTF-16BE). When I open it with the shell command xxd to see the binary value, write it down, and get the value of the Unicode code point by ord() to get the ordinal value…
showkey
  • 482
  • 42
  • 140
  • 295
4
votes
4 answers

Python UTF-16 WAVY DASH encoding question / issue

I was doing some work today, and came across an issue where something "looked funny". I had been interpreting some string data as utf-8, and checking the encoded form. The data was coming from ldap (Specifically, Active Directory) via python-ldap.…
NoName
  • 125
  • 5
4
votes
0 answers

C++ Back and forth conversion between UTF8 and UTF16 using UTF8-CPP (Non codecvt code!)

I'm trying to make GWork (a fork of GWEN GUI) compile with GCC and I need to be able to cross-convert UTF-8 and UTF-16 strings. I've found the UTF8-CPP library and so far it looks perfect. Looking at the UTF8-CPP examples I notice that it…
SLC
  • 2,167
  • 2
  • 28
  • 46
4
votes
1 answer

How to use ICU with UTF-16?

I'm looking into using ICU for Unicode string processing in a native Node.js module because it seems to me that v8::String (according to these docs) doesn't have a C++ API for this purpose. To my knowledge V8 expects UTF-16 in ExternalStringResource…
Venemo
  • 18,515
  • 13
  • 84
  • 125
4
votes
2 answers

java decoding base64 String

I realise this is probably more of a general Java question, but since it's running in a Notes/Domino environment, I thought I'd check that community first. Summary: I don't seem to be able to decode the string: dABlAHMAdAA= using…
nick wall
  • 161
  • 2
  • 13
4
votes
1 answer

How to verify whether an instance of CharSequence is a sequence of Unicode scalar values?

I have an instance of java.lang.CharSequence. I need to determine whether this instance is a sequence of Unicode scalar values (that is, whether the instance is in UTF-16 encoding form). Despite the assurances of java.lang.String, a Java string is…
Nathan Ryan
  • 12,893
  • 4
  • 26
  • 37
4
votes
1 answer

Duplicate Windows Cryptographic Service Provider results in Python w/ Pycrypto

Edits and Updates 3/24/2013: My output hash from Python is now matching the hash from C++ after converting to UTF-16 and stopping before hitting any 'e' or 'm' bytes. However the decrypted results do not match. I know that my SHA1 hash is 20 bytes…
patmo141
  • 321
  • 1
  • 3
  • 12
4
votes
5 answers

UTF-16 to ASCII conversion in Java

Having ignored it all this time, I am currently forcing myself to learn more about unicode in Java. There is an exercise I need to do about converting a UTF-16 string to 8-bit ASCII. Can someone please enlighten me how to do this in Java? I…
His
  • 5,891
  • 15
  • 61
  • 82
4
votes
2 answers

UTF-16 perl input output

I am writing a script that takes a UTF-16 encoded text file as input and outputs a UTF-16 encoded text file. use open "encoding(UTF-16)"; open INPUT, "< input.txt" or die "cannot open > input.txt: $!\n"; open(OUTPUT,">…
allenylzhou
  • 1,431
  • 4
  • 19
  • 36
4
votes
1 answer

Use UTF-8 charset in Google Apps Script

I use the following in a script: var JSONResult = Maps.newGeocoder().geocode(member.address); var AddressFormatted= JSONResult.results[0].formatted_address; and the result sometimes looks like Rue de Cognel��e I would like to force to have the…
Vincent
  • 137
  • 3
  • 7
4
votes
1 answer

Converting MySQL database to UTF16

I am trying to create this table in a MySQL database CREATE TABLE IF NOT EXISTS `Scania` ( `GensetType` text CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL, `EngineType` text CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL, …
Yoosuf
  • 882
  • 7
  • 33
  • 52
3
votes
2 answers

How to use Boost Spirit to parse Chinese(unicode utf-16)?

My program does not recognize Chinese. How can I use Spirit to recognize Chinese? I use wstring and have converted it to UTF-16. Here is my header file: #pragma once #define BOOST_SPIRIT_UNICODE #include #include…
Vapor
  • 43
  • 5
3
votes
2 answers

RSA in C# does not produce same encrypted string for specific keys?

I have a requirement where I need to encrypt my connection string in one application and decrypt it in another. With this in mind, I save the public and private keys in the App.Config of the respective applications. Now, shouldn't RSA give…
Nagaraj Tantri
  • 5,172
  • 12
  • 54
  • 78
3
votes
1 answer

VBA Output to file using UTF-16

I have a very complex problem that is difficult to explain properly. There is LOTS of discussion about this across the internet, but nothing definitive. Any help, or better explanation than mine, is greatly appreciated. Essentially, I'm just…
Alex McMillan
  • 17,096
  • 12
  • 55
  • 88
3
votes
3 answers

Designing an application for UTF-8 or UTF-16 usage

I am developing an application that will be primarily used by English and Spanish readers. However, in the future I would like to be able to support more extended languages, such as Japanese. While thinking of the design of the program I have hit a…
chadb
  • 1,138
  • 3
  • 13
  • 36