Questions tagged [utf-16]

UTF-16 is a character encoding that represents each Unicode code point using either 2 or 4 bytes.

UTF-16 is a character encoding that encodes each Unicode code point as one or two 16-bit code units, that is, as a sequence of either two or four bytes. It is therefore a variable-width character encoding.

The algorithm for encoding code points as UTF-16 is described in RFC 2781.
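As a rough illustration of that encoding step, here is a minimal Python sketch. The helper names `encode_code_point` and `units_to_bytes_be` are hypothetical, chosen only for this example; the result is cross-checked against Python's built-in `utf-16-be` codec rather than taken on faith.

```python
def encode_code_point(cp):
    """Return the UTF-16 code units (as a list of ints) for one code point."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded")
    if cp < 0x10000:
        # Basic Multilingual Plane: a single 16-bit code unit (two bytes).
        return [cp]
    # Supplementary planes: subtract 0x10000, then split the remaining
    # 20 bits into a high and a low surrogate (four bytes in total).
    v = cp - 0x10000
    high = 0xD800 | (v >> 10)
    low = 0xDC00 | (v & 0x3FF)
    return [high, low]


def units_to_bytes_be(units):
    """Serialize code units as big-endian bytes (hypothetical helper)."""
    return b"".join(u.to_bytes(2, "big") for u in units)


if __name__ == "__main__":
    for ch in ("A", "\uD76C", "\U0001F600"):
        units = encode_code_point(ord(ch))
        # Cross-check the sketch against the standard library codec.
        assert units_to_bytes_be(units) == ch.encode("utf-16-be")
        print(f"U+{ord(ch):04X} ->", " ".join(f"{u:04X}" for u in units))
```

Code points below U+10000 take one code unit; everything above takes a surrogate pair, which is where the "two or four bytes" comes from.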

There are three flavors of UTF-16: little-endian (UTF-16LE), big-endian (UTF-16BE), and UTF-16 with a byte order mark (BOM).
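The difference between the three flavors is easiest to see in the raw bytes. The following is a small sketch using only Python's standard codecs; the sample string and variable names are arbitrary choices for the example.

```python
import codecs

text = "Sample"

le = text.encode("utf-16-le")     # little-endian, no BOM
be = text.encode("utf-16-be")     # big-endian, no BOM
with_bom = text.encode("utf-16")  # BOM first, then data in native byte order

print("BE :", be.hex(" "))
print("LE :", le.hex(" "))
print("BOM:", with_bom.hex(" "))

# The generic "utf-16" codec writes a BOM when encoding and uses it
# to pick the byte order when decoding.
assert with_bom[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
assert with_bom.decode("utf-16") == text
assert (codecs.BOM_UTF16_BE + be).decode("utf-16") == text

# Without a BOM the byte order has to be known out of band.
assert be.decode("utf-16-be") == text
assert le.decode("utf-16-le") == text
```

Decoding BOM-less data with the wrong byte order does not necessarily raise an error; it can silently yield different characters, which is why the explicit `-le` / `-be` codec names matter when no BOM is present.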

1193 questions
0
votes
2 answers

mySQL VARCHAR(256) + mySQL INT = how many bytes?

CREATE SCHEMA IF NOT EXISTS `utftest` DEFAULT CHARACTER SET utf16; CREATE TABLE IF NOT EXISTS `metadata_labels` (`metadata_id` INT NOT NULL , `label` VARCHAR(256) NOT NULL , PRIMARY KEY (`metadata_id`, `label`)); however I get the following…
Saqib Ali
  • 3,953
  • 10
  • 55
  • 100
0
votes
4 answers

working with UTF-8 encoded text

I have a problem. I need to find some UTF-8 characters in my text file and output them, but it doesn't output the letters; instead it outputs "?" question marks... ini_set( 'default_charset', 'UTF-8' ); $homepage =…
Hurrem
  • 193
  • 2
  • 4
  • 15
0
votes
1 answer

How to remove broken characters present in ISO encoded XML file

When I tried to convert the xml file with a UTF-16 encoding to ISO-8859-1, I am seeing broken characters like Â. Can you please suggest some solution to remove the broken characters? I want the XML in an ISO encoded format. Here is my code, using…
yamuna
  • 1
  • 2
0
votes
2 answers

Decode Unicode to character in javascript

I have the following Unicode sequence: d76cb9dd0020b370b2c8c758. As an experiment I used non-English text (Korean) as the original for the sequence above: 희망 데니의. How can I decode…
Doni Andri Cahyono
  • 793
  • 5
  • 16
  • 28
0
votes
1 answer

c++ UTF-16 ofstream file creation Windows

Possible Duplicate: How to open an std::fstream (ofstream or ifstream) with a unicode filename? I have a string encoded in UTF-16 and I want to create a file, where the name of the file would be this string. UTF-16LE string looks like: At first…
Tebe
  • 3,176
  • 8
  • 40
  • 60
0
votes
3 answers

How can I identify different encodings without the use of a BOM?

I have a file watcher that is grabbing content from a growing file encoded with utf-16LE. The first bit of data written to it has the BOM available -- I was using this to identify the encoding against UTF-8 (which MOST of my files coming in are…
eyberg
  • 3,160
  • 5
  • 27
  • 43
0
votes
0 answers

boost spirit and char16_t / UTF-16 support?

Is there any support, today or in the near future, for char16_t / UTF-16 in boost spirit? I did try the word count lexer example using char16_t but ran into all sorts of compile errors. Thanks, Henry Roeland
Henry Roeland
  • 492
  • 5
  • 19
0
votes
1 answer

What is the easiest way to search and replace in text files encoded UTF-16?

I'm trying to update a series of xml files by changing names that they reference. I have a table of names that have changed, column for the current name and a column for the name to replace with. I looked for ways to script search and replace and…
StarkRavingSage
0
votes
1 answer

RegEx for Unicode strings to check the string does not contain specific characters

Basically, I need to check that a utf-16 string does not contain these characters /:*?<>|+. Apart from them, it can contain any character from English to Latin. For normal ASCII strings, we would write a RegEx something like ^[^\/:?<>|+]$ How does…
0
votes
2 answers

codeigniter, converting to utf 16 charset, form validation wrong length

Hey guys, I was hoping you could help me out. I am required to make a website coded in PHP + CodeIgniter work with the UTF-16 charset. To convert it, I changed the database.php settings to: $db['default']['char_set'] =…
Ahmed-Anas
  • 5,471
  • 9
  • 50
  • 72
0
votes
1 answer

how to output ucs2_unicode data with php

I am stuck on a project that requires foreign language characters. I need to input these and store them in the database as well as output them from the database to the screen. For example, I have this string Kupon obuhvaća: that shows as Kupon…
Ahmed-Anas
  • 5,471
  • 9
  • 50
  • 72
0
votes
1 answer

Can we change XML encoding from utf-8 to utf -16?

I have written code for generating XML with UTF-8 encoding. I always validate the XML against an XSD file. In the same code I need UTF-16 encoding, because one of my XSD files uses UTF-16 encoding. But my existing code does not accept it; it gives…
Sumit Munot
  • 3,748
  • 1
  • 32
  • 51
0
votes
1 answer

Error parsing XML

I am trying to parse the string contents of an XML file that contains special characters into an XDocument for further processing, but I keep getting the following error: Name cannot begin with the '.' character, hexadecimal value 0x00. Line 1,…
Cranialsurge
  • 245
  • 5
  • 13
0
votes
1 answer

Replacing low ASCII characters in UTF-16-encoded string using PHP's str_replace function

I have some PHP code that I use for text filtering. During filtering, some ASCII characters such as ampersand (&) and tilde (~) are temporarily converted to low ASCII characters (such as decimal code-points 4 and 5). Just before the final filtered…
user594694
  • 327
  • 4
  • 13
0
votes
1 answer

UTF-16 to String in Java

Let's say I have the word "Sample". In UTF-16 BE, this is represented as 00 53 00 61 00 6D 00 70 00 6C 00 65. When I have this, I would like to convert it back to "Sample" using Java. How do I do this?
serverfaces
  • 1,155
  • 4
  • 22
  • 49