Questions tagged [icu]

The International Components for Unicode (ICU) is a set of cross-platform, open source C/C++ and Java libraries for Unicode and globalization support. It is developed by the Unicode Consortium. Use this tag when you have a question about using ICU. Be sure to include the appropriate language tag and the version of ICU that you are using.

The International Components for Unicode supports C/C++ (ICU4C) and Java (ICU4J), and is available under an open source license. See the ICU-TC home page for documentation and licensing details.

771 questions
0
votes
2 answers

Character length in bytes

Given the first byte (of a multi-byte character) and the charset's canonical name, how can I determine the byte length of the character? Ideally using the ICU library.
Michal
  • 2,078
  • 19
  • 29
0
votes
1 answer

ICU Big Endian Strings

I want to convert a UnicodeString using ICU to a UTF-16 string, encoded in big-endian, for sending through a socket. The big-endian thing is messing me up. I can't seem to find any resources relating to the topic with ICU and C++. Using C#, I could…
Dylan
  • 572
  • 5
  • 10
0
votes
1 answer

ICU characters don't match when using UNICODE_STRING_SIMPLE

I need to understand the behavior of the UNICODE_STRING_SIMPLE macro when using '\uhhhh'. I have the following code: cout<<"Char print out for À"<…
Dula
  • 1,404
  • 1
  • 14
  • 29
0
votes
1 answer

Building icu4c fails due to missing findBasename

I am trying to cross-compile icu4c 51.1 with Emscripten (clang) via emconfigure source/configure --host=i386-pc-linux-gnu --prefix=${myprefix} --enable-static --disable-shared --disable-dyload --disable-renaming CXXFLAGS='-DU_USING_ICU_NAMESPACE=0'…
abergmeier
  • 13,224
  • 13
  • 64
  • 120
0
votes
1 answer

Localizing apps for non-standard locales

I am writing an app that should support a language which is not in the list of available (40?) languages on the device, but the region format is. I want to localize the names of days as well as "yesterday...". The problem is that days (Monday,...)…
Chuckels5584
  • 551
  • 2
  • 6
  • 17
0
votes
1 answer

symbol lookup error: undefined symbol: UCNV_FROM_U_CALLBACK_ESCAPE_44

I am getting this error when I try to run my project with the libwebkit shared library. The whole error is: symbol lookup error: ./lib/libwebkit-1.0.so.2: undefined symbol: UCNV_FROM_U_CALLBACK_ESCAPE_44. I just want to look for what might cause…
user1655072
  • 572
  • 2
  • 10
  • 20
0
votes
1 answer

C++: various encodings getters like in Python

My question is about encodings under C++. When I've worked with Python, I've sometimes had to use such encodings for various…
ghostmansd
  • 3,285
  • 5
  • 30
  • 44
0
votes
3 answers

Changing the pattern for CHF currency in Java using icu package?

I wrote a Java function that shows the locale pattern for each currency. See the function below. What I would like to know is why, when the currency is CHF, the 2nd decimal is hardcoded to 5. Note that I am using the icu package and this issue…
Wael
  • 1,533
  • 4
  • 20
  • 35
0
votes
0 answers

Changing the pattern for CHF currency in Java? (using ICU)

I wrote a Java function that shows the locale pattern for each currency. See the function below. What I would like to know is why, when the currency is CHF, the 2nd decimal is hardcoded to 5. Here is the output of the function, which is…
Wael
  • 1,533
  • 4
  • 20
  • 35
0
votes
1 answer

How to point configure down the right path to icu

I'm a newcomer to Macs and the terminal in particular, so this may be a stupid question. I'm trying to run ./configure in mysrc/myfreeling so that I can install freeling, but I get this error: checking for main in -licuuc... no Unicode ICU library not…
Dr. Agon
  • 87
  • 9
0
votes
0 answers

Is the ICU library corrupted?

I am getting the following output from nm -u for an ICU library. Does it mean this library is corrupted? While using this library, some of our code takes 20 minutes to complete execution. I suspect this is because the library is corrupted…
Abinash Bishoyi
  • 187
  • 1
  • 2
  • 13
0
votes
2 answers

Word Segmentation using ICU

I am using ICU4C to transliterate CJK. I am wondering whether it is possible to have word segmentation in ICU, to split Chinese text into a sequence of words, defined according to some word segmentation standard. When I try transliterating for…
mrz
  • 1,802
  • 2
  • 21
  • 32
0
votes
1 answer

Transliterate CJK to Latin -- preferably in C++

I am trying to write a program that can transliterate CJK to Latin (e.g. Pinyin, Romaji, etc.). For example, you give a Chinese, Japanese or Korean document as input and then you get the transliterated Latin version as output. I am new in this…
mrz
  • 1,802
  • 2
  • 21
  • 32
0
votes
1 answer

Convert a string using icu4c?

How can I convert a string from one encoding to another using icu4c? I've searched for hours and have not found anything apart from looking at the source of uconv (a utility bundled with icu4c); it calls ucnv_open (which takes encodings as…
megazord
  • 3,210
  • 3
  • 23
  • 31
0
votes
1 answer

Why isn't transliteration from Urdu to Latin working properly in icu4c?

I'm trying to convert this line of Urdu to Latin, and then convert the Latin characters to ASCII (by removing the accents, etc.), but it seems to be missing some characters. Why are there non-Latin characters ("ہ", "ے", etc.) still there? $ uconv…
megazord
  • 3,210
  • 3
  • 23
  • 31