Questions tagged [icu]

International Components for Unicode (ICU) is a set of cross-platform, open source C/C++ and Java libraries for Unicode and globalization support. It is developed by the Unicode Consortium. Use this tag when you have a question about using ICU. Be sure to include the appropriate language tag and the version of ICU that you are using.

ICU is available for C/C++ (ICU4C) and Java (ICU4J) under an open source license. See the ICU-TC home page for documentation and licensing details.

771 questions
0
votes
1 answer

Choosing encoding for icu::UnicodeString

I needed a way to convert a string to lower case that was safe to use for both ASCII and UTF-16LE (as found in some Windows registry strings) and came across this question: How to convert std::string to lower case? The answer that…
Matthew
  • 685
  • 8
  • 18
0
votes
2 answers

ICU and basic operations on UTF-8 strings

How can I do basic string operations such as strcat, strlen, and so on, on UTF-8 strings with the ICU library in C? I found lots of functions for UTF-16 but not for UTF-8.
user335870
  • 588
  • 2
  • 11
  • 22
0
votes
0 answers

\u0097 character not displayed in RStudio's console or Knitr outputs

I'm working on Windows 7 with RStudio, with encoding ISO-8859-1 (system default). I have a text file containing long dashes (END OF GUARDED AREA character): — 1920 — né à Paris — France — Here is the result in Knitr outputs: 1920 né à…
Wilcar
  • 2,349
  • 2
  • 21
  • 48
0
votes
0 answers

Android support of dictionary based BreakIterator

I ran the same code on API level 16 and on API level 21, and I found that on API level 16 the dictionary-based iterator (tokenizer) does not seem to work, while on API level 21 the dictionary-based iterator is working…
Gordon Liang
  • 348
  • 2
  • 11
0
votes
1 answer

ICU Pivot Buffer in ucnv_convertEx

The converter function ucnv_convertEx() in the ICU library optionally takes a 'pivot' buffer. The corresponding pointers may be set to NULL, but what is the purpose of that? Is it faster? Is it not enough to not increment the source pointer…
Frank-Rene Schäfer
  • 3,182
  • 27
  • 51
0
votes
1 answer

How to determine whether a codec has a constant number of bytes per character?

Given an IANA codec name (or, one of those used in iconv/ICU), what is the easiest way to determine whether the codec has a fixed width representation for characters or not?
Frank-Rene Schäfer
  • 3,182
  • 27
  • 51
0
votes
1 answer

Defining the bounds for a regex negative lookbehind with ICU

I have patterns like the following header line a = b c = d c = e f = g I've come up with the pattern std::string pat = "((.*)(\n|\r\n)(\\s|\\t)*?(?([a-z]{1,100}))\\s+=)" "((.*)(\n|\r\n)(\\s|\\t)*?(?{1,100})\\s+=)"; Using…
zcourts
  • 4,863
  • 6
  • 49
  • 74
0
votes
1 answer

C2248 error while porting ICU VS2013 solution to VS2015

I am trying to port an ICU 4.0 VS2013 solution to VS2015. While porting, I get the following error: Severity Code Description Project File Line Error C2248 'icu_4_0::Format::Format': cannot access protected member declared in class…
BitMask
  • 314
  • 2
  • 9
0
votes
0 answers

Build static Qt with icu and webkit using MinGW on Windows

I want to build a static Qt with WebKit support. I compiled ICU with the help of this manual, but I can't compile it. I am getting a lot of errors like this: mingw32-make[3]: Entering directory…
0
votes
2 answers

Which case folding types does U_FOLD_CASE_DEFAULT refer to?

I'm implementing case folding in a function for JavaScript. I've generated the needed JavaScript code from the CaseFolding.txt file. The documentation for U_FOLD_CASE_DEFAULT says that ICU will use: default mappings defined in…
Daniel A. White
  • 187,200
  • 47
  • 362
  • 445
0
votes
1 answer

How to extend ICU collation with custom rules in SQLite?

I want to use ICU collations in SQLite for sorting the results of SQL queries. I have customized collation rules which I want to load on top of the default rules for a specific locale. So I first want to load the collation for a locale and then I…
J. R.
  • 43
  • 4
0
votes
0 answers

ICU extract function produces non printable characters

I'm receiving a Unicode string from a vendor. I believe they are using UTF-16. I need to convert it to UTF-8 so I can manipulate it in my code. I'm using ICU to handle the Unicode. When trying the toUTF8String method, I'm getting characters that…
0
votes
1 answer

Android sqlcipher case insensitive LIKE for unicode not working

I have the sqlcipher library in my Android application. As I understood, it includes ICU localization. For example, I have a table Users with 1 row: id name 1 Темп. I execute the query select * from Users where name like '%темп%' and the result is empty. So my…
Orest
  • 6,548
  • 10
  • 54
  • 84
0
votes
3 answers

"undefined reference" while building webkitwidgets module into self-compiled Qt5 due to missing ICU libraries?

I have built the Qt 5 libs from source on Ubuntu 12.04 and made a Qt test app. In the .pro file I have written QT += core gui webkitwidgets to use QWebView in my test project. But when I run qmake in Qt Creator, it gives me the error: Unknown module(s)…
User2546
  • 33
  • 1
  • 8
0
votes
1 answer

How to transliterate Tibetan to Latin?

Is it possible to transliterate Tibetan characters to Latin in PHP? It is not supported by ICU, which surprised me, as ICU will transliterate very obscure languages like Dhivehi, with only around 200,000 speakers.
Kohjah Breese
  • 4,008
  • 6
  • 32
  • 48