My team is tasked with implementing Unicode in our software, which is well over a million lines of code. We support an MFC Client and a Server on Windows, AIX or Solaris with an Oracle or SQL Server database. ICU looks like a very helpful tool. What are the pros and cons of using ICU? Does ICU work as advertised without major bugs?
Yes, ICU works; I've relied on it for years. "Pros and cons" is an opinion question, which is generally considered out of bounds on StackOverflow. – keshlam Feb 19 '14 at 06:34
2 Answers
A data point: Our (yes, that's a disclaimer) list of users and bugs is all on our project site.
IMBO (biased): Pros:
- works as advertised, comprehensive.
- Mature: 10+ years now, with a good stability policy and very active development.
- Uses latest Unicode+CLDR+BCP47+other standards.
- Compiles basically everywhere. Available as C/C++ and Java libraries, and called by (or used to implement) Python, Perl, PHP, …
- Open source, with an increasing diversity of contributors.
- Comes with all needed data for the above (see below, under cons), yet customizable. (can add custom data)
Cons:
- Needs better documentation (we try; anyone want to help?).
- Lots of APIs ("it's too big" #1): it's hard to know which one to use, even if it does what you want.
- Used by many kinds of programs, from embedded devices and smartphones through major desktop apps to databases, operating systems, and enterprise apps, so there may be multiple ways to do something.
- Comes with all needed data for the above ("it's too big" #2; see above, under pros), yet customizable (can be trimmed down to size).

ICU is terrible: avoid if at all possible.
Despite its age, basic things in it are broken, for example in this question: Fixing regex to work around ICU/RegexKitLite bug
Time handling is broken as times are underspecified: you can't distinguish a DST from a non-DST time in a reliable way in many APIs.
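To illustrate the general ambiguity behind this complaint (not the specific ICU APIs, which aren't named here): when clocks fall back, one wall-clock time occurs twice, so a bare local time does not identify an instant. Below is a minimal sketch using Python's standard `zoneinfo` module rather than ICU. The helper name `utc_offsets_for_wall_time` is made up for this example, and it assumes the system has the IANA time zone database available.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # Python 3.9+; needs system tz data (assumption)

NEW_YORK = ZoneInfo("America/New_York")


def utc_offsets_for_wall_time(naive: datetime, tz: ZoneInfo):
    """Return the UTC offsets of the first and second reading of a wall time.

    Hypothetical helper for illustration. `fold=0` selects the earlier
    occurrence of a repeated wall time and `fold=1` the later one.
    For an unambiguous wall time both offsets are the same.
    """
    first = naive.replace(tzinfo=tz, fold=0)
    second = naive.replace(tzinfo=tz, fold=1)
    return first.utcoffset(), second.utcoffset()


# 01:30 on 7 November 2021 happened twice in New York (clocks fell back at 02:00).
ambiguous = datetime(2021, 11, 7, 1, 30)
early, late = utc_offsets_for_wall_time(ambiguous, NEW_YORK)
print("first occurrence offset: ", early)
print("second occurrence offset:", late)
print("ambiguous:", early != late)
```

Any API that accepts only year/month/day/hour/minute has to pick one of the two readings silently or offer an extra flag, which is the kind of underspecification being complained about.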
It's freaking huge.
The documentation needs a lot of work. Less-used features are often unusable because there's no way to figure out the right way to use them. I spent days trying to get transliteration to work as explained and eventually gave up.
It likes to work in UTF-16, the worst of all possible worlds.
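For anyone unsure why UTF-16 draws this kind of complaint: it is a variable-width encoding, so characters outside the Basic Multilingual Plane take two 16-bit code units (a surrogate pair), and counting or indexing by code unit no longer matches counting by character. A small sketch in plain Python (standard library only, not ICU's API; `utf16_code_units` is a name invented for this example):

```python
def utf16_code_units(text: str) -> int:
    """Count the 16-bit code units needed to store `text` as UTF-16.

    Hypothetical helper for illustration: encode without a BOM and
    divide the byte length by two.
    """
    return len(text.encode("utf-16-le")) // 2


sample = "a\U0001F600"  # LATIN SMALL LETTER A + U+1F600 (outside the BMP)

print("code points:      ", len(sample))
print("UTF-16 code units:", utf16_code_units(sample))
print("UTF-8 bytes:      ", len(sample.encode("utf-8")))
```

Code that treats one UTF-16 unit as one character works until the first emoji or rare CJK character shows up, which is easy to miss in testing.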
Support is unresponsive to problems.
In my experience, it's not until you're most of the way through a project that you begin to discover the insidious flaws that will take 90% of your time.
For many people, there is no alternative so you're stuck with it.
The 'basic thing' is not a bug. Follow the link and the bug. Did you file a bug or send a support request about transliteration? ICU is huge because of the feature and *data* set, but can be made smaller in well-documented ways. Did you file a bug on DST/non-DST? I'm not sure I understand what's being requested. Reading SO isn't part of our official support, normally. – Steven R. Loomis Mar 08 '11 at 20:39