4

My team is tasked with adding Unicode support to our software, which is well over a million lines of code. We support an MFC client and a server on Windows, AIX, or Solaris, with an Oracle or SQL Server database. ICU looks like a very helpful tool. What are the pros and cons of using ICU? Does ICU work as advertised, without major bugs?

Steven R. Loomis
k73
  • Yes, ICU works; I've relied on it for years. "Pros and cons" is an opinion question, which is generally considered out of bounds on StackOverflow. – keshlam Feb 19 '14 at 06:34

2 Answers

6

A data point: our (yes, that's a disclaimer) lists of users and bugs are all on our project site.

IMBO (biased): Pros:

  • Works as advertised; comprehensive.
  • Mature: 10+ years now, with a good stability policy and very active development.
  • Uses the latest Unicode, CLDR, BCP 47, and other standards.
  • Compiles basically everywhere. Available for C, C++, and Java, and called by or used to implement Python, Perl, PHP, …
  • Open source, with an increasing diversity of contributors.
  • Comes with all needed data for the above (see below, under cons), yet customizable (custom data can be added).

Cons:

  • Needs better documentation (we try; anyone want to help?).
  • Lots of APIs ("it's too big" #1): it is hard to know which one to use, even if one of them does what you want.
  • Used by many kinds of programs, from embedded devices and smartphones through major desktop apps to databases, operating systems, and enterprise apps, so there may be multiple ways to do the same thing.
  • Comes with all needed data for the above ("it's too big" #2; see above, under pros), yet customizable (can be trimmed down to size).
Steven R. Loomis
1

ICU is terrible: avoid if at all possible.

  • Despite its age, basic things in it are broken; see, for example, the question "Fixing regex to work around ICU/RegexKitLite bug".

  • Time handling is broken as times are underspecified: you can't distinguish a DST from a non-DST time in a reliable way in many APIs.

  • It's freaking huge.

  • The documentation needs a lot of work. Less-used features are often unusable because there's no way to figure out the right way to use them. I spent days trying to get transliteration to work as explained and eventually gave up.

  • It likes to work in UTF-16, the worst of all possible worlds.

  • Support is unresponsive to problems.

  • In my experience, it's not until you're most of the way through a project that you begin to discover the insidious flaws that will take 90% of your time.
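
The DST point in the list above can be illustrated without ICU. The following is a minimal sketch using only the Python standard library, not ICU's API; the two fixed offsets are hard-coded assumptions standing in for US Eastern daylight and standard time. It shows why a wall-clock reading with no offset or DST flag is underspecified on the night the clocks go back:

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for US Eastern time, hard-coded so the sketch needs no tz database.
EDT = timezone(timedelta(hours=-4), "EDT")  # daylight time
EST = timezone(timedelta(hours=-5), "EST")  # standard time

# On 2014-11-02 US clocks went back from 02:00 EDT to 01:00 EST,
# so the wall-clock reading 01:30 occurred twice that night.
wall_clock = datetime(2014, 11, 2, 1, 30)

first_pass = wall_clock.replace(tzinfo=EDT)   # before the clocks went back
second_pass = wall_clock.replace(tzinfo=EST)  # after the clocks went back

# Identical fields on the clock face, but two instants one hour apart.
gap = second_pass.astimezone(timezone.utc) - first_pass.astimezone(timezone.utc)
print(first_pass.isoformat(), second_pass.isoformat(), gap)
```

Any API that accepts only year, month, day, hour, and minute has to pick one of the two instants by some rule, which is the kind of ambiguity the bullet describes.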

For many people, there is no alternative so you're stuck with it.
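
Whether UTF-16 is "the worst of all possible worlds" is a matter of opinion, but the mechanics behind the complaint are easy to see. This sketch again uses plain Python rather than ICU, and the sample string is an arbitrary choice. It counts the same text three ways: UTF-16 is not byte-compatible with ASCII, yet it is still variable-width, because characters outside the Basic Multilingual Plane take two 16-bit code units:

```python
# Sample text: ASCII 'a', Latin 'é', CJK '中', and an emoji outside the BMP.
text = "a\u00e9\u4e2d\U0001F600"

code_points = len(text)                           # Python counts code points
utf8_bytes = len(text.encode("utf-8"))            # 1 + 2 + 3 + 4 bytes
utf16_units = len(text.encode("utf-16-le")) // 2  # 1 + 1 + 1 + 2 code units

print(code_points, utf8_bytes, utf16_units)
```

Code that indexes a UTF-16 string by code unit can therefore land in the middle of a surrogate pair, so a 16-bit unit cannot be assumed to be a whole character.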

Community
George
  • The 'basic thing' is not a bug. Follow the link and the bug. Did you file a bug or send a support request about transliteration? ICU is huge because of the feature and *data* set, but can be made smaller in well-documented ways. Did you file a bug on DST/non-DST? I'm not sure I understand what's being requested. Reading SO isn't part of our official support, normally. – Steven R. Loomis Mar 08 '11 at 20:39