0

How can I do basic string operations such as strcat, strlen, and so on, on UTF-8 strings with the ICU library in C?

I found lots of functions for UTF-16 but not for UTF-8.

user335870

2 Answers

0

ICU supplies a set of C macros and functions for working with UTF-8 encoded strings. Start here: http://userguide.icu-project.org/strings/utf-8

Nemanja Trifunovic
  • Yes, I know, I have seen this page, but the point is that I couldn't find strlen and other functions. Where can I find a complete list of UTF-8 based functions? – user335870 Dec 02 '15 at 14:05
  • Do you want the length in bytes or the number of characters? Either way, you first need to decide whether or not to deal with normalization. – uchuugaka Jan 13 '16 at 15:50
0

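You can use the regular strcat and similar functions with UTF-8, because they operate on bytes and never split a NUL-terminated UTF-8 string in the middle. As for length, strlen gives the number of bytes; if you mean to count characters, you can call mbstowcs(NULL, s, 0), which returns the number of wide characters the string would convert to. Note that mbstowcs decodes according to the current locale, so the locale must use UTF-8 (for example, after a suitable setlocale call).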
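To illustrate the difference between byte length and character count without depending on the locale, here is a minimal sketch that counts code points by skipping UTF-8 continuation bytes. It assumes the input is valid UTF-8; the helper names utf8_count_code_points and print_lengths are hypothetical and are not part of ICU or the C standard library. It counts code points, not user-perceived characters, so combining sequences are counted per code point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not an ICU or libc function).
   Counts code points in a NUL-terminated string, assuming valid UTF-8:
   every byte that is NOT a continuation byte (bit pattern 10xxxxxx)
   starts a new code point. */
static size_t utf8_count_code_points(const char *s)
{
    size_t count = 0;

    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

/* Hypothetical helper: prints both measures for comparison.
   strlen reports bytes, the helper above reports code points. */
static void print_lengths(const char *s)
{
    printf("bytes: %zu, code points: %zu\n",
           strlen(s), utf8_count_code_points(s));
}
```

For example, building "naïve café" with strcat from "na\xC3\xAF" "ve" and " caf\xC3\xA9" gives a string for which strlen returns 12 while utf8_count_code_points returns 10, since ï and é each take two bytes.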

John Zwinck
  • 2
    mbstowcs is for converting other encodings, such as ANSI, to wide characters, not for getting the length of a string. It only returns the space required for creating a new UTF-16 string. – user335870 Dec 02 '15 at 14:07
  • 1
    @user335870: Have you tried the code I wrote before saying it doesn't work? – John Zwinck Dec 03 '15 at 04:05