As far as I know, Linux uses UTF-8 encoding. This means I can use std::string for handling strings, right? Only the encoding will be UTF-8.
Now, in UTF-8 some characters take 1 byte and others take 2, 3, or 4 bytes. My question is: how do you deal with UTF-8 encoded strings on Linux using C++?
In particular: how would you get the length of a string, in bytes or in number of characters? How would you traverse the string?
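To make the length part concrete, here is a minimal sketch of what I imagine counting would look like. It assumes the std::string already holds valid UTF-8 and that "character" means a Unicode code point, which is not always the same as what a user sees as one character. The name utf8_code_points is made up for illustration and is not a standard library function.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a string that is
// assumed to contain valid UTF-8 (no validation is performed).
// Continuation bytes have the bit pattern 10xxxxxx, so every byte that
// does NOT match that pattern starts a new code point.
std::size_t utf8_code_points(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

With this, s.size() would still give the length in bytes, while utf8_code_points(s) would give the number of code points. For "h\xC3\xA9llo" (the word "héllo") that is 6 bytes but 5 code points.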
The reason I am asking is that, as I said, UTF-8 characters may be more than one byte long, right? So myString[7] and myString[8] might not refer to two different characters.
Also, the fact that a UTF-8 string is ten bytes long doesn't say much about its number of characters, right?
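For traversal, the following is only a rough sketch of stepping through the bytes one code point at a time, using the lead byte to decide how many bytes belong together. The function names utf8_sequence_length and utf8_split are hypothetical, and the code assumes mostly well-formed input: it does not validate continuation bytes, and it treats any unexpected byte as a one-byte unit so the loop always advances.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: byte length of the UTF-8 sequence starting at `lead`.
// Returns 1 for invalid lead bytes so that callers always make progress.
std::size_t utf8_sequence_length(unsigned char lead) {
    if ((lead & 0x80) == 0x00) return 1; // 0xxxxxxx: ASCII
    if ((lead & 0xE0) == 0xC0) return 2; // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3; // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4; // 11110xxx
    return 1;                            // stray continuation or invalid byte
}

// Hypothetical helper: splits a UTF-8 string into one substring per
// code point, so each element may be 1 to 4 bytes long.
std::vector<std::string> utf8_split(const std::string& s) {
    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < s.size()) {
        std::size_t len = utf8_sequence_length(static_cast<unsigned char>(s[i]));
        if (len > s.size() - i) {
            len = s.size() - i; // truncated sequence at the end of the string
        }
        out.push_back(s.substr(i, len));
        i += len;
    }
    return out;
}
```

Here the index i is a byte offset, not a character index, which is exactly why myString[7] and myString[8] can land inside the same character.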