I have tried searching stackoverflow to find an answer to this but the questions and answers I've found are around 10 years old and I can't seem to find consensus on the subject due to changes and possible progress.
There are several libraries that I know of outside of the stl that are supposed to handle unicode-
- http://userguide.icu-project.org/
- https://github.com/nemtrif/utfcpp
- https://github.com/CaptainCrowbar/unicorn-lib
There are a few features of the stl (wstring,codecvt_utf8) that were included but people seem to be ambivalent about using because they deal with UTF-16 which this site: (utf-8 everywhere) says shouldn't be used and many people online seem agree with the premise.
The only thing I'm looking for is the ability to do 4 things with a unicode strings-
- Read a string into memory
- Search the string with regex using unicode or ascii, concatenate or do text replacement/formatting with it with either ascii+unicode numbers or characters.
- Convert to ascii + the unicode number format for characters that don't fit in the ascii range.
- Write a string to disk or send wherever.
From what I can tell icu handles this and more. What I would like to know is if there is a standard way of handling this on Linux, Windows, and MacOS.
Thank you for your time.