First time posting here, so apologies in advance if my Title / formatting / tags are not how they are supposed to be.
I am trying to write a function in a C++ Windows console application that removes diacritics from std::wstring
user input. To do so, I'm using code written with help from this question, and I convert my wstring to a UTF-8 string as follows:
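#include <codecvt>
#include <locale>
#include <string>
// Convert a wide string to a UTF-8 encoded std::string.
std::string wstring_to_utf8(const std::wstring& str) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> myconv;
    return myconv.to_bytes(str);
}
std::string test = wstring_to_utf8(input);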
std::string output= desaxUTF8(test);
with desaxUTF8(...) being:
#include <unicode/utypes.h>
#include <unicode/unistr.h>
#include <unicode/ustream.h>
#include <unicode/translit.h>
#include <unicode/stringpiece.h>
std::string desaxUTF8(const std::string& str) {
    StringPiece s(str);
    UnicodeString source = UnicodeString::fromUTF8(s);
    //...
    return result;
}
Here is where I run into a problem. The StringPiece s does not properly receive the value from the string str, but instead gets set to an incorrect value.
But if I replace StringPiece s(str); with a hard-coded value, say StringPiece s("abcš");, it works perfectly fine.
Using the VS2015 debugger, the value of StringPiece s for a user input of abcš is an incorrect 0x0028cdc0 "H\t„", while the value for a hard-coded abcš is the correct 0x00b483d4 "abcš".
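One way to narrow this down is to check the bytes that wstring_to_utf8 actually produces before they reach StringPiece. The following is only a minimal sketch: hex_dump is a hypothetical helper name, not part of ICU or of the code above. For abcš, a correct UTF-8 string should dump as 61 62 63 c5 a1, since š (U+0161) encodes to the two bytes C5 A1.

```cpp
#include <cstdio>
#include <string>

// Hypothetical debugging helper (not part of ICU): renders every byte of a
// string as two lowercase hex digits, separated by single spaces.
std::string hex_dump(const std::string& bytes) {
    std::string out;
    char buf[4];
    for (unsigned char c : bytes) {
        std::snprintf(buf, sizeof buf, "%02x ", static_cast<unsigned>(c));
        out += buf;
    }
    if (!out.empty()) {
        out.pop_back();  // drop the trailing space
    }
    return out;
}
```

If the dump of test matches the expected bytes, the conversion step is probably fine and the problem more likely lies in how the string is handed over to ICU.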
What am I doing wrong, and what is the best way to fix this? I have already tried the recommended solutions from this thread.
I've spent the last two days trying to find a solution to no avail, so any help would be greatly appreciated.
Thank you in advance.
Post-answer EDIT: For anyone who is interested, here is the working code, with massive thanks to Steven R. Loomis for making it happen:
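std::wstring Menu::removeDiacritis(const std::wstring &input) {
    // Alias the wchar_t buffer read-only; this relies on wchar_t being 16-bit (UTF-16), as on Windows.
    UnicodeString source(FALSE, input.data(), static_cast<int32_t>(input.length()));
    UErrorCode status = U_ZERO_ERROR;
    // Decompose, strip combining marks, then recompose.
    Transliterator *accentsConverter = Transliterator::createInstance(
        "NFD; [:M:] Remove; NFC", UTRANS_FORWARD, status);
    if (U_FAILURE(status) || accentsConverter == nullptr) {
        return input;  // Could not create the transliterator; return the input unchanged.
    }
    accentsConverter->transliterate(source);
    delete accentsConverter;  // createInstance() returns an object owned by the caller.
    std::wstring output(source.getBuffer(), source.length());
    return output;
}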
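Note that the code above hands the wchar_t buffer straight to UnicodeString, which works on Windows because wchar_t is a 16-bit UTF-16 code unit there. On platforms where wchar_t is 32 bits wide, the buffer would have to be converted to UTF-16 first. The following is a rough sketch of such a conversion under that assumption; to_utf16 is a hypothetical helper name, it uses only the standard library, and it does not validate its input (code points above U+10FFFF are not rejected).

```cpp
#include <string>

// Hypothetical helper: converts a std::wstring to UTF-16 code units.
// With a 16-bit wchar_t (Windows) the units are copied through unchanged;
// with a 32-bit wchar_t, code points above U+FFFF become surrogate pairs.
std::u16string to_utf16(const std::wstring& in) {
    std::u16string out;
    out.reserve(in.size());
    for (wchar_t wc : in) {
        char32_t cp = static_cast<char32_t>(wc);
        if (sizeof(wchar_t) == 2 || cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));    // high surrogate
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));  // low surrogate
        }
    }
    return out;
}
```

In recent ICU releases, where UChar is char16_t, the resulting buffer could then be passed to a UnicodeString constructor in place of input.data(); whether that applies depends on the ICU version in use.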