I have C++ code that converts a string containing 🜆 (U+1F706, ALCHEMICAL SYMBOL FOR AQUA REGIA) to a u16string:

#include <string>
#include <codecvt>
#include <locale>
using namespace std;

int main() {
    setlocale(LC_ALL, "ru_RU.UTF-8");
    string s = "🜆";

    wstring_convert<codecvt_utf8<char16_t>, char16_t> converter;
    u16string s16 = converter.from_bytes(s);
    return 0;
}

Note that I don't use wstring or any istream.

Running it throws std::range_error:

terminate called after throwing an instance of 'std::range_error'
  what():  wstring_convert::from_bytes

On ideone, however, the same code runs without error.

I get the error with both g++ 7.2.0 and clang 4.0.1 when compiling with -std=c++14.

Why is there no error on ideone, and why do I get this error locally?


I am on Arch Linux 4.12.13, and the locale command gives me:

LANG=ru_RU.UTF-8
LC_CTYPE="ru_RU.UTF-8"
...
LC_IDENTIFICATION="ru_RU.UTF-8"
LC_ALL=
diralik
  • Did you really intend to convert a character that looks like a hollow square? Because I see a hollow square character. – n. m. could be an AI Sep 17 '17 at 10:23
  • @n.m. Sorry, I have added details about this character. – diralik Sep 17 '17 at 10:43
  • Using non-base-charset characters in the source code is fragile. Nobody knows what your editor or compiler will do. A portable way to specify extended characters is via their universal character names. – n. m. could be an AI Sep 17 '17 at 10:49
  • @n.m. You are right. In the real code I receive text containing this character from a server; it appears in the source here only for simplicity. – diralik Sep 17 '17 at 10:55
  • Get rid of `setlocale()` (it doesn't apply in this code), and use `string s = u8"🜆";` to ensure `s` is properly UTF-8 encoded. – Remy Lebeau Sep 18 '17 at 05:39

1 Answer

Shouldn't your converter use std::codecvt_utf8_utf16 instead of std::codecvt_utf8?

std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> converter;
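The likely reason this helps: with char16_t as the element type, std::codecvt_utf8 converts between UTF-8 and UCS-2, which cannot represent code points above U+FFFF, while std::codecvt_utf8_utf16 produces UTF-16 and encodes such code points as surrogate pairs. U+1F706 is outside the Basic Multilingual Plane, so it needs a surrogate pair. Below is a minimal sketch of the conversion under that assumption. The helper name utf8_to_utf16 is hypothetical and chosen only for illustration, and the UTF-8 bytes are written as hex escapes so the result does not depend on the source file encoding. It does not explain why ideone accepted the original code; that may come down to a different standard library version.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper (not part of the original question or answer):
// converts a UTF-8 encoded byte string to UTF-16.
// Note: std::wstring_convert and <codecvt> are deprecated since C++17,
// so newer compilers may emit deprecation warnings here.
std::u16string utf8_to_utf16(const std::string& bytes) {
    // codecvt_utf8_utf16 emits surrogate pairs for code points above
    // U+FFFF instead of rejecting them.
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> converter;
    return converter.from_bytes(bytes);
}
```

Calling utf8_to_utf16("\xF0\x9F\x9C\x86") (the UTF-8 encoding of U+1F706) should yield two code units, the surrogate pair 0xD83D 0xDF06. No setlocale() call is needed, since the facet does not consult the global locale.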

Kaveh Vahedipour