In the default "C" locale, std::toupper() only transforms the letters a-z, as is documented, for example, here. Which characters exactly get transformed by std::ctype<CharT>::toupper() with the default C++ locale?
I'm asking because std::toupper(L'ω', std::locale::classic()) returns L'Ω' on Windows, and I'm wondering for which other characters the C++ locale also returns an upper-case form. In the "C" locale the same character is not transformed: static_cast<wchar_t>(std::towupper(static_cast<std::wint_t>(L'ω'))) returns L'ω', as expected.
I used the following program to verify this:
    #include <cwctype>
    #include <fstream>
    #include <locale>

    int main()
    {
        // Imbue a UTF-8 locale so the wide characters are written as UTF-8.
        std::wofstream fs("out.txt");
        fs.imbue(std::locale("en_US.UTF8"));

        // C++ <locale> overload, evaluated with the classic C++ locale.
        fs << L"std::toupper(L'ω', std::locale::classic()): "
           << std::toupper(L'ω', std::locale::classic()) << std::endl;

        // C library function, evaluated with the current C locale ("C" by default).
        fs << L"static_cast<wchar_t>(std::towupper(static_cast<std::wint_t>(L'ω'))): "
           << static_cast<wchar_t>(std::towupper(static_cast<std::wint_t>(L'ω'))) << std::endl;
        return 0;
    }
Content of out.txt when compiled with Visual Studio 2019 (save the source file with UTF-8 encoding and add the compiler switch /utf-8) and executed on Windows 10:
std::toupper(L'ω', std::locale::classic()): Ω
static_cast<wchar_t>(std::towupper(static_cast<std::wint_t>(L'ω'))): ω
Output with gcc version 8.4.0 (Ubuntu 8.4.0-1ubuntu1~18.04):
std::toupper(L'ω', std::locale::classic()): ω
static_cast<wchar_t>(std::towupper(static_cast<std::wint_t>(L'ω'))): ω
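
To see which other characters are affected on a given platform, one option is to enumerate them empirically. The following is only a minimal sketch: the helper name changed_by_classic_toupper is made up for illustration, and the scan is limited to the range 0 to 0xFFFF, which is an assumption (it covers every wchar_t value where wchar_t is 16 bits wide, but only the Basic Multilingual Plane where it is 32 bits wide). It relies on std::toupper(c, loc) being specified as std::use_facet<std::ctype<charT>>(loc).toupper(c).

```cpp
#include <locale>
#include <vector>

// Hypothetical helper (not part of the standard library): collects every
// value in [0, 0xFFFF] that the classic locale's std::ctype<wchar_t> facet
// maps to a different character when converting to upper case.
std::vector<wchar_t> changed_by_classic_toupper()
{
    // std::toupper(c, loc) forwards to this facet's toupper().
    const auto& facet =
        std::use_facet<std::ctype<wchar_t>>(std::locale::classic());

    std::vector<wchar_t> changed;
    // Assumption: scanning up to 0xFFFF is enough for this experiment.
    for (unsigned long code = 0; code <= 0xFFFF; ++code) {
        const wchar_t c = static_cast<wchar_t>(code);
        if (facet.toupper(c) != c) {
            changed.push_back(c);  // c has a distinct upper-case form
        }
    }
    return changed;
}
```

Calling this from main and writing each element to a file (for example as a hexadecimal code) would list the full set for the standard library in use. Given the outputs above, that set can be expected to differ between Visual Studio 2019 and gcc.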