Windows uses an encoding table for non-Unicode applications to map characters from Unicode to a single-byte table. There are many predefined character sets, and the user can choose one in Windows settings. I need to create a custom character set. Where can I find information about that process? I tried to Google it but didn't have any luck; I guess few people do this.
-
That sounds like an [XY Problem](http://xyproblem.info). What are you ultimately trying to accomplish? – IInspectable Jul 27 '18 at 10:41
-
There's an old non-Unicode application that must be adapted to a new encoding (my country is changing its alphabet and applications should support the new glyphs). Rewriting it to use Unicode is not possible. – vbezhenar Jul 27 '18 at 10:48
2 Answers
AFAIK, you can't do that. I don't think there's even a way to write some kernel-mode "driver" for it, but I haven't looked into these things for a while, so maybe there is some way now.
In any case, you might be better off using a library you can change/update, such as libiconv.
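To illustrate what a replaceable conversion table amounts to, here is a minimal Python sketch of a custom single-byte code page. It covers only the conversion side and does not install anything into Windows. The byte positions and replacement characters in `OVERRIDES` are hypothetical placeholders, and using Latin-1 as the starting table is an assumption made for the example.

```python
import codecs

# Hypothetical overrides (byte value -> Unicode character) for the new glyphs.
# These assignments are placeholders, not taken from any real standard.
OVERRIDES = {0xC0: "\u01CD", 0xE0: "\u01CE"}

# Build a 256-entry decoding table, starting from Latin-1 (byte N -> U+00NN).
table = [chr(b) for b in range(256)]
for byte, ch in OVERRIDES.items():
    table[byte] = ch
DECODING_TABLE = "".join(table)

# Derive the reverse (Unicode -> byte) table from the decoding table.
ENCODING_TABLE = codecs.charmap_build(DECODING_TABLE)


def decode_custom(data: bytes) -> str:
    """Decode bytes in the custom code page to a Unicode string."""
    return codecs.charmap_decode(data, "strict", DECODING_TABLE)[0]


def encode_custom(text: str) -> bytes:
    """Encode a Unicode string to bytes in the custom code page."""
    return codecs.charmap_encode(text, "strict", ENCODING_TABLE)[0]
```

Characters whose slots were overridden (here U+00C0 and U+00E0) can no longer be encoded, which is the usual trade-off when repurposing positions in a 256-entry table.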
UPDATE:
Since you don't have the source code, you're in a very unfortunate position.
For all string resources (in the EXE or any DLLs or, though unlikely, in some other files), you can read them out, figure out which code page they use, and change it (and the strings themselves), tweaking it so that the right glyphs appear. You might see different glyphs in Notepad, but who cares, as long as your application shows the right ones. FWIW, for such hacks it's best to use a hex editor. Then, of course, put the changed resources back into the EXE/DLL. But it's quite possible that not all strings are in resources, and that's when the real problems start.
There's any number of hacks that could have been done here. Your best option is to use a good debugger (WinDbg or better) and figure out what's going on and how character sets are handled. Since you don't have the source code, it's going to be quite painful. You want to find out:
- Are the default charsets (OEM/ANSI) used, or some specific one (via NLS APIs)?
- Whatever charset is used, is it a standard one or not? The charset here is the "code" Windows assigns to it. Look at the Windows lists of available charsets.
- Is the application installing fonts? If it is, use a font tool to examine them; maybe they support a specific (non-standard?) code page.
- Is the application installing some drivers? If it is, the only way to gain more insight is to use a kernel debugger (which is very tricky and annoying but, as already said, you're in an unfortunate situation).
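As a starting point for the first question, the system-wide default code pages can be read with the Win32 functions `GetACP` and `GetOEMCP`. The sketch below calls them from Python via `ctypes`. The helper name `default_code_pages` is made up for this example, and it only reports the system defaults, not what a particular application actually uses.

```python
import ctypes
import sys


def default_code_pages():
    """Return (ANSI, OEM) code page numbers, or None when not on Windows."""
    if sys.platform != "win32":
        return None
    kernel32 = ctypes.windll.kernel32
    # GetACP / GetOEMCP return the current ANSI and OEM code page identifiers.
    return kernel32.GetACP(), kernel32.GetOEMCP()


print(default_code_pages())
```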

-
It certainly is possible in some way, because it's already been done for the old charset. But now that I think about it, it might just be a tweaked system-wide font with some characters replaced... That won't play well with Unicode applications of course, so a proper custom system-wide character set would be better. Just converting is trivial, of course. – vbezhenar Jul 27 '18 at 10:51
-
@bezhenar if by "old charset" you mean the one that is supported by Windows, then of course it was done in Windows in _some_ way (by Microsoft), but, I don't know of a way to add a new charset _into_ Windows by someone else. – srdjan.veljkovic Jul 27 '18 at 11:21
-
No, it was not supported by Windows; it was created by a third party for a non-standard charset. I don't know the details, I'm reverse-engineering it now. Maybe they just replaced the standard Arial fonts, but that's a cheap hack... – vbezhenar Jul 27 '18 at 11:24
-
No. Just an installer and a bunch of binary files and .cab files (which couldn't be extracted by standard tools). – vbezhenar Jul 27 '18 at 11:57
-
Thanks for your answer. The situation is worse because the application displays text from a database (located on a remote machine), and that text comes in that "unusual" encoding as well. So hacking the application is very tricky. I think I've found where those Windows charsets are located; the old installer does exactly that: it installs a new .nls file and modifies a registry value to point to that new file. It works just fine and I believe it is a proper solution. So I'll change their .nls file to add a few new characters, I guess. – vbezhenar Jul 27 '18 at 13:17
It appears that those tables are located at C:\Windows\system32\*.nls. I'm not sure whether there's proper documentation for their structure. There's some information in Russian here. Also, you might want to tinker with the registry at HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls.
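To see which .nls file each code page currently points to, the registry values can be listed read-only before changing anything. This is a rough sketch using Python's `winreg` module. It assumes the mappings live in a `CodePage` subkey under the `Nls` key mentioned above, which is my assumption and worth verifying on the target machine. The function name is made up for the example.

```python
import sys


def list_code_page_files():
    """Return {value name: data} from the Nls\\CodePage key ({} off Windows)."""
    if sys.platform != "win32":
        return {}
    import winreg

    # Assumed location of the code page -> .nls file mappings.
    key_path = r"SYSTEM\CurrentControlSet\Control\Nls\CodePage"
    result = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
        value_count = winreg.QueryInfoKey(key)[1]  # number of values in the key
        for i in range(value_count):
            name, data, _value_type = winreg.EnumValue(key, i)
            result[name] = data
    return result


for name, data in sorted(list_code_page_files().items()):
    print(name, "->", data)
```

Exporting the key (or at least recording this output) before editing makes it easier to roll back if a modified table misbehaves.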
