I need to process some Win-1251-encoded text (an 8-bit encoding that uses some of the values 128..255 for Cyrillic). As far as I can tell, C was created with 7-bit ASCII in mind and has no explicit support for single-byte characters above 127. So I have several questions:
- Which is the more proper type for this text: `char[]` or `unsigned char[]`?
- If I use `unsigned char[]` with built-in functions (`strlen`, `strcmp`), the compiler warns about implicit casts to `char*`. Can such a cast break something? Should I re-implement some of the functions to support `unsigned char` strings explicitly?