1

'1' is stored as the two bytes 20 and 0; '2' as the two bytes 21 and 0.

'a' is stored as the two bytes 68 and 0; 'b' as the two bytes 69 and 0.

I am hooking the function ExtTextOutW from GDI32.dll (Windows API) and reading the str parameter as unsigned char. For text such as abc123... I get: 68 0 69 0 70 0 20 0 21 0...

BOOL ExtTextOutW
 (
  HDC         hdc,
  INT         x,
  INT         y,
  UINT        flags,
  const RECT* lprect,
  LPCWSTR     str,
  UINT        count,
  const INT*  lpDx
 )
mouwang
  • 1
    Welcome to Stack Overflow. Please read [the help pages](http://stackoverflow.com/help), take [the SO tour](http://stackoverflow.com/tour), read about [how to ask good questions](http://stackoverflow.com/help/how-to-ask), as well as [this question checklist](https://codeblog.jonskeet.uk/2012/11/24/stack-overflow-question-checklist/). Lastly, what are you really asking about? What is your problem? What do you mean by "'1' use two bytes 20 and 0; '2' use two bytes 21 and 0;" etc.? – Some programmer dude Jan 22 '20 at 07:06
  • 1
    That's [UTF-16](https://en.wikipedia.org/wiki/UTF-16) little-endian, which is what Windows uses internally. And except for the extra 0s those examples are the same codepoints as ASCII. – Rup Jan 22 '20 at 07:06
  • I know about LPCWSTR. If it were UTF-16, the character 'a' would be encoded as the two bytes 97 and 0 (little-endian), but here it is 68 and 0, so I think this encoding is not ASCII-compatible. – mouwang Jan 22 '20 at 07:18
  • "str(unsigned char)" means I am reading it as unsigned char, because decoding str as UTF-16, UTF-32, or GBK failed. – mouwang Jan 22 '20 at 07:21
  • Oops, sorry, yes - at first glance they looked about right as hex. I guess it could be a font where the characters have been remapped to other code points, but that would be strange yes. – Rup Jan 22 '20 at 07:26
  • You are right, but I have never seen an encoding like this. The LPCWSTR is sometimes UTF-16 and sometimes this, and I really don't know how the Windows library tells them apart. Maybe I'm wrong. – mouwang Jan 22 '20 at 07:34
  • please clarify what your question is and show where these characters come from – M.M Jan 22 '20 at 08:51
  • I'm hooking the function ExtTextOutW from GDI32.dll (Windows API) and printing the LPCWSTR parameter as UTF-8/UTF-16/UTF-32/GBK. Sometimes this succeeds because the LPCWSTR is encoded as UTF-16; sometimes it fails because it is not in any Unicode encoding. I found that the character 'a' is encoded as the two bytes 68 and 0, and 'b' as 69 and 0. The question is: what character set does the LPCWSTR use when it is not Unicode? – mouwang Jan 22 '20 at 10:58

1 Answer

2

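This happens when flags = 0x10 (ETO_GLYPH_INDEX). In that case str holds glyph indices rather than UTF-16 characters. The ExtTextOutW documentation describes the flag as follows: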

The lpString array refers to an array returned from GetCharacterPlacement and should be parsed directly by GDI as no further language-specific processing is required. Glyph indexing only applies to TrueType fonts, but the flag can be used for bitmap and vector fonts to indicate that no further language processing is necessary and GDI should process the string directly. Note that all glyph indexes are 16-bit values even though the string is assumed to be an array of 8-bit values for raster fonts. For ExtTextOutW, the glyph indexes are saved to a metafile. However, to display the correct characters the metafile must be played back using the same font. For ExtTextOutA, the glyph indexes are not saved.
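
As a rough illustration of the point above, here is a minimal sketch in portable C of how a hook might tell the two cases apart. The helper names `bytes_to_units` and `is_glyph_index_call` and the macro `SKETCH_ETO_GLYPH_INDEX` are hypothetical; the macro only mirrors the 0x10 flag value mentioned above so that the snippet builds without the Windows headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical local name for the 0x10 flag value discussed above,
   defined here so the sketch compiles without <windows.h>. */
#define SKETCH_ETO_GLYPH_INDEX 0x0010u

/* Reassemble little-endian 16-bit units from a raw byte dump such as
   "68 0 69 0 70 0 20 0 21 0". Returns the number of units written. */
size_t bytes_to_units(const unsigned char *bytes, size_t byte_len,
                      uint16_t *out, size_t out_cap)
{
    assert(byte_len % 2 == 0);          /* 16-bit units come in byte pairs */
    size_t n = byte_len / 2;
    if (n > out_cap)
        n = out_cap;                    /* never write past the output buffer */
    for (size_t i = 0; i < n; ++i)
        out[i] = (uint16_t)(bytes[2 * i] | (bytes[2 * i + 1] << 8));
    return n;
}

/* Nonzero when the flags say the buffer holds glyph indices,
   so it should not be decoded as UTF-16 text. */
int is_glyph_index_call(unsigned int flags)
{
    return (flags & SKETCH_ETO_GLYPH_INDEX) != 0;
}
```

In a hook, the flags check would come first: if it reports a glyph-index call, the 16-bit values index into the glyph table of the font selected into the HDC, so mapping them back to characters depends on that font rather than on any character set.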

mouwang