
I found some strange behaviour when using DrawTextA with the Courier New font under a Japanese locale. Consider the following Delphi XE2 code:

```delphi
procedure PaintTexts(aPaintBox: TPaintBox; aCharset: Byte);
var
  A: AnsiString;
  S: string;
  R: TRect;
begin
  aPaintBox.Font.Charset := aCharset;

  A := '[DrawTextA] The word "Japan" in Japanese: 日本';
  R := Rect(0, 0, aPaintBox.Width, aPaintBox.Height);
  DrawTextA(aPaintBox.Canvas.Handle, PAnsiChar(A), Length(A), R, 0);

  S := '[DrawTextW] The word "Japan" in Japanese: 日本';
  R := Rect(0, 20, aPaintBox.Width, aPaintBox.Height);
  DrawTextW(aPaintBox.Canvas.Handle, PWideChar(S), Length(S), R, 0);
end;

procedure TForm1.PaintBox1Paint(Sender: TObject);
begin
  PaintTexts(PaintBox1, DEFAULT_CHARSET);
end;

procedure TForm1.PaintBox2Paint(Sender: TObject);
begin
  PaintTexts(PaintBox2, SHIFTJIS_CHARSET);
end;
```

In this code, Form1 contains two paintboxes (PaintBox1 and PaintBox2). The font of Form1 is set to Courier New, and both paintboxes have ParentFont set to True. The non-Unicode locale of Windows is set to Japanese (Japan), so the system is working with code page 932.

This is what the result looks like:

Screenshot of the output

The first paintbox shows the output of a DrawTextA and a DrawTextW call with the font's Charset property set to DEFAULT_CHARSET, which is the property's default value. Note that the Japanese word 日本 is not shown correctly when passed to DrawTextA. However, DrawTextW draws it perfectly.

The second paintbox shows the same texts, with only the Charset property changed to SHIFTJIS_CHARSET. Now both calls show the correct Japanese characters. But the font has changed to a variable-width font!

When I change the font of Form1 to Tahoma, both DrawTextA and DrawTextW show the same correct texts.

Does anyone know why DrawTextA behaves differently from DrawTextW when my non-Unicode locale is set to Japanese and my font is set to Courier New?

I always thought that the only difference between the Ansi and Wide versions of Windows API functions was that the Ansi versions handled the conversion to and from Unicode.

I have tried this in combination with Windows XP and Windows 7, and Delphi 7 and Delphi XE2. All combinations show the same behaviour.

Update: After David Heffernan posted his answer, I started reading Michael Kaplan's blog. There I found a similar topic and more information about this subject.

R. Beiboer
  • Does Courier New font have chars for `SHIFTJIS_CHARSET` charset ? – TLama Jun 11 '14 at 13:00
  • According to Character Map, it does not. – R. Beiboer Jun 11 '14 at 13:03
  • I noticed this behaviour in a TLabel in a Delphi 7 application. A different application written in Delphi XE2 did not show this problem, although it was using the same code constructs. After some investigation I found that the only difference was a call to DrawTextW (Delphi XE2) instead of DrawTextA (Delphi 7). For the sake of simplicity I reduced the problem to the example I use in the question. – R. Beiboer Jun 11 '14 at 13:20
  • The debugger shows that the two Japanese characters are encoded as 4 bytes: #$93, #$FA, #$96, #$7B. As you can see on [msdn](http://msdn.microsoft.com/en-us/goglobal/cc305152), these four bytes encode the two Japanese characters (日本). – R. Beiboer Jun 11 '14 at 13:45

1 Answer


DrawTextA does not convert the text to Unicode. Instead the selected font's charset is used to interpret the supplied text. This is indeed somewhat more complex than the typical A and W suffixed API functions.

The use of the font charset allowed non-Unicode programs to display text in multiple character sets. For a Unicode program this is a complete non-issue because Unicode can encode all characters.

According to Michael Kaplan in this forum thread, DEFAULT_CHARSET should not be used. He says:

Do not use DEFAULT_CHARSET at all. It is evil.

If you need to specify a charset, you should do the following:

  1. Call GetACP to obtain the active code page.
  2. Call TranslateCharsetInfo passing the code page and specify the TCI_SRCCODEPAGE flag.

The charset information that is returned contains the appropriate charset to use for the active code page. Wrap it up like this:

```delphi
function CharsetFromCP(CP: UINT): UINT;
var
  csi: TCharsetInfo;
begin
  Win32Check(TranslateCharsetInfo(CP, csi, TCI_SRCCODEPAGE));
  Result := csi.ciCharset;
end;
```

And then you can write:

```delphi
aPaintBox.Font.Charset := CharsetFromCP(GetACP);
```

Of course, if you know the text is Japanese then you can write SHIFTJIS_CHARSET directly. And even more obviously, you can simply use the Unicode API and avoid all this nonsense.
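If moving the whole program to Unicode is not an option (for example in a Delphi 7 code base), one possible route is to decode the ANSI text yourself with an explicit code page and hand the result to DrawTextW. The following is a minimal sketch under that assumption, not a tested drop-in fix. The unit name and the helpers `DecodeAnsi` and `DrawAnsiAsUnicode` are hypothetical names introduced for illustration; only `MultiByteToWideChar`, `DrawTextW` and `RaiseLastOSError` are existing calls.

```delphi
unit AnsiDrawSketch; // hypothetical unit name, for illustration only

interface

uses
  Windows, SysUtils;

// Hypothetical helper: decodes AnsiText using the given code page.
function DecodeAnsi(const AnsiText: AnsiString; CodePage: UINT): WideString;

// Hypothetical helper: decodes AnsiText and draws it with DrawTextW,
// so the font's Charset is not used to interpret the bytes.
procedure DrawAnsiAsUnicode(DC: HDC; const AnsiText: AnsiString;
  CodePage: UINT; var R: TRect; Flags: UINT);

implementation

function DecodeAnsi(const AnsiText: AnsiString; CodePage: UINT): WideString;
var
  Len: Integer;
begin
  Result := '';
  if AnsiText = '' then
    Exit;
  // First call: ask how many UTF-16 code units the decoded text needs.
  Len := MultiByteToWideChar(CodePage, 0, PAnsiChar(AnsiText),
    Length(AnsiText), nil, 0);
  if Len = 0 then
    RaiseLastOSError;
  SetLength(Result, Len);
  // Second call: convert into the allocated buffer.
  if MultiByteToWideChar(CodePage, 0, PAnsiChar(AnsiText),
    Length(AnsiText), PWideChar(Result), Len) = 0 then
    RaiseLastOSError;
end;

procedure DrawAnsiAsUnicode(DC: HDC; const AnsiText: AnsiString;
  CodePage: UINT; var R: TRect; Flags: UINT);
var
  W: WideString;
begin
  W := DecodeAnsi(AnsiText, CodePage);
  if W <> '' then
    DrawTextW(DC, PWideChar(W), Length(W), R, Flags);
end;

end.
```

A call such as `DrawAnsiAsUnicode(aPaintBox.Canvas.Handle, A, 932, R, 0)` would then treat `A` as code page 932 (Shift-JIS) text, and passing 936 instead would treat it as Simplified Chinese. Because the code page is chosen per call, text in several encodings can be drawn in the same program without changing the font's Charset.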

David Heffernan
  • Thank you for your comment, the information is really useful. But actually, it does not really answer my question. I always thought that the A-versions of the Windows API only did some ANSI-Unicode (and vice versa) conversions. Given that, I would think that DrawTextA simply converts the string to Unicode, and then calls DrawTextW. The project I am working on is a Delphi 7 application and I have to add support for different languages (like Japanese, Chinese, Russian). I cannot simply port it to a Delphi XE2 application, unfortunately. – R. Beiboer Jun 11 '14 at 14:36
  • That's patently not the case though is it? You can see that from your program. The `Charset` property of the font determines how the text is emitted. I'll add some text to the answer to draw that out. – David Heffernan Jun 11 '14 at 14:43
  • I perfectly see that the Charset property determines how the text is rendered when using DrawTextA. But when I use DrawTextW, it always renders the correct Japanese characters, regardless of the Charset property. It does influence the chosen font though (it substitutes the fixed-width font with a variable-width font). – R. Beiboer Jun 11 '14 at 14:49
  • When you call `DrawTextW`, you are passing Unicode characters. When you call `DrawTextA` you need to specify the encoding. Essentially the charset serves that purpose. Without a charset then there would be no way to draw text using characters outside the active code page. Consider how you would use `DrawTextA` to draw Japanese and Chinese text in the same program. – David Heffernan Jun 11 '14 at 14:52
  • Well, now we know WHAT the behaviour is, but not WHY. But then again, maybe that is not so important after all. It would be nice, though, if this difference in behaviour had been documented on msdn. Maybe it is, but I have not found it yet. I always thought A-APIs just called W-APIs after converting strings to Unicode. But this particular function does not. Thank you for your help @David ! – R. Beiboer Jun 11 '14 at 14:58
  • Again, suppose that there was no charset, and that `DrawTextA` simply called `MultiByteToWideChar` using the active code page, and passed that on to `DrawTextW`. If that was how it was, how would you use `DrawTextA` to draw Japanese and Chinese text in the same program? That's the why. – David Heffernan Jun 11 '14 at 15:00
  • Just one more thing. Isn't it true that all A-API's expect the encoding of the string to be CP_ACP? So why do I have to specify the encoding when I call DrawTextA? – R. Beiboer Jun 11 '14 at 15:01
  • I guess you're right @David. This must have been the way to draw characters other than those in the current code page, back when Windows did not support Unicode. – R. Beiboer Jun 11 '14 at 15:06
  • DrawTextA (or rather DrawTextExA which it calls) calls DrawTextExW after calling GdiGetCodePage in gdi32. According to some Wine [report](http://www.winehq.org/pipermail/wine-patches/2007-March/037485.html) this gdi call is responsible for the correct unicode translation. – Sertac Akyuz Jun 11 '14 at 15:27