I'm creating the German version of a CHM help file. My problem is that umlauts are not displayed correctly in the Table of Contents. I assume the cause is the code page. The .hhc file is ANSI. Converting it to Unicode doesn't help: it displays different, but still wrong, characters.

The file "Table of Contents.hhc" starts with:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<HTML>
<HEAD>
<meta name="GENERATOR" content="Microsoft&reg; HTML Help Workshop 4.1">
<!-- Sitemap 1.0 -->
</HEAD><BODY>
<OBJECT type="text/site properties">
    <param name="ImageType" value="Folder">
</OBJECT>
<UL>
    <LI> <OBJECT type="text/sitemap">
        <param name="Name" value="ÜÜÜÜÜÜÜÜÜÜÜÜÜÜ Uberblick">
        <param name="Local" value="overview.htm">
        <param name="URL" value="overview.htm">
        </OBJECT>
</UL>
</BODY></HTML>
sashoalm

3 Answers

Make sure the "Language" setting in the "Options" section of the project file supports the characters you want. Since you are on a Russian system, the default is probably Russian. Change it to German, for instance. The engine that renders the CHM is Unicode; only the compiler is ANSI.
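As a rough illustration of that change, here is a minimal Python sketch that rewrites the `Language` entry in the `[OPTIONS]` section of a project file's text. The helper name `set_hhp_option` is hypothetical, and the exact value string (`0x407 German (Germany)`) is an assumption based on the usual "LCID plus description" form; 0x407 is the German (Germany) locale ID and 0x419 is Russian. Editing the setting in HTML Help Workshop or a text editor achieves the same thing.

```python
def set_hhp_option(hhp_text, key, value):
    """Return hhp_text with key=value set inside the [OPTIONS] section.

    Hypothetical helper: it only edits text and does not validate
    that the key is one the HTML Help compiler understands.
    """
    out = []
    in_options = False
    done = False
    for line in hhp_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Leaving [OPTIONS] without having seen the key: add it first.
            if in_options and not done:
                out.append(f"{key}={value}")
                done = True
            in_options = stripped.upper() == "[OPTIONS]"
        elif (in_options and not done
              and stripped.lower().startswith(key.lower() + "=")):
            # Replace the existing entry in place.
            line = f"{key}={value}"
            done = True
        out.append(line)
    if not done:
        # No entry found: append it, creating [OPTIONS] if needed.
        if not in_options:
            out.append("[OPTIONS]")
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"


# Example input; the file names are made up for illustration.
sample = (
    "[OPTIONS]\n"
    "Compiled file=help.chm\n"
    "Language=0x419 Russian\n"
    "\n"
    "[FILES]\n"
    "overview.htm\n"
)
updated = set_hhp_option(sample, "Language", "0x407 German (Germany)")
```

After the change the project has to be recompiled for the new language setting to take effect.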

Mihai Nita

Actually, you don't need UTF-8 for CHM files, because CHM doesn't support UTF-8 or Unicode. CHM is an ancient format that Microsoft has not really changed since Windows 98, and it has a number of quirks and restrictions like this one.

For more detail, see:

https://helpman.it-authoring.com/viewtopic.php?t=9294

https://blogs.msdn.microsoft.com/sandcastle/2007/09/29/chm-localization-and-unicode-issues-dbcsfix-exe/
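If the .hhc file was saved as UTF-8 at some point, one way to act on this is to convert it back to a single-byte code page before compiling. The sketch below assumes Windows-1252 (`cp1252`) as the target, which is the ANSI code page normally used for German; the function name `reencode_for_chm` is hypothetical, and this is only an illustration of the idea, not a tested fix for the Table of Contents problem.

```python
def reencode_for_chm(data, source_encoding="utf-8", target_codepage="cp1252"):
    """Re-encode raw file bytes into a single-byte ANSI code page.

    Raises ValueError if a character cannot be represented in the
    target code page, instead of silently writing a wrong byte.
    """
    if source_encoding.lower() in ("utf-8", "utf8"):
        # "utf-8-sig" also strips a leading byte order mark if present.
        source_encoding = "utf-8-sig"
    text = data.decode(source_encoding)
    try:
        return text.encode(target_codepage)
    except UnicodeEncodeError as exc:
        bad = text[exc.start:exc.end]
        raise ValueError(
            f"{bad!r} has no representation in {target_codepage}"
        ) from exc


# "Ü" is U+00DC, which maps to the single byte 0xDC in Windows-1252.
converted = reencode_for_chm("Überblick".encode("utf-8"))
```

To apply it to a real file, read the .hhc in binary mode, pass the bytes through the function, and write the result back in binary mode.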

Kashif Meo

Try escaping them as HTML character entities? http://www.w3schools.com/tags/ref_entities.asp

Or try declaring the charset encoding: http://www.w3.org/TR/html4/charset.html#h-5.2.2

Marco van de Voort
  • Thanks for the suggestion. I was sure it would work, but when I tried it, it had no effect. Maybe it can't be done: from what I saw on the Internet, hh.exe is an ANSI application, so maybe it can't display Unicode characters even if they're correctly encoded. There are no problems in the help pages themselves, but those are displayed by IE (embedded as a COM/OLE/whatever control). – sashoalm Dec 01 '11 at 09:41
  • The binary indexes of CHM are stored as 2-byte strings. Enabling them might also help. Which of the two suggestions did you try? – Marco van de Voort Dec 01 '11 at 12:00
  • Read any good CHM documentation or help. Short version: there are two ways to store the TOC and index. The help system can read the .hh* files from the CHM, or use binary indexes in the .CHM. To generate the latter, you need to set certain keywords in the .hhp. – Marco van de Voort Dec 01 '11 at 13:09
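The last comment does not name the keywords. As far as I know, the relevant `[OPTIONS]` entries are `Binary TOC` and `Binary Index`, so treat those key names as an assumption to check against the HTML Help Workshop documentation. A small Python sketch that renders such a section (the helper `render_options` is hypothetical):

```python
# Assumed key names for binary TOC and index storage; verify them
# against the HTML Help Workshop documentation before relying on them.
BINARY_STORAGE = {"Binary TOC": "Yes", "Binary Index": "Yes"}


def render_options(options):
    """Render a dict of settings as an [OPTIONS] section of an .hhp file."""
    lines = ["[OPTIONS]"]
    lines += [f"{key}={value}" for key, value in options.items()]
    return "\n".join(lines) + "\n"


section = render_options(BINARY_STORAGE)
```

Note that a binary TOC is generally reported to disable some customized table-of-contents features, so it is worth testing the compiled file after switching.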