
We have a Delphi 2007 application that writes the contents of a memo to a text file and loads it back. The TMemo font charset is set to ANSI_CHARSET, and "Enable runtime themes" (Project Options --> Application) is turned off. The content contains umlaut characters, for example "möglich". In Delphi 2007 this works fine: the application writes the text file, reads it back, and shows the content correctly. When writing to the file, the memo converts the umlaut string to something like "mцchten", and when the memo loads it back it displays it correctly.

We have now upgraded the same application to Delphi XE2. There, when the memo loads the text from the file, it does not display it properly: instead of showing the umlaut string as "möglich", it shows "mцchten". The conversion that happened in Delphi 2007 does not seem to occur when displaying umlaut characters in Delphi XE2.

Can someone suggest what can be done to display umlaut characters properly in the Delphi XE2 application as well? Only the memo's SaveToFile and LoadFromFile methods are used, without specifying any encoding. In XE2, "Enable runtime themes" is turned off and the memo's font charset is still ANSI.

The problem is loading and displaying, in the Delphi XE2 application, characters from a text file that was saved by the Delphi 2007 application.

TLama
SK9
  • I had a similar problem when migrating an application from D7 to XE2. In my case the problem was that during the dfm conversion the controls were given some exotic font. Maybe that is the case here. – Michał Szczygieł Jan 31 '13 at 12:41
  • Have you checked the file? Is it properly encoded in the system default ANSI code page? You can add a TEncoding parameter to the SaveToFile and LoadFromFile methods in XE2. – Jens Mühlenhoff Jan 31 '13 at 15:53
  • @JensMühlenhoff: To maintain backwards compatibility with earlier versions, `TStrings` defaults to `TEncoding.Default` when you don't specify a `TEncoding` of your own. Earlier versions used the OS default Ansi codepage to read/write data. `TEncoding.Default` does the same. – Remy Lebeau Jan 31 '13 at 18:14
  • @RemyLebeau: I know that it should work out of the box, I just wanted to make sure the file was written correctly by the D2007 application. – Jens Mühlenhoff Feb 01 '13 at 16:36
  • @RemyLebeau: Doesn't `TStrings.LoadFromFile` try to detect the encoding using `TEncoding` when you don't specify it explicitly? – Jens Mühlenhoff Feb 01 '13 at 16:39
  • @JensMühlenhoff: yes, it does, but only UTF-8 and UTF-16 are detectable since they have BOMs defined, and it falls back to `TEncoding.Default` if it does not find a matching BOM at the front of the file. – Remy Lebeau Feb 01 '13 at 17:26

1 Answer


Umlauts are codepage-sensitive in Ansi environments, like D2007, and thus require extra handling in Unicode environments, like XE2. As such, it is best to read/write your data using a codepage-neutral format instead, like UTF-8, especially if the data has to move from one PC to another, where the OS default Ansi codepage may be different.
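A minimal sketch of the codepage-neutral approach, assuming a small XE2 console program: it saves a `TStringList` as UTF-8 and reads it back. The program name and the file name `memo.txt` are hypothetical, and the same two calls should apply to `Memo1.Lines`, since that is also a `TStrings`.

```delphi
program Utf8RoundTrip; // hypothetical name, for illustration only
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

var
  Lines: TStringList;
  Original: string;
begin
  // 'm' + U+00F6 + 'glich'; written as a character code so the example
  // does not depend on the encoding of the source file itself.
  Original := 'm'#$00F6'glich';

  Lines := TStringList.Create;
  try
    Lines.Add(Original);
    // TEncoding.UTF8 is a shared instance owned by the RTL; do not free it.
    Lines.SaveToFile('memo.txt', TEncoding.UTF8);

    Lines.Clear;
    Lines.LoadFromFile('memo.txt', TEncoding.UTF8);

    // Compare rather than print, because console output has its own codepage.
    Writeln(Lines[0] = Original);
  finally
    Lines.Free;
  end;
end.
```

A file written this way does not depend on the OS default Ansi codepage of the machine that later reads it.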

In this particular situation, you may have to find out what specific codepage was being used by the OS when the D2007 code saved the data (since it would have used the OS default Ansi codepage), and then explicitly tell the XE2 code to use that same codepage when reading the data. In XE2, the TStrings.LoadFrom...() methods have an optional TEncoding parameter. You can use the TEncoding.GetEncoding() method to get a TEncoding object for a specific codepage.
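As a rough sketch of that approach, the helper below loads a file using an explicit codepage. The name `LoadLinesWithCodePage` is hypothetical, and the value 1252 (Western European) in the usage line is only an assumption about what the D2007 machine used. It is at least consistent with the symptom in the question: byte $F6 is "ö" in Windows-1252 and "ц" in Windows-1251.

```delphi
// Intended to be placed in a unit whose uses clause includes
// System.SysUtils (TEncoding) and System.Classes (TStrings).

// Hypothetical helper: loads FileName into Lines, decoding the bytes
// with the given Ansi codepage instead of the OS default.
procedure LoadLinesWithCodePage(Lines: TStrings; const FileName: string;
  CodePage: Integer);
var
  Enc: TEncoding;
begin
  // GetEncoding returns a new instance that the caller owns,
  // unlike the shared TEncoding.Default or TEncoding.UTF8.
  Enc := TEncoding.GetEncoding(CodePage);
  try
    Lines.LoadFromFile(FileName, Enc);
  finally
    Enc.Free;
  end;
end;
```

Usage, assuming a memo named `Memo1` and a file written on a Western European system: `LoadLinesWithCodePage(Memo1.Lines, 'memo.txt', 1252);`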

Remy Lebeau