I read a UTF-8 file, created with Winword, into a TMemo using the code below (I tried both methods). The file contains IPA pronunciation characters, but for these characters I see only squares. I tried different values of TMemo.Font.Charset, but that did not help.

What can I do?

Peter

// od is a TOpenDialog

procedure TForm1.Load1Click(Sender: TObject);

{
var fileH: textFile;
    newLine: RawByteString;

begin
   if od.execute (self.Handle) then begin
      assignFile(fileH,od.filename);
      reset(fileH);
      while not eof(fileH) do begin
        readln(fileH,newLine);
        Memo1.lines.Add(UTF8toString(newLine));
      end;
      closeFile(fileH);
   end;
end;
}


var
  FileStream: tFileStream;
  Preamble: TBytes;
  memStream: TMemoryStream;
begin
  if od.Execute then
  begin
    FileStream := TFileStream.Create(od.FileName,fmOpenRead or fmShareDenyWrite);
    MemStream := TMemoryStream.Create;

    Preamble := TEncoding.UTF8.GetPreamble;
    memStream.Write(Preamble[0],length(Preamble));
    memStream.CopyFrom(FileStream,FileStream.Size);
    memStream.Seek(0,soFromBeginning);

    memo1.Lines.LoadFromStream(memStream);

    showmessage(SysErrorMessage(GetLastError));

    FileStream.Free;
    memStream.Free;
  end;
end;
Michael
Peter Graf
  • Are you sure the font you use contains those characters? – FileVoyager Sep 04 '14 at 15:16
  • On http://ipa.typeit.org/ they recommend the following fonts: Segoe UI, Cambria, Calibri, Arial, Times New Roman, Tahoma or Lucida Sans Unicode (incomplete) – FileVoyager Sep 04 '14 at 15:18
  • By "Winword", I presume you mean "Word for Windows" (more commonly just referred to as "Word"). Word does not create text files unless you specifically tell it to do so using "Save As" and changing the file type, so it's highly likely that the squares you are seeing are non-text characters. Have you checked the file in something like Notepad to see if it's readable there? – Ken White Sep 04 '14 at 15:26

2 Answers

First, you are doing too much work. Your code can be simplified to this:

procedure TForm1.Load1Click(Sender: TObject);
begin
  if od.Execute then
    memo1.Lines.LoadFromFile(od.FileName, TEncoding.UTF8);
end;

Second, as David said, you need to use a font that supports the Unicode characters/glyphs that are stored in the file. Setting Font.Charset is not enough; you also have to set Font.Name to a compatible font. Look at the fonts that loursonwinny mentioned.
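
As a rough sketch of both points together, the handler below picks a font before loading the file. It assumes a Unicode version of Delphi (2009 or later), a form with Memo1 and od as in the question, and Forms (Vcl.Forms) in the uses clause for Screen. The candidate list is only an illustration taken from the fonts named in the comments, and an installed font is not guaranteed to cover every IPA glyph.

```pascal
procedure TForm1.Load1Click(Sender: TObject);
const
  // Illustrative candidates only, taken from the comments on the question.
  Candidates: array[0..2] of string =
    ('Segoe UI', 'Cambria', 'Lucida Sans Unicode');
var
  I: Integer;
begin
  if not od.Execute then
    Exit;

  // Use the first candidate font that is installed on this machine.
  // If none is installed, the memo keeps its current font.
  for I := Low(Candidates) to High(Candidates) do
    if Screen.Fonts.IndexOf(Candidates[I]) >= 0 then
    begin
      Memo1.Font.Name := Candidates[I];
      Break;
    end;

  // Decode the file as UTF-8 whether or not it starts with a BOM.
  Memo1.Lines.LoadFromFile(od.FileName, TEncoding.UTF8);
end;
```

If the squares remain with every candidate, the font is probably not the problem and the file contents are worth checking.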

Remy Lebeau
  • This won't work in Delphi 2007. Is there a workaround to do the same thing in D2007, please? – delphirules Oct 28 '19 at 13:38
  • @delphirules since D2007 is a pre-Unicode version of Delphi, you will need to find a 3rd party Unicode-aware Memo control, such as `TWideMemo` from the [TNT Controls](https://github.com/rofl0r/TntUnicode) – Remy Lebeau Oct 28 '19 at 15:14
  • Thanks, I tried to install it but got this error: "Do not refer to TntWideStrings.pas. It works correctly in Delphi 2006". Is it compatible with D2007? – delphirules Oct 28 '19 at 16:20
  • @delphirules looking at TNT's source, the error is because the `TntWideStrings.pas` unit is not meant to be used in D2006 and later. It defines classes, like `TWideStrings`, which the RTL introduced in D2005 (but weren't fully ready for use until D2006). – Remy Lebeau Oct 28 '19 at 19:36
  • Should I simply delete these files then? – delphirules Oct 29 '19 at 10:54
  • @delphirules I have no idea what the proper install steps are. I have never used TNT myself. – Remy Lebeau Oct 29 '19 at 15:32
  • OK, will check. Thanks for answering! :D – delphirules Oct 29 '19 at 17:15

> For these characters, I see only squares.

The squares indicate that the font does not contain glyphs for those characters. You'll need to switch to a font that does, assuming that your file has been properly encoded and that you are reading the code points you intend to.
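
One way to check that last assumption is to look at the numeric values that actually arrive in the memo. The helper below is a hypothetical sketch (the name CodeUnitsToHex is not part of the RTL) for a Unicode version of Delphi. It lists UTF-16 code units, which equal the code points for characters in the Basic Multilingual Plane such as the IPA Extensions block (U+0250 to U+02AF).

```pascal
uses
  SysUtils;

// Hypothetical helper: returns the UTF-16 code units of S as
// space-separated U+XXXX values, for comparison with a code chart.
function CodeUnitsToHex(const S: string): string;
var
  I: Integer;
begin
  Result := '';
  for I := 1 to Length(S) do
  begin
    if I > 1 then
      Result := Result + ' ';
    Result := Result + 'U+' + IntToHex(Integer(Ord(S[I])), 4);
  end;
end;
```

Calling `ShowMessage(CodeUnitsToHex(Memo1.Lines[0]))` after loading shows whether the expected values came through. If they did, the text was decoded correctly and only the font is missing glyphs.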

You can pass TEncoding.UTF8 to the LoadFromFile method to avoid having to add a BOM to the content. Finally, don't call GetLastError unless the Win32 documentation says it has meaning. Where you call it, there is no reason to believe that the value has any meaning.
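
A minimal sketch of that advice, assuming the same Memo1 and od as in the question and a Unicode version of Delphi: LoadFromFile reports failures by raising an exception, so the error text can be taken from the exception instead of from GetLastError.

```pascal
procedure TForm1.Load1Click(Sender: TObject);
begin
  if not od.Execute then
    Exit;

  try
    // The explicit encoding makes a BOM unnecessary.
    Memo1.Lines.LoadFromFile(od.FileName, TEncoding.UTF8);
  except
    // A failure to open or read the file surfaces here as an exception.
    on E: Exception do
      ShowMessage('Could not load file: ' + E.Message);
  end;
end;
```

Catching the general Exception class keeps the sketch short; a real handler might catch narrower classes such as EFOpenError.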

David Heffernan