I'm trying to port some code from D2007 to DXE2. This simplified code compiles fine in D2007. In DXE2 it shows this warning and error:

[DCC Warning] Unit1.pas(10): W1050 WideChar reduced to byte char in set expressions.  Consider using 'CharInSet' function in 'SysUtils' unit.
[DCC Error] Unit1.pas(37): E2010 Incompatible types: 'AnsiChar' and 'Char'

Probably a Unicode issue. Can someone tell me why this happens and how I should correct it?

Regards

The code:

unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs;

type
  TSetOfChar = Set of Char;  // Line 10

  TForm1 = class(TForm)
    procedure FormCreate(Sender: TObject);
  private
    FCharacterSet: TSetOfChar;
  public
    property CharacterSet: TSetOfChar read FCharacterSet write FCharacterSet;
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

procedure TForm1.FormCreate(Sender: TObject);
var
  CharacterSet: TSetOfChar;
  j: Integer;
  s: String;
begin
  CharacterSet := [];
  s := 'I''m just testing åäö';

  for j := 1 to Length(s) do
    Include(CharacterSet, s[j]);  // <- Line 37

end;

end.

EDIT: Note that I am using Delphi 2007, which has no generics. I want code that still works in D2007 because there is a lot of code to port to Unicode. This is a slow process. When everything is ported and verified to work with XE2, we can start using XE2 features like generics. In the meantime we maintain the D2007 code as usual, and we want to avoid making an XE2 branch in the revision control system.

Roland Bengtsson
  • I think a Dictionary type might offer comparable runtime speed to a Delphi set type's "if x in set" operations. Consider using a dictionary instead. Include would become an "if not already part of the dictionary, then add this key" operation on a dictionary with WideChar keys. – Warren P Sep 30 '12 at 00:43

1 Answer


This is standard Unicode Delphi migration fodder. Required reading is Marco Cantù's white paper, Delphi and Unicode. If you haven't read that, do so. If you haven't read it recently, do so again.

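The reason that set of char produces a warning is that the base type for sets cannot have more than 256 values. But since char is now a 16-bit UTF-16 element (WideChar), it has 65,536 possible values, a lot more than 256. The compiler therefore reduces the set to a set of AnsiChar, which is why Include then fails with E2010: s[j] is a Char, not an AnsiChar. All this means that your code can never work with sets and UTF-16 characters.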

You could use set of AnsiChar and AnsiString, both of which also exist in D2007. But if you want this code to work on Unicode data then you'll need to use something other than a set. For example, TList<char> could be used.

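// TList<T> is declared in Generics.Collections; add it to the uses clause.
var
  CharacterSet: TList<char>;
  s: string;
  c: char;
.....
CharacterSet := TList<char>.Create;
s := 'I''m just testing åäö';
for c in s do
  if not CharacterSet.Contains(c) then  // linear search through the list
    CharacterSet.Add(c);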

I would not recommend that for production. Contains performs a linear search, so its performance characteristics will be terrible as the list grows. A hash based dictionary would do better. Best of all would be a dedicated large set class.
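A dedicated set class of the kind mentioned above might look something like the following minimal sketch. It is only an illustration, not an RTL class: the name TWideCharSet and its methods Add, Remove and Contains are made up for this example. It keeps one bit per 16-bit code unit (8 KB in total) and uses no generics, so under those assumptions it should compile in both D2007 and XE2.

```pascal
program WideCharSetDemo;

{$APPTYPE CONSOLE}

type
  // TWideCharSet is a hypothetical name used for illustration only.
  // One bit per 16-bit code unit: 65536 bits = 8192 bytes.
  // Instance fields are zeroed on creation, so the set starts out empty.
  TWideCharSet = class
  private
    FBits: array[0..8191] of Byte;
  public
    procedure Add(C: WideChar);
    procedure Remove(C: WideChar);
    function Contains(C: WideChar): Boolean;
  end;

procedure TWideCharSet.Add(C: WideChar);
begin
  // Byte index is Ord(C) div 8, bit index is Ord(C) mod 8.
  FBits[Ord(C) shr 3] := FBits[Ord(C) shr 3] or (1 shl (Ord(C) and 7));
end;

procedure TWideCharSet.Remove(C: WideChar);
begin
  FBits[Ord(C) shr 3] := FBits[Ord(C) shr 3] and not (1 shl (Ord(C) and 7));
end;

function TWideCharSet.Contains(C: WideChar): Boolean;
begin
  Result := (FBits[Ord(C) shr 3] and (1 shl (Ord(C) and 7))) <> 0;
end;

var
  CharacterSet: TWideCharSet;
  s: WideString;
  j: Integer;
begin
  CharacterSet := TWideCharSet.Create;
  try
    s := 'I''m just testing åäö';
    for j := 1 to Length(s) do
      CharacterSet.Add(s[j]);  // replaces Include(CharacterSet, s[j])

    if CharacterSet.Contains('j') then
      Writeln('j is in the set');
  finally
    CharacterSet.Free;
  end;
end.
```

Membership tests are constant time, like the in operator on a native set. The sketch still treats each 16-bit element on its own, so it does nothing special for surrogate pairs.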

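One final point. Characters are not the same as code points in UTF-16 which is a variable length encoding. A single Char holds one 16-bit code unit, and a code point outside the Basic Multilingual Plane occupies two of them (a surrogate pair). The code in question and this answer make no allowance for that.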
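To make the point about variable length concrete, here is a small sketch that counts code points rather than 16-bit elements. The helper names IsHighSurrogateUnit, IsLowSurrogateUnit and CountCodePoints are invented for this example. It relies only on the surrogate ranges defined by UTF-16 ($D800..$DBFF for the first unit of a pair, $DC00..$DFFF for the second) and uses WideString, so it should not depend on the Unicode RTL.

```pascal
program SurrogateDemo;

{$APPTYPE CONSOLE}

// Hypothetical helper: true for the first unit of a surrogate pair.
function IsHighSurrogateUnit(C: WideChar): Boolean;
begin
  Result := (Ord(C) >= $D800) and (Ord(C) <= $DBFF);
end;

// Hypothetical helper: true for the second unit of a surrogate pair.
function IsLowSurrogateUnit(C: WideChar): Boolean;
begin
  Result := (Ord(C) >= $DC00) and (Ord(C) <= $DFFF);
end;

// Counts code points; a well-formed surrogate pair counts as one.
function CountCodePoints(const S: WideString): Integer;
var
  i: Integer;
begin
  Result := 0;
  i := 1;
  while i <= Length(S) do
  begin
    if IsHighSurrogateUnit(S[i]) and (i < Length(S)) and
       IsLowSurrogateUnit(S[i + 1]) then
      Inc(i, 2)   // skip both halves of the pair
    else
      Inc(i);     // ordinary element, or an unpaired surrogate
    Inc(Result);
  end;
end;

var
  s: WideString;
begin
  // 'a' followed by U+1F600, which UTF-16 encodes as $D83D $DE00.
  s := WideString('a') + WideChar($D83D) + WideChar($DE00);
  Writeln('Code units:  ', Length(s));          // 3
  Writeln('Code points: ', CountCodePoints(s)); // 2
end.
```

A set keyed on individual Char values would record $D83D and $DE00 as two separate members, which is the limitation described above.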

David Heffernan
  • +many for the `Characters are not the same as code points in UTF-16 which is a variable length encoding.` – Jeroen Wiert Pluimers Sep 30 '12 at 08:31
  • Note that I am using Delphi 2007, which does not have generics. Maybe a TStringlist.Value is the best option here? – Roland Bengtsson Oct 01 '12 at 05:51
  • If you are using D2007 then you don't have Unicode and there's no question. The question only makes sense with Unicode Delphi. – David Heffernan Oct 01 '12 at 06:12
  • Yes, I agree :) But the thing is that I want code that works in both Delphi 2007 and Delphi XE2. Porting to XE2 is a slow process that is not done in a week. So all code must keep working in Delphi 2007 until we have deployed everything in the live environment. – Roland Bengtsson Oct 06 '12 at 15:08
  • You cannot use `set of char` in XE2. I think I answered the question that you asked. You can't expect me to advise you on how to design your porting strategy. I don't have enough information and context to do that. – David Heffernan Oct 06 '12 at 15:19