The Delphi Alexandria RTL contains this function:

function ScanChar(const S: string; var Pos: Integer; Ch: Char): Boolean;
var
  C: Char;
begin
  if (Ch = ' ') and ScanBlanks(S, Pos) then
    Exit(True);
  Result := False;
  if Pos <= High(S) then
  begin
    C := S[Pos];
    if C = Ch then
      Result := True
    else if (Ch >= 'a') and (Ch <= 'z') and (C >= 'a') and (C <= 'z') then
      Result := Char(Word(C) xor $0020) = Char(Word(Ch) xor $0020)
    else if Ch.IsLetter and C.IsLetter then
      Result := ToUpper(C) = ToUpper(Ch);
    if Result then
      Inc(Pos);
  end;
end;

I can't understand the purpose of this comparison:

else if (Ch >= 'a') and (Ch <= 'z') and (C >= 'a') and (C <= 'z') then
  Result := Char(Word(C) xor $0020) = Char(Word(Ch) xor $0020)

It looks like it's the same as doing this:

else if (Ch >= 'a') and (Ch <= 'z') and (C >= 'a') and (C <= 'z') then
  Result := C = Ch

Is this true?

Remy Lebeau
zeus
  • Won't that `if` condition never be true? To get past `if C = Ch then` and reach the next `else`, the two characters must differ in case if they are the same letter. – Brian Sep 13 '22 at 22:43

2 Answers

else if (Ch >= 'a') and (Ch <= 'z') and (C >= 'a') and (C <= 'z') then
  Result := Char(Word(C) xor $0020) = Char(Word(Ch) xor $0020)

The purpose of this comparison is optimization: it is meant to be a fast path when both characters are plain ASCII letters, avoiding the more expensive call to the WinAPI via the ToUpper function, which can handle Unicode characters.

Or at least that is what would happen if the comparison itself were not badly broken.

The comparison checks whether both characters are lowercase, that is, whether both fall into the range between lowercase a (ASCII value 97) and lowercase z (ASCII value 122). What it should actually check is that both characters fall into the range between uppercase A (ASCII value 65) and lowercase z, covering the whole range of ASCII letters regardless of their case. (There are a few non-letter characters in that range, but they are not relevant, because the Result assignment would never yield True for any of them.)

Once that is fixed, the Result assignment expression also needs fixing, because xor-ing both sides does not make a lowercase letter compare equal to its uppercase counterpart. To do that, we can simply apply the or operator with $0020 to both characters, which turns uppercase characters into lowercase and leaves lowercase characters as-is. As previously mentioned, at this point in the code the non-letter characters in that range can be safely ignored.

Correct code for that part of the ScanChar function would be:

...
else
if (Ch >= 'A') and (Ch <= 'z') and (C >= 'A') and (C <= 'z') then
  Result := Word(Ch) or $0020 = Word(C) or $0020
else
...

Note: Even though the original ScanChar function contains incorrect code, the result of the function will still be correct, because for the same letter in different case the code will always fall through to the ToUpper branch of the if statement.
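
As a rough sanity check of the reasoning above, the following console sketch brute-forces every pair of distinct 7-bit ASCII characters. The program name and the helpers OriginalBranch, FixedBranch, ToLowerAscii and SameAsciiLetter are made up for illustration and are not part of the RTL; the sketch only assumes that ScanChar reaches this branch after `if C = Ch then` has already failed.

```delphi
program CaseBitCheck;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helpers for illustration only; none of them exist in the RTL.

// The branch as shipped: both characters must be lowercase a..z.
function OriginalBranch(C, Ch: Char): Boolean;
begin
  Result := (Ch >= 'a') and (Ch <= 'z') and (C >= 'a') and (C <= 'z') and
    ((Ord(C) xor $20) = (Ord(Ch) xor $20));
end;

// The proposed fix: both characters in 'A'..'z', compared with bit 5 forced on.
function FixedBranch(C, Ch: Char): Boolean;
begin
  Result := (Ch >= 'A') and (Ch <= 'z') and (C >= 'A') and (C <= 'z') and
    ((Ord(Ch) or $20) = (Ord(C) or $20));
end;

// Reference implementation that does not rely on the bit trick.
function ToLowerAscii(C: Char): Char;
begin
  if (C >= 'A') and (C <= 'Z') then
    Result := Chr(Ord(C) + 32)
  else
    Result := C;
end;

// True only when both characters are ASCII letters that match ignoring case.
function SameAsciiLetter(C, Ch: Char): Boolean;
var
  L: Char;
begin
  L := ToLowerAscii(C);
  Result := (L >= 'a') and (L <= 'z') and (L = ToLowerAscii(Ch));
end;

var
  I, J, OriginalHits, Mismatches: Integer;
  C, Ch: Char;
begin
  OriginalHits := 0;
  Mismatches := 0;
  for I := 0 to 127 do
    for J := 0 to 127 do
    begin
      C := Chr(I);
      Ch := Chr(J);
      if C = Ch then
        Continue; // equal characters never reach this branch in ScanChar
      if OriginalBranch(C, Ch) then
        Inc(OriginalHits);
      if FixedBranch(C, Ch) <> SameAsciiLetter(C, Ch) then
        Inc(Mismatches);
    end;
  Writeln('Original branch hits for C <> Ch: ', OriginalHits);
  Writeln('Fixed branch mismatches vs. reference: ', Mismatches);
end.
```

Both counters should come out as 0: the shipped branch never fires for distinct characters, and the `or $0020` version agrees with a plain ASCII case-insensitive comparison, including for the non-letter characters between Z and a.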

Dalija Prasnikar
  • Wouldn't it be simpler to just AND $FFDF and get rid of both lower case checks? `Result := (Word(Ch) and $FFDF) = (Word(C) and $FFDF)` – Uwe Raabe Sep 13 '22 at 21:32
  • @UweRaabe Yes, it would. Stefan mentioned it yesterday, I just didn't get around to updating the answer. – Dalija Prasnikar Sep 14 '22 at 07:23
It is not exactly the same as C = Ch, but the result is the same, I suppose.

The comparison is redundant, IMHO. It is using XOR to convert lowercase ASCII letters into uppercase ASCII letters (as they differ by only 1 bit), and then comparing the uppercase letters for equality. But the following comparison using IsLetter+ToUpper does the same thing, just for any letters, not just ASCII letters.
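
The one-bit difference is easy to see in a small console sketch. FlipAsciiCase is a hypothetical helper written for this illustration, not an RTL function, and it is only meaningful for ASCII letters.

```delphi
program CaseBitDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper: toggles bit 5 ($20), which is the only bit that
// differs between an ASCII letter and its other-case counterpart.
function FlipAsciiCase(C: Char): Char;
begin
  Result := Chr(Ord(C) xor $20);
end;

var
  C: Char;
begin
  // Lowercase letters become uppercase...
  for C := 'a' to 'e' do
    Writeln(C, ' (', Ord(C), ') -> ', FlipAsciiCase(C), ' (', Ord(FlipAsciiCase(C)), ')');
  // ...and the same flip turns an uppercase letter into lowercase.
  Writeln('Q -> ', FlipAsciiCase('Q'));
end.
```

Because the same bit is flipped on both sides, the xor comparison is true exactly when the two characters were already equal, which is why it adds nothing after the `C = Ch` test.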

Remy Lebeau
  • Doing Char(Word(C) xor $0020) = Char(Word(Ch) xor $0020) works only if BOTH C and Ch are lowercase, so if C = 'A' and Ch = 'B' it will not work. – zeus Sep 13 '22 at 08:50
  • @zeus look at the code more closely. It IS ensuring both characters are lowercase before checking the XOR result. And FYI, the same XOR works if both characters are uppercase, converting them to lowercase. – Remy Lebeau Sep 13 '22 at 08:55
  • Yes, so it's completely useless, because just before that we did if C = Ch then Result := True. – zeus Sep 13 '22 at 10:32