5

How do I use CharInSet() to get rid of this warning?

lastop:=c;
{OK - if op is * or / and prev op was + or -, put the whole thing in parens}
If (c in ['*','/']) and (NextToLastOp in ['+','-']) then
  Result.text:='('+Result.text+')';
Result.text:=Result.text+c;

[dcc32 Warning] U_Calc1.pas(114): W1050 WideChar reduced to byte char in set expressions. Consider using 'CharInSet' function in 'SysUtils' unit.

Remy Lebeau
  • 555,201
  • 31
  • 458
  • 770
Raj
  • 69
  • 1
  • 2
  • 5

1 Answers1

15

CharInSet is necessary due to the fact that WideChar values are too large to be used in actual set operations.

To use it in your code you would use it like this:

If (CharInSet(c, ['*','/']) and (CharInSet(NextToLastOp, ['+','-']) then

However CharInSet doesn't actually do anything special or even useful other than silence the warning.

The code in the function is identical to the code it replaces. The warning is silenced only because the RTL units are pre-compiled. The compiler does not recompile them even if you do a full "Rebuild all". i.e. that code in SysUtils is not actually ever re-compiled. If it were it would emit the exact same warning.

So, you could use that function to avoid the warning but if you can be certain that your character data is ASCII (or extended ASCII), that is with ordinal values in the range #0 to #255, then you could also avoid the warning by typecasting your character variable explicitly to an ANSIChar:

If (ANSIChar(c) in ['*','/']) and (ANSIChar(NextToLastOp) in ['+','-']) then

To reiterate: You should only do this if you are absolutely certain that the character data that you are working with consists only of ASCII characters.

But this same caveat applies equally to using CharInSet since this still requires that you test for membership in an ANSIChar set (i.e. single byte characters).

Explicit type-casting has the advantage (imho) of making it explicit and apparent that your code is intended deliberately to expect and work only with ASCII characters, rather than giving a perhaps misleading impression of being more fully equipped to handle Unicode.

Deltics
  • 22,162
  • 2
  • 42
  • 70
  • 2
    The cleanest way to handle such a test is to write `(c='*') or (c='/')` – David Heffernan Apr 12 '17 at 06:48
  • procedure TForm3.Btn0Click(Sender: TObject); {handles all digits} begin {NOTE! 4th charatcer of button anme is the digit - do NOT rename buttons!} If sender is TButton then If TButton(Sender).name[4] in ['0'..'9'] then AddDigit(TButton(sender).name[4]) else beep; end;[dcc32 Warning] Unit3.pas(119): W1050 WideChar reduced to byte char in set expressions. Consider using 'CharInSet' function in 'SysUtils' unit. – Raj Apr 12 '17 at 07:20
  • @Raj: This is the very same problem, isn't it? – René Hoffmann Apr 12 '17 at 08:06
  • @Raj Using component names like that is very brittle, there is almost always a better way. – David Heffernan Apr 12 '17 at 08:11
  • Guide me wirh a example. – Raj Apr 12 '17 at 08:14
  • AddDigit(TButton(sender).name[4]) . How to use charinset for this syntax. Iam trying but not getting the right way to implement this. – Raj Apr 12 '17 at 08:21
  • @Deltics But you end up with code that works, unlike your code which fails with ordinals > 255. And it is trivially easy to wrap these comparisons up in a way to make the calling code less noisy. For instance by passing an open array parameter of characters to a function that iterates over them performing the comparison. – David Heffernan Apr 12 '17 at 09:06
  • 1
    I think it is cleaner. Surely you know how to extract code into a method? I know you do. Anyway, your answer is factually wrong. Consider this program: `{$APPTYPE CONSOLE} uses System.SysUtils; const c = #7796; begin Writeln(CharInSet(c, ['t'])); Writeln(AnsiChar(c) in ['t']); end. ` The output is `FALSE TRUE` – David Heffernan Apr 12 '17 at 09:19
  • Further, the fact that I did not answer the question does not render my comments invalid. It is commonplace here to comment without answering. I'm sure you know that. You yourself do it. – David Heffernan Apr 12 '17 at 09:21
  • @Raj Hard to give an example of how to do it better, without knowing more details of what the problem is. But I don't think I've ever seen a problem that has best been solved using component names. The problem is that it is all too easy to change the names, and not find all the code that depends upon those names. – David Heffernan Apr 12 '17 at 09:25
  • @Deltics Nope. You stated that `CharInSet(c, [...])` is identical to `AnsiChar(c) in [...]`. That statement is wrong, as the program in my comment demonstrates. – David Heffernan Apr 12 '17 at 09:26
  • *The difference is that CharInSet might give the impression of correctly supporting a test for WideChar encoded values in an ANSIChar set when it does not.* Again, that statement is wrong. – David Heffernan Apr 12 '17 at 09:28
  • *You should only do this if you are certain that the character data that you are working with is certain to consist of ASCII characters. But this same caveat applies equally to using CharInSet.* Also wrong. – David Heffernan Apr 12 '17 at 09:29
  • Look closely at the program I provided in the comment. Look at the output. Explain to me why the two tests evaluate to different values when you claim they should always evaluate to the same value. Either both false or both true. – David Heffernan Apr 12 '17 at 09:30
  • 5
    @David - you know what, I'm done playing your game. Maybe CharInset works differently, or maybe they fixed it in the version you are working with. Either way, how about you write up a better answer or use your moderator power to edit and improve this one instead of snarking around in the comments ? – Deltics Apr 12 '17 at 09:36
  • Look at what the compiler does with `C in CharSet` in the implementation of `CharInSet`. It emits code like this: `Project86.dpr.10: Result := C in CharSet; 004051B1 8B45F8 mov eax,[ebp-$08] 004051B4 668B55FE mov dx,[ebp-$02] 004051B8 6681FAFF00 cmp dx,$00ff 004051BD 7709 jnbe $004051c8 004051BF 81E2FF000000 and edx,$000000ff 004051C5 0FA310 bt [eax],edx 004051C8 0F92C0 setb al 004051CB 8845F7 mov [ebp-$09],al` Look at that `cmp dx,$00ff`. You don't get that with your cast to `AnsiChar`. – David Heffernan Apr 12 '17 at 09:36
  • 2
    I'm not playing a game for heaven's sake. You made a mistake. I pointed it out. Try to get past your pride! You accuse me of "always being right" and being "drunk". Well, I think perhaps you need to look harder in the mirror. We all make mistakes. So what. If you can't admit it when you made a mistake, what are you? – David Heffernan Apr 12 '17 at 09:37
  • Now, knowing what I know now about `CharInSet`, I think it has value. If the sets elements are all ASCII then it is the cleanest way to write this code. So, I take back my first comment. I'd only stand by that if you needed to work with sets containing non ASCII characters. – David Heffernan Apr 12 '17 at 10:15
  • 1
    This is the place to Help those who are in need. Not to play PRIDE AND PRIJIDUCE.No one is forcing anyone to help or guide. Its your choice man. So be free and think free. – Raj Apr 12 '17 at 10:37
  • 3
    What made you think anybody was being forced to do anything. I pointed out the errors in this answer. Do what you will with that information. – David Heffernan Apr 12 '17 at 10:40