7

The following function uses int as the type of its second argument:

void *memchr(const void *buf, int ch, size_t count);

However, that argument is used as a character. Why is the function defined to take an int rather than a char? Is there a special reason for this?

Peter O.
misteryes
  • 2
    You might also want to read this which is basically the same question. http://stackoverflow.com/questions/5919735/why-does-memset-take-an-int-instead-of-a-char – tangrs Apr 03 '13 at 21:45

4 Answers

8

This is because memchr is a very "old" standard function that has existed since the earliest days of the C language.

Old versions of C did not have function prototypes. Functions were either left undeclared or declared with an "unknown" parameter list, e.g.

void *memchr(); /* non-prototype declaration */

When calling such functions, all arguments were subjected to the default argument promotions, which means that such functions never received argument values of type char or short. Such arguments were always implicitly promoted by the caller to type int, and the function itself actually received an int. (This remained true through C17 for functions declared as shown above, i.e. without a prototype; C23 removed non-prototype declarations.)

When the C language eventually developed to the point where function prototypes were introduced, it was important to align the new declarations with the legacy behavior of standard functions and with already compiled legacy libraries.

This is the reason why you will never see such types as char or short in argument lists of legacy function declarations. For the very same reason, you won't see type float used there either: float arguments were promoted to double.


This also means that if for some reason you have to provide a prototype declaration for some existing legacy function defined in K&R style, you have to remember to specify the promoted parameter types in the prototype. E.g. for the function defined as

int some_KandR_function(a, b, c)
char a;
short b;
float c;
{
}

the proper prototype declaration is actually

int some_KandR_function(int a, int b, double c);

but not

int some_KandR_function(char a, short b, float c); // <- Incorrect!
AnT stands with Russia
1

All of the standard functions that deal in characters do. I think the reason is partly historical (in some pre-standard versions of C, a function couldn't take a char or unsigned char argument, just like varargs arguments can't have character type) and partly for consistency across all such functions.

There are a few character-handling functions that have to use int in order to allow for the possibility of EOF, but memchr isn't one of them.

Steve Jessop
  • Did you really mean "EOL"? – Keith Thompson Apr 03 '13 at 21:47
  • 1
    No, it's `EOF`, a (necessarily negative, commonly -1) value which in-band signals an end of file. That's why e.g. `fgetc` returns `int`, or `isalnum` takes one: both work with `unsigned char` and `EOF`, the latter not being representable as an `unsigned char`. – Masklinn Apr 26 '22 at 14:00
0

Because char, signed char, and unsigned char are three distinct types. On a specific implementation you don't know whether char is signed or unsigned.

Except in rare systems (see Keith's note), the type int comprises all the values those three types can have.

pmg
  • On most systems. If `sizeof (int) == 1` (implying `CHAR_BIT >= 16`), then `int` may not be able to hold all values of type `unsigned char`. Such systems are rare. – Keith Thompson Apr 03 '13 at 21:46
  • @pmg That's why I completed "char, signed char and unsigned char" with EOF. –  Apr 03 '13 at 21:48
0

C never passes arguments smaller than an int to a function. A function argument of type char or short is always promoted to an int, and an argument of type unsigned char or unsigned short is always promoted to an unsigned int.
If the unsigned vs signed char were the issue, they would have used a short (I don't recall a single platform where a short is 1 byte long).
And of course using an int handles the unsigned vs signed char possibility as memchr is only comparing equality.

Marcelo Pacheco
  • That is not true. If you open godbolt and change the type of the default function (square) from `int` to `char`, you can see that the assembly changes from copying a DWORD to copying a BYTE, followed by two "move with sign extend" instructions that fill extended registers from bytes before performing the `imul`. Same story with `-O`: the `int` version self-multiplies `edi` as-is, while the `char` version performs a sign-extended copy of `dil` (the low 8 bits of `DI`) to `EAX`, which it self-multiplies. – Masklinn Apr 26 '22 at 13:55