-2

I have to reimplement the memcmp() function in C, and my function works as expected: it returns the difference between the first pair of bytes that do not match in the two strings.

My function:

int ft_memcmp(const void *s1, const void *s2, size_t n)
{
    unsigned char   *ptr1;
    unsigned char   *ptr2;

    ptr1 = (unsigned char *)s1;
    ptr2 = (unsigned char *)s2;
    while (n && (*ptr1 == *ptr2))
    {
        ptr1++;
        ptr2++;
        n--;
    }
    if (n == 0)
        return (0);
    else
        return (*ptr1 - *ptr2);
}

My main:

#include <stdio.h>
#include <string.h>

int main(void)
{
    const char  *s1 = "acc";
    const char  *s2 = "abc";
    int         res;

    res = memcmp(s1, s2, 3); 
    printf("%i\n", res);

    return (0);
}

This main prints 256, but if I call my function (ft_memcmp) instead, I get 1. Obviously the difference is 1 and not 256, so why does the original function return 256? With a difference of 2, I get 512...

S.S. Anne
Kbam7
  • Note: `return` is a statement, not a function. The parentheses you use are part of the expression, not the statement, and the statement does not require them. They are not recommended for simple expressions, as they make the code harder to read. – too honest for this site May 12 '16 at 15:35
  • Okay, fair comment. I totally agree. However, my college uses something called the "Norm", and it requires us to put parentheses around the expression when using 'return', as well as spaces between keywords and parentheses. For example, 'while ()' and not 'while()'.... – Kbam7 May 12 '16 at 15:38
  • @Olaf that is your personal opinion. Let us focus on the technical aspects of the question. That said, `return (0)` is rarely seen in practice, and I agree it should read `return 0`. – 4pie0 May 12 '16 at 15:48
  • `*ptr1 - *ptr2` can give a result with the wrong sign on platforms where `unsigned char` and `unsigned` have the same size (e.g. some graphics processors), because the unsigned subtraction wraps around. `(*ptr1 > *ptr2) - (*ptr1 < *ptr2)` is an idiomatic alternative. – chux - Reinstate Monica May 12 '16 at 16:03
  • @chux - So would you recommend I use square bracket notation instead, e.g. `ptr1[i] - ptr2[j]`? Would that be a viable alternative as well? – Kbam7 May 12 '16 at 16:10
  • `*ptr1 - *ptr2` and (keeping the pointers constant, but incrementing an index) `ptr1[some_int] - ptr2[some_int]` are the same. So I do not see that as an improvement over `(*ptr1 > *ptr2) - (*ptr1 < *ptr2)`. – chux - Reinstate Monica May 12 '16 at 16:23
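
A minimal sketch of how the comparison idiom from the comments above could be dropped into the question's function. It keeps the name `ft_memcmp` and the Norm-style `return (...)` from the question; the rest is an illustration rather than anyone's actual submission. Note that it only ever returns -1, 0, or 1, which memcmp() allows but does not require.

```c
#include <stddef.h>

/* Sketch: the question's ft_memcmp with the subtraction replaced by the
 * (a > b) - (a < b) idiom, so the sign is right even where unsigned char
 * and unsigned int have the same width. */
int ft_memcmp(const void *s1, const void *s2, size_t n)
{
    const unsigned char *p1 = s1;
    const unsigned char *p2 = s2;

    while (n--)
    {
        /* Stop at the first pair of bytes that differ. */
        if (*p1 != *p2)
            return ((*p1 > *p2) - (*p1 < *p2));
        p1++;
        p2++;
    }
    /* All n bytes matched (or n was 0). */
    return (0);
}
```

With the strings from the question, `ft_memcmp("acc", "abc", 3)` yields 1 and swapping the arguments yields -1.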

2 Answers

4

Why does memcmp() return 256 for a difference of 1?

First, as answered by @where_is_tftp, memcmp() is only required to return 0, some positive number, or some negative number as the result of a compare.

The 256 instead of 1 is almost certainly the result of optimization.

A good memcmp() does its comparisons using types wider than char whenever it can.

Example: after considering alignment and the overall length, the first compare (using a 32-bit unsigned type) sees a difference not in the 1's position (bit 0) but in the 256's position (bit 8). Since returning 256 is just as valid as returning 1, there is no need to simplify. Remember that memcmp() is platform-specific, and its implementation can do things that portable C code cannot, like safely accessing memory outside the array. Other details are omitted here.

    byte 3         2        1        0
    don't care    'c'      'c'      'a'
  - don't care    'c'      'b'      'a'
    -----------------------------------
  = don't care     0        1        0
  &    0          0xFF     0xFF     0xFF
    -----------------------------------
  =    0           0        1        0   --> 256
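
The diagram can be reproduced with a short sketch. This is an illustration only, not how any particular C library implements memcmp(): the names `pack_le` and `word_diff` are hypothetical, the bytes are packed with byte 0 in the low 8 bits (the layout a little-endian 32-bit load would give), and `int` is assumed to be at least 32 bits wide. The result is never negative, so this is not a valid memcmp() replacement; it only shows where a value like 256 can come from.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack up to 4 bytes into a 32-bit word, with
 * byte 0 in bits 0-7, byte 1 in bits 8-15, and so on. */
static uint32_t pack_le(const unsigned char *p, size_t n)
{
    uint32_t w = 0;
    size_t i;

    for (i = 0; i < n && i < 4; i++)
        w |= (uint32_t)p[i] << (8 * i);
    return w;
}

/* Illustration only: subtract the two words and mask off the
 * "don't care" byte, returning the raw difference unsimplified. */
int word_diff(const void *s1, const void *s2, size_t n)
{
    uint32_t w1 = pack_le(s1, n);
    uint32_t w2 = pack_le(s2, n);

    return (int)((w1 - w2) & 0x00FFFFFFu);
}
```

Here `word_diff("acc", "abc", 3)` gives 256, because the differing byte sits in bits 8-15, and `word_diff("adc", "abc", 3)` gives 512, matching the values reported in the question.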
Community
chux - Reinstate Monica
1

I cannot find any statement about the exact value of a non-zero return from memcmp besides its sign (on either the man pages or the Open Group pages):

The sign of a non-zero return value shall be determined by the sign of the difference between the values of the first pair of bytes (both interpreted as type unsigned char) that differ in the objects being compared.

RETURN VALUE

The memcmp() function shall return an integer greater than, equal to, or less than 0, if the object pointed to by s1 is greater than, equal to, or less than the object pointed to by s2, respectively.

Nothing here says that it is the numeric difference between the characters.
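
Since only the sign is specified, portable code should test the result against 0 rather than against a specific value. A small sketch, assuming a hypothetical wrapper named `memcmp_sign` that collapses whatever memcmp() returns to -1, 0, or 1:

```c
#include <string.h>

/* Hypothetical helper: reduce any memcmp() result to its sign, so that
 * results from different C libraries can be compared directly. */
int memcmp_sign(const void *s1, const void *s2, size_t n)
{
    int r = memcmp(s1, s2, n);

    return (r > 0) - (r < 0);
}
```

Whether the underlying memcmp() returns 1 or 256 for "acc" versus "abc", `memcmp_sign("acc", "abc", 3)` is 1.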

4pie0
  • "difference between the values of the first pair of bytes ... that differ in the objects being compared." It states "difference between the values". How else would I find a difference if not with decimals/integers? – Kbam7 May 12 '16 at 15:48
  • @Kbam7 the [hamming distance](https://en.wikipedia.org/wiki/Hamming_distance) maybe, or [Levenshtein distance](https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance) – 4pie0 May 12 '16 at 15:50
  • @Kbam7: 1) C does not have "decimal" variables. No wonder, as the language relies on binary computers. Decimal is just an input/output representation. `char` and related types are also integer types. – too honest for this site May 12 '16 at 15:52
  • Yes, I understand that char and related types are integer types, i.e. 'a' == 97. But can anyone tell me why I am getting '256' when I use memcmp on Linux, and why my friend gets '1' when he uses memcmp on his Mac? We both used the same code but get different answers. – Kbam7 May 12 '16 at 15:58
  • @Kbam7 "How else would I find a difference if not with decimals/integers? " --> with [compare](http://stackoverflow.com/questions/37191355/why-does-memcmp-return-256-for-a-difference-of-1#comment61916751_37191355). – chux - Reinstate Monica May 12 '16 at 16:05
  • @Kbam7 because the exact integer value returned is not defined, your C implementation (compiler, library) may return 256 while the implementation of memcmp on your friend's machine may return 1. Both are greater than zero, which is all that is required. – 4pie0 May 12 '16 at 21:28