2

I am trying to replicate the strcmp() function from the string.h library and here is my code

/**
 * string_compare - this function compares two strings pointed
 * by s1 and s2. Is a replica of the strcmp from the string.h library
 * @s1: The first string to be compared
 * @s2: The second string to be compared
 *
 * Return: On success, it returns:
 * 0 if s1 is equal to s2
 * negative value if s1 is less that s2
 * positive value if s1 is greater than s2
 */

int string_compare(char *s1, char *s2)
{
        int sum = 0, i;

        for (i = 0; s1[i] != '\0' && s2[i] != '\0'; i++)
                sum += (s1[i] - s2[i]);
        for ( ; s1[i] != '\0'; i++)
                sum += (s1[i] - 0);
        for ( ; s2[i] != '\0'; i++)
                sum += (0 - s2[i]);

  

        return (sum);
}

I tried my function using this sample code:

#include <stdio.h>

int main(void)
{
    char s1[] = "Hello";
    char s2[] = "World!";

    printf("%d\n", string_compare(s1, s2));
    printf("%d\n", string_compare(s2, s1));
    printf("%d\n", string_compare(s1, s1));
    return (0);
}

And I get the following output,

-53
-500
0

But I should be getting:

-15
15
0

Why am I getting such a result??

Vlad from Moscow
  • 301,070
  • 26
  • 186
  • 335
Leuel Asfaw
  • 316
  • 1
  • 14
  • 1
    Have you stepped through your code with a debugger? – Stephen Newell Aug 12 '22 at 17:16
  • 4
    Your approach does not make much sense. Why do you compare all characters and don't stop at first difference? With your function you would get 0 for `"ab"` and `"ba"` or for `"bbb"` and `"abc"` – Gerhardh Aug 12 '22 at 17:17
  • I expect this to crash on any strings that are different lengths, because the conjunction of the test, `s1[i] != '\0' && s2[i] != '\0'`, allows buffer overrun. Also, you seem to be making a checksum, which fails to differentiate "ab" from "ba". – Neil Aug 12 '22 at 18:57
  • Regardless which approach you take, you need to check `s1` and `s2` for `NULL` before you start iterating over each. E.g. `if (!s1 && !s2) return 0; if (s1 && !s2) return 1; if (!s1 && s2) return -1;` – David C. Rankin Aug 12 '22 at 21:33
  • `string_compare()` will give bizarre results if one of the parameters happens to be a null pointer... Look into `assert()`, as a minimum, to trap these errors during development. – Fe2O3 Aug 12 '22 at 23:11

3 Answers3

3

This approach is incorrect.

Let's assume that the first string is "B" and the second string is "AB".

It is evident that the first string is greater than the second string in the lexicographical order.

But the result will be negative due to this for loop

    for ( ; s2[i] != '\0'; i++)
            sum += (0 - s2[i]);

though the function shall return a positive value.

Moreover there can occur an overflow for the variable sum of the type int.

Also the function should be declared at least like

int string_compare( const char *s1, const char *s2);

because passed strings are not changed within the function.

The function can be defined the following way

int string_compare( const char *s1, const char *s2 )
{
    while ( *s1 && *s1 == *s2 )
    {
        ++s1;
        ++s2;
    }

    return ( unsigned char )*s1 - ( unsigned char )*s2;
} 
Vlad from Moscow
  • 301,070
  • 26
  • 186
  • 335
  • I am sorry, but I am trying to understand your code,and what do you mean when you put `while ( *s1 && *s1 == *s2)`. I am new to pointers, and programming in general so, can you please explain it for me?? @Vlad from Moscow – Leuel Asfaw Aug 12 '22 at 18:02
  • @LeuelAsfaw `while (*s1)` means *"while we're not at the end of string 1"*, and `while (*s1 == *s2)` means *"while the current character of string 1 is equal to the current character of string 2"*. Put it all together, and you have *"while (we're not at the end of string1 AND the current characters are equal) { increment the pointers to point to the next character }"* – user3386109 Aug 12 '22 at 18:11
  • 1
    @chqrlie You are right. I was not attentive.:) – Vlad from Moscow Aug 12 '22 at 20:13
3

You are overcomplicating very simple function.

#define UC unsigned char

int mystrcmp(const char *s1, const char *s2)
{
    int result;
    while(!(result = (UC)*s1 - (UC)*s2++) && *s1++);
    return result;    
}

0___________
  • 60,014
  • 4
  • 34
  • 74
0

Strings in C are arrays of characters terminated with a null character (\0).

When you pass a string to a function, you are passing a pointer to its first element. That pointer is passed by value. You can modify that pointer within the function without any side-effects on the string it points to, as long as you don't dereference and assign to the address it points to.

That's why the pointer math from 0___________'s answer works.

int mystrcmp1(const char *s1, const char *s2) {
    int result = 0;
    while(!(result = *s1 - *s2++) && *s1++);
    return result;     
} 

*s1++ could be rewritten as *(s1++) to disambiguate. s1++ returns the current pointer to the beginning of the first string, and then increments the pointer so it points to the next character. That pointer is then dereferenced to give us the character. The same happens with the s2 pointer.

Then we're comparing them by subtraction. If they're the same, we get 0, which in C is false in a boolean context. This result is assigned to result.

We can now see that the loop continues while corresponding characters in the two strings are equal and while dereferencing s1 does not give us the null terminator.

When the loop continues it means there was either a difference or we reached the end of the first string.

The difference will be stored in result, which the function returns.

0___________
  • 60,014
  • 4
  • 34
  • 74
Chris
  • 26,361
  • 5
  • 21
  • 42