
I am trying to figure out the function atoi() from stdlib.h. As per K&R, it looks like the following:

int atoi(char s[]) {
   int n, i;

   n = 0;
   for (i = 0; s[i] >= '0' && s[i] <= '9'; ++i)
        n = 10 * n + (s[i] - '0');
   return n;
}

As I understand it, atoi() from stdlib.h should take a string of arbitrary characters as input and output only the digits, as in the following:

Code 1:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    printf("%i", atoi(" -123junk"));
    printf("%i", atoi("0"));
    printf("%i", atoi("junk"));         // no conversion can be performed
    printf("%i", atoi("2147483648"));   // UB: out of range of int
}

Output:

-123
0
0
-2147483648

However, in my program, I am trying to provide the string as input and get the digits only as output:

Code 2:

#include <stdio.h>
#include <stdlib.h>

int main() {
    int c, i;
    char s[i];
    for (i = 0; (c = getchar()) != '\n'; ++i)
       s[i] = c;
    s[i] = '\n';
    s[++i] = '\0';
    printf("%i", atoi(s));
}

When executing on machine:

pi@host:~/new$ cc new.c
pi@host:~/new$ a.out
21412421How it is
0

I am getting incorrect values as an output.

Questions:

1) In Code 1, printf("%i", atoi(" -123junk")) suggests that atoi() can receive a string as an argument and return a single integer that represents the concatenation of the numeric values of the digits in the input string. Is that right?

2) What does atoi(), from stdlib.h, return?

3) How can I fix main() in Code 2 so that it reads characters from stdin, writes them into the array, calls atoi() with the array as an argument, and outputs just the digits?

4) As per the K&R example of atoi(), "the expression (s[i] - '0') is the numeric value of the character stored in s[i]". But why do we need the 10 * n part, and why is n set to 0 before it? Since n is 0, 10 * n is 0, which seems to mean that 10 * n will always be zero in the assignment statement n = 10 * n + (s[i] - '0');. So why do we need it?

5) If atoi() returns one integer, how can printf("%i", atoi(" -123junk")); print the string of digits -123? In other words, do I understand it correctly: atoi() is called within printf() with " -123junk" as an argument. atoi() returns an integer, ONLY one integer, which is something like n = 10 * n + (s[i] - '0');. Then how can it be expanded into -123?

readonly
  • What do you think the size of the `s` array is? – Eugene Sh. Jan 16 '19 at 20:45
  • Why not use `fgets()` to read a line of input, instead of writing your own loop? – Barmar Jan 16 '19 at 20:57
  • @EugeneSh. I ran `sizeof(s)` and noticed it shows 0, but the point is: why does `atoi(s)` return 0? Judging by `printf("%i", atoi(" -123junk"))`, it looks like " -123junk" is the same kind of string as `s` in `atoi(s)`, isn't it? – readonly Jan 16 '19 at 20:58
  • The size of `s` is indeterminate, since you never assigned anything to `i` before declaring `char s[i]`. – Barmar Jan 16 '19 at 20:59
  • @readonly The main point here is that by using an array like that, your program has *undefined behavior* and may produce *any* result. – Eugene Sh. Jan 16 '19 at 21:01
  • You multiply by 10 to accumulate each digit in the next power of 10 in the final number. If the number is `123`, you start with `n = 0`. Then you multiply by 10 and add 1 to get 1. Then you multiply by 10 and add 2 to get 12. Multiply this by 10 and add 3 and you get 123. This is grade school arithmetic. – Barmar Jan 16 '19 at 21:07
  • @Barmar, but when the function `atoi` is called once, `n = 0`, then `n = 10 * n + (s[i] - '0');` runs. Then the loop ends and the call to this function from `main()` ends, and during the next call to the same function the compiler should read from top to bottom again, so the assignment `n = 0` would be seen again, right? – readonly Jan 16 '19 at 23:02
  • What are you talking about? `atoi` isn't called once for each digit. It's called once for the whole string, and it loops over the digits within the string. Before adding each digit into the result it multiplies by 10. – Barmar Jan 16 '19 at 23:23
  • The posted code will output everything jammed together on a single line. Suggest changing the format string from `"%i"` to `"%i\n"` in each call to `printf()` – user3629249 Jan 17 '19 at 05:51
  • BTW: the K&R book is decades old and doesn't really match the current C language – user3629249 Jan 17 '19 at 05:56
  • Suggest reading the man page for `atoi()`, as it will (amongst other things) tell you what that function returns – user3629249 Jan 17 '19 at 05:58
  • regarding: `int c, i; char s[i];` since the variable `i` has not been initialized to a known value, it is undefined behavior to access that variable when declaring the array `s[]` – user3629249 Jan 17 '19 at 06:01
  • @user3629249 What would you suggest as an alternative to K&R? I believe many of the examples given in this book are very easy to follow using just the book, and fast to learn from. – readonly Jan 17 '19 at 08:33
  • @Barmar Understood. – readonly Jan 17 '19 at 08:52
  • @user3629249 It looks like it's been updated somewhat, since the example he quoted uses modern parameter declarations rather than the original syntax. – Barmar Jan 17 '19 at 17:20
  • @readonly See [The Definitive C Book Guide and List](https://stackoverflow.com/questions/562303/the-definitive-c-book-guide-and-list) – Barmar Jan 17 '19 at 17:21

1 Answer

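Q1) Your version of atoi() is too simple: the standard version ignores leading whitespace characters and handles an optional sign before the number. It converts only the leading numeric part of the string into one int and stops at the first character that cannot be part of the number, so atoi(" -123junk") should evaluate to -123.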

Q2) atoi is a standard function declared in <stdlib.h> with the prototype int atoi(const char *s); it returns the converted value as a single int, or 0 if no conversion can be performed.
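
This also bears on Q5. As a minimal sketch (the helper name parse_and_increment is hypothetical and only for illustration), the value returned by atoi() is an ordinary int that can be used in arithmetic. It is printf("%i", ...) that later turns that one int back into the characters -, 1, 2, 3 for display:

```c
#include <stdlib.h>

/* Hypothetical helper, for illustration only: converts the leading
 * number in s with atoi() and adds one to it. */
int parse_and_increment(const char *s) {
    int n = atoi(s);    /* n is a single number, e.g. -123, not text */
    return n + 1;       /* arithmetic works because n is an int */
}
```

Under these assumptions, parse_and_increment(" -123junk") yields -122, and parse_and_increment("junk") yields 1 because atoi("junk") is 0.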

Q3) There are a few mistakes in Code 2:

  • You specify the size of the char array s as i, which is uninitialized. You should instead define the array with a reasonably large size such as 64.
  • You should test for potential buffer overflow in the loop.
  • You should check for EOF to stop the loop in case end of file is encountered without a newline.

Here is a modified version:

#include <stdio.h>
#include <stdlib.h>

int main() {
    char s[64];
    size_t i;
    int c;
    for (i = 0; i < sizeof(s) - 1 && (c = getchar()) != EOF;) {
       s[i++] = c;
       if (c == '\n')
           break;
    }
    s[i] = '\0';
    printf("%i", atoi(s));
    return 0;
}
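
As suggested in the comments, fgets() can replace the hand-written loop. The following is only a sketch: the helper name read_int_line and the buffer size of 64 are assumptions, and a return value of 0 cannot be told apart from a failed read or a line that does not start with a number.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads one line (at most 63 characters) from fp
 * and converts its leading number with atoi().
 * Returns 0 if nothing could be read. */
int read_int_line(FILE *fp) {
    char s[64];

    /* fgets() stops at a newline, at end of file, or when s is full,
     * and always null-terminates what it stored. */
    if (fgets(s, sizeof s, fp) == NULL)
        return 0;
    return atoi(s);
}
```

Calling read_int_line(stdin) from main() and printing the result with printf("%i\n", ...) would give 21412421 for the input line 21412421How it is.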

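Q4) The expression n = 10 * n + (s[i] - '0') is evaluated once for each new digit found in the string, and n keeps its value from one loop iteration to the next. n is 0 only until the first non-zero digit has been added; after that, 10 * n shifts the digits accumulated so far one decimal place to the left to make room for the next one. It is indeed slightly inefficient to multiply the current value by 10 as long as no non-zero digit has been encountered, but writing the function this way is simple.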

To avoid these useless multiplications, here is an alternative:

int atoi(const char *s) {
    int n = 0;
    size_t i = 0;

    while (s[i] == '0')
        i++;
    if (s[i] >= '1' && s[i] <= '9') {
        n = s[i++] - '0';
        while (s[i] >= '0' && s[i] <= '9')
            n = 10 * n + (s[i++] - '0');
    }
    return n;
}

But this function is more cumbersome and might actually be less efficient than the simple version. Try and benchmark both on your system.
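
To see how the simple loop builds the number one digit at a time, here is a small tracing sketch. The function name atoi_trace is hypothetical; it is the K&R loop with a printf() added, not the library implementation.

```c
#include <stdio.h>

/* Hypothetical tracing variant of the K&R loop: prints n after each
 * digit is folded in, then returns the final value. */
int atoi_trace(const char *s) {
    int n = 0;

    for (size_t i = 0; s[i] >= '0' && s[i] <= '9'; ++i) {
        n = 10 * n + (s[i] - '0');   /* shift left one decimal place, add digit */
        printf("after '%c': n = %d\n", s[i], n);
    }
    return n;                        /* loop stopped at the first non-digit */
}
```

For the string "123" the printed values of n are 1, 12 and 123. For "123242someletters12321" the loop ends at the letter s, so the result is 123242 and the trailing digits are never examined.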

For completeness, here is a full portable version using <ctype.h> that handles optional initial whitespace and an optional sign. It also handles overflow with defined behavior, although the standard version of atoi() is not required to do so.

#include <ctype.h>
#include <limits.h>
#include <stdio.h>

int atoi(const char *s) {
    int n = 0, d;

    /* skip optional initial white space */
    while (isspace((unsigned char)*s))
        s++;
    if (*s == '-') {
        /* convert negative number */
        s++;
        while (isdigit((unsigned char)*s)) {
            d = (*s++ - '0');
            /* check for potential arithmetic overflow */
            if (n < INT_MIN / 10 || (n == INT_MIN / 10 && -d < INT_MIN % 10)) {
                n = INT_MIN;
                break;
            }
            n = n * 10 - d;
        }
    } else {
        /* ignore optional positive sign */
        if (*s == '+')
            s++;
        while (isdigit((unsigned char)*s)) {
            d = (*s++ - '0');
            /* check for potential arithmetic overflow */
            if (n > INT_MAX / 10 || (n == INT_MAX / 10 && d > INT_MAX % 10)) {
                n = INT_MAX;
                break;
            }
            n = n * 10 + d;
        }
    }
    return n;
}

int main(int argc, char *argv[]) {
    int i, n;

    for (i = 1; i < argc; i++) {
        n = atoi(argv[i]);
        printf("\"%s\" -> %d\n", argv[i], n);
    }
    return 0;
}
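
When the caller needs to know whether a conversion actually happened or overflowed, the standard function strtol() from <stdlib.h> reports both. The wrapper below is only a sketch under stated assumptions: the name parse_int and its return convention are hypothetical, and trailing characters such as junk are accepted, as atoi() does.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: stores the leading integer of s in *out and
 * returns 1, or returns 0 if s has no leading number or the value
 * does not fit in an int. */
int parse_int(const char *s, int *out) {
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);     /* end points just past the last digit used */
    if (end == s)
        return 0;                /* no digits were converted */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;                /* out of range for long or for int */
    *out = (int)v;
    return 1;
}
```

With this sketch, parse_int(" -123junk", &n) succeeds and sets n to -123, while parse_int("junk", &n) fails instead of silently producing 0.
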
chqrlie
  • Note: `if (s[i] >= '1' && s[i] <= '9') { n = s[i++] - '0';` not needed in last code - can be dropped. – chux - Reinstate Monica Jan 16 '19 at 21:22
  • @chqrlie I have just tried to run the code you posted and noticed that when I use a string like `"123242someletters12321"` as input, I get only the first digits as output. Is that expected? – readonly Jan 16 '19 at 23:05
  • @readonly - what happens when the test condition in `while (s[i] >= '0' && s[i] <= '9')` is no longer `TRUE`? – David C. Rankin Jan 16 '19 at 23:19
  • @chux: of course the code you mention can be dropped, but it does save a multiplication, which is precisely the purpose of this alternate function. – chqrlie Jan 17 '19 at 04:11
  • regarding: `while (s[i] == '0')` what if the leading characters were white space? what if the leading character was '+'. what if the leading character was '-'? – user3629249 Jan 17 '19 at 06:10
  • regarding: `while (s[i] >= '0' && s[i] <= '9')` This would be much better written as: `#include <ctype.h>` and `while( isdigit( s[i] ) )` – user3629249 Jan 17 '19 at 06:12
  • @DavidC.Rankin I suppose, the loop ends, hence, as per mentioned input,"123242someletters12321", everything after "123242" would be omitted, right? – readonly Jan 17 '19 at 08:25
  • @user3629249: yes, `<ctype.h>` macros could be used, but the correct way to do so is to write `while( isdigit((unsigned char)s[i]))` as `isdigit()` is only defined for a subset of type `int` comprising the values of `unsigned char` and the value `EOF`. Passing it negative `char` values has undefined behavior. – chqrlie Jan 18 '19 at 05:49