using strtol on a string literal causing segmentation fault

Question

I have a string that I get by getline() (more precisely I use a loop and getline to read a file line by line)

Let's say the line is 12|34|

Then I use strtok() to cut it down by substr = strtok(line, "|"); and store them into a string array with a loop, part[index] = substr;

So the part[0] here should be "12" and part[0] is "34" I would like to use strtol, but I have checked that it can't be used on string literal, then I try following code.

char *temp = strdup(part[1]);
char **ptr;
long ret = strtol(temp, ptr, 10);
printf("%x\n", ret);

and when I read the second line, it causes segmentation fault. By how can I really use strtol to convert the string into integer

Side note: At no point did you call `strtol` on a string literal, so I'm not sure where your title came from. Your original contention that `strtol` can't be called on string literals is wrong. — ShadowRanger, Nov 02 '18 at 10:38
While all the observations about the wrong usage of ptr and strtol are right, I doubt this will cause a segmentation fault. Can you post a more complete version of your code? Especially the definition of part[] ? — GermanNerd, Nov 02 '18 at 12:28
@GermanNerd if you pass a non-null pointer to `strtol` it will try to write at this address, so yes, it's very likely to segfault (well, if your system has an MMU) — Jean-François Fabre, Nov 02 '18 at 12:45
@Jean-François Fabre The OP's code declared a pointer to char* . While uninitialized, it is allocated on the stack, and if the compiler accepts the implicit conversion from pointer to char* to char* (as gcc does), strtol() can write to it. Only when dereferencing the uninitialized pointer could you create a segfault. — GermanNerd, Nov 02 '18 at 12:52
yes, the `**ptr` pointer is allocated, but `strtol` _dereferences it_ (reads its value) to write something there (it's a pointer on pointer). So it's undefined behaviour. The value of `ptr` isn't valid for a pointer. Of course `&ptr` is initialized, but that's not the point.* — Jean-François Fabre, Nov 02 '18 at 12:53
@Jean-François Fabre I think we have a misunderstanding here. The problem is in the call to strtol(), where the '&' is missing. So yes, the OP's code could segfault if the newly allocated ptr does not happen to contain null. But the missing initialization in itself is not the problem. — GermanNerd, Nov 02 '18 at 13:04
It's missing an `&` so if you add `&` you have to declare `char *ptr` and initialize it. — Jean-François Fabre, Nov 02 '18 at 13:10
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/183008/discussion-between-germannerd-and-jean-francois-fabre). — GermanNerd, Nov 02 '18 at 13:11
Too bad. strtol wants to store a pointer to char. So we have to give it a pointer to pointer to char, right? As long as the compiler accepts the conversion (and on modern machines, usually all data pointers are the same), even smth. like char ****ptr; .... strtol(....&ptr...) would work. And if you do it the normal way, with a single char *, what would you initialize it to? The pointer's contents will be overwritten by strtol() anyhow. — GermanNerd, Nov 02 '18 at 13:23
@GermanNerd: You *don't* initialize the single `char*`. That's the point. `strtol` sets it for you when you pass the address of that `char*` to it; it doesn't read what's at the destination, so the initial value doesn't matter. As for using `char**` or more and passing `&thatvar`, just because all pointers are the same on most systems doesn't mean it's correct to do it. It needs a `char**`, you must pass the address of an existing `char*`, or you're in undefined behavior territory. Sure, most compilers accept all sorts of undefined behavior (though often with warnings), but that's no excuse. — ShadowRanger, Nov 02 '18 at 14:44
@ShadowRanger I agree. It actually is partly rewording what I wanted to say: As long as you use &ptr in strtol(), it matters nothing if the pointer is uninitialized. To your second point with implicit conversions: If the compiler just throws a warning, I interpret that such that the compiler can and will do the conversion for you. Of course, you could always do an explicit cast, especially to a char pointer. Warnings gone. Don't get me wrong: The OP's code was wrong on two points, but only one of them was critical: the missing '&' in strtol. — GermanNerd, Nov 02 '18 at 14:51

score 3 · Answer 1 · answered Nov 02 '18 at 10:32

the issue is that ptr isn't initialised. So when strtol tries to write at the address ptr, it crashes (or undefined behaviour).

You have to pass the valid address of a pointer to store the last unprocessed char, like:

char *ptr;
long ret = strtol(temp, &ptr, 10);

&ptr is valid and points to the auto variable storage location to ptr

score 3 · Answer 2 · answered Nov 02 '18 at 10:34

You're misusing strtol. It takes a char** because it intends to set the char* it points to, per the man page:

If endptr is not NULL, strtol() stores the address of the first invalid character in *endptr.

By passing it an uninitialized char** you invoke undefined behavior when it tries to dereference it. Change the code to:

char *ptr;  // Place to put the end ptr
long ret = strtol(temp, &ptr, 10);  // Pass address of the location it can set

Alternatively, if you never use ptr, just do:

long ret = strtol(temp, NULL, 10);  // Don't care about end ptr; strtol won't set on NULL

score 3 · Answer 3 · answered Nov 02 '18 at 10:36

char **ptr;
long ret = strtol(temp, ptr, 10);

is wrong. ptr is not initialized and doesn't refer to anything useful.

The second parameter of strtol() must refer to the address of an actual char * value that stores the address of the first non-converted character. Per 7.22.1.3 The strtod, strtof, and strtold functions of the C standard:

A pointer to the final string is stored in the object pointed to by endptr, provided that endptr is not a null pointer.

The proper code would be

char *endptr;
long ret = strtol(temp, &endptr, 10);

or

char *endptr;
char **ptr = &endptr;
long ret = strtol(temp, ptr, 10);

In this case, after the call to strtol() the value in endptr would be the address of the first character that wasn't converted to the resulting long value.

If you don't care what the first non-converted character is:

char **ptr;
long ret = strtol(temp, NULL, 10);

Tezirg · Answer 4 · 2018-11-02T11:08:16.493

2

Here you already have separated your string. So each string contains one long number. The second argument is used to know where the conversion stopped in the string. If you don't need it, pass in NULL

char *temp = strdup(part[1]);
long ret = strtol(temp, NULL, 10);
printf("%lx\n", ret);

In addition printf for long number requires different format flags. Here lx for long hexadecimal.

edited Nov 02 '18 at 11:08

answered Nov 02 '18 at 10:35

Tezirg

1,629
1
10
20

The second argument doesn't *limit* the size of the conversion, it *reports* where it ended. Big difference. – ShadowRanger Nov 02 '18 at 10:50
Yes, I don't know how to read a man apparently. My bad, updated. – Tezirg Nov 02 '18 at 11:10

score 2 · Answer 5 · answered Nov 02 '18 at 10:40

Function strtol(const char *str, char **str_end, int base); will dereference str_end and will do something like *str_end = one_after_end_of_parsed_long; So when you pass a pointer of type char**, which does not point to a valid pointer object that can be modified by strtol then, you'll yield undefined behaviour.

You'd rather write

char *ptr;  // space for taking on a pointer value
long ret = strtol(temp, &ptr, 10);

or (not the preferred variant):

char **ptr = malloc(sizeof(char*));
long ret = strtol(temp, ptr, 10);
...
free(*ptr);

score 0 · Answer 6 · answered Nov 02 '18 at 17:26

There is absolutely no need to use strtok() at all, because strtol() sets the pointer pointed by the second parameter to point to the character following the number parsed.

A complete example program:

#define  _POSIX_C_SOURCE  200809L
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>

int main(void)
{
    char    *line_ptr = NULL;
    size_t   line_max = 0;
    ssize_t  line_len;

    long    *number      = NULL;
    size_t   numbers     = 0;
    size_t   numbers_max = 0;

    char    *curr, *next, *ends;
    long     temp;
    size_t   i;

    while (1) {

        line_len = getline(&line_ptr, &line_max, stdin);
        if (line_len < 1)
            break;

        curr = line_ptr;
        ends = line_ptr + line_len;

        numbers = 0;
        while (1) {

            /* Parse next long. */
            next = curr;
            errno = 0;
            temp = strtol(curr, &next, 0);
            if (errno)
                break;
            if (next == curr)
                break;

            /* Need to grow number array first? */
            if (numbers >= numbers_max) {
                size_t  temp_max = (numbers | 1023) + 1025 - 16;
                long   *temp_ptr;

                temp_ptr = realloc(number, temp_max * sizeof number[0]);
                if (!temp_ptr) {
                    fprintf(stderr, "Out of memory.\n");
                    exit(EXIT_FAILURE);
                }

                numbers_max = temp_max;
                number      = temp_ptr;
            }

            /* Save parsed number. */
            number[numbers++] = temp;

            /* Skip trailing whitespace, */
            curr = next;
            while (curr < ends && (*curr == '\t' || *curr == '\n' || *curr == '\v' ||
                                   *curr == '\f' || *curr == '\r' || *curr == ' '))
                curr++;

            /* Skip separator. */
            if (*curr == '|')
                curr++;
            else
                break; /* No separator, so that was the final number. */
        }

        printf("Parsed %zu longs:", numbers);
        for (i = 0; i < numbers; i++)
            printf(" %ld", number[i]);
        printf("\n");
        fflush(stdout);
    }

    if (ferror(in)) {
        fprintf(stderr, "Error reading standard input.\n");
        exit(EXIT_FAILURE);
    }

    free(line_ptr);
    line_ptr = NULL;
    line_max = 0;

    free(number);
    number = NULL;
    numbers = 0;
    numbers_max = 0;

    return EXIT_SUCCESS;
}

Other than the available memory, this program has no limits wrt. line length or the amount of numbers it stores in the array. The growth policy for the number array is funky (just my style); feel free to replace it with anything you prefer. Just make sure temp_max is at least numbers + 1. Making it larger means you allocate more at once, and therefore do fewer "slow" realloc() calls.

The outer while loop iterates over lines read from standard input.

The inner while loop parses the longs from that line, separated by a pipe character |. strtol() ignores leading whitespace. In case there is whitespace between the number and the following pipe character, we need to explicitly skip that; you could also use just while (curr < ends && isspace(*curr)) curr++; for that.

If you want to collect all the longs into a single array, rather than per line, just omit the numbers = 0; before the inner while loop. (And move printing out the numbers after the outer while loop.)

The actual conversion,

next = curr;
errno = 0;
temp = strtol(curr, &next, 0);
if (errno)
    break; /* errno == ERANGE; number too large in magnitude! */
if (next == curr)
    break; /* end of input, no number */

relies on the fact that if the number to be converted is too large in magnitude, strtol() will set errno = ERANGE and return LONG_MIN (if the number in the string was negative) or LONG_MAX (if positive). To detect that, we must set errno to zero first. If the string is empty (or there is a stray nul char, \0, in the line), strtol() will return 0 with next == curr.

using strtol on a string literal causing segmentation fault

6 Answers6