0

I went through most of the topics about using strtol on input, but didn't find the exact issue I'm looking for.

Basically, I have to build a data structure in C (an m-ary tree), and we receive the vertex data from the lines of a text file.

For example, row number 7 represents vertex number 5, and the row may contain only non-negative integers (zero included), separated by exactly one whitespace character. These integers are the children of that row's vertex.

I'm of course using strtol to read the integer values one after another, but that doesn't cover all the scenarios that could appear.

The row can be invalid, for example "1.2.3". My strtol call gets the first value, 1, and then, moving on to the next one, treats it as 0, which is wrong.

Another issue: the number may not be an integer but a double, like 2.5. Again it is treated as 2 and 0, when it should be an invalid-input error.

So I thought of using strtok to split the row into fields by whitespace and then calling strtol on each one, but the row could be "2.5 3.5", and strtol would treat it as 2 0 3 0, which is incorrect.

And I'm not sure it's very elegant to use both strtol and strtok.

Can you explain some other way to handle this properly?

Assumptions:

  1. The maximum length of a row is 1024 chars.
  2. Every row represents a vertex, and the values are separated by a single whitespace.

What we can't assume:

  1. The values are not necessarily valid; we are required to validate that each value is an unsigned integer. For example, the input "1.2.3" is invalid.
  2. A duplicate vertex may appear in one line, and this is considered invalid input.

The whole handling of reading from files etc. has to work on both Windows and Linux. In case of invalid input, we have to print "Invalid input\n" to stderr and return EXIT_FAILURE.

HelpMe
  • 1
    What do you want your program to do in cases of invalid input? – CH. Nov 18 '19 at 11:02
  • 2
    TL:DR; Can you please provide some nicely formatted data in and out and some code to illustrate what you mean? – Weather Vane Nov 18 '19 at 11:02
  • 2
    Why does `strtol` fail to satisfy with input `"1.2.3"`? The function returns a pointer to the next character. When you look at that character, plainly the `'.'` is out of place. – Weather Vane Nov 18 '19 at 11:04
  • @user3121023 Yes - I actually would like to return a value and terminate the program... – HelpMe Nov 18 '19 at 13:09
  • @user3121023 One regular whitespace is allowed between any two integers. And what row at this index should actually represent? – HelpMe Nov 18 '19 at 13:22
  • It doesn't treat all the scenarios. We may get an input like: "5 4" which is valid. – HelpMe Nov 18 '19 at 13:25
  • 1
    @HelpMe: If you want a solution to "treat all the scenarios", you need to provide us with a description (table?) of "all the scenarios" and what the expected / desired outcome for each of these scenarios would be. Right now we're mostly guessing, as I did in my answer. For one, is a `double` valid input? (Then why are we talking about `strtol`?) Or not? (Then why do you mention it?) Is there a fixed number of entries expected per line of input? And so on. – DevSolar Nov 18 '19 at 14:41
  • Both `strtol` and `strtok` will fail to detect "delimited by only one whitespace". – chux - Reinstate Monica Nov 18 '19 at 15:33
  • @DevSolar Assumptions: 1. the maximum length of a row is 1024 chars. 2. Every row represents a vertex and every value separated by a single whitespace. What we can't assume: 1. The values are not necessarily valid, We are required to validate that the value we have is an unsigned integer. For example, the input "1.2.3" is invalid. – HelpMe Nov 18 '19 at 16:26
  • What is supposed to happen if there is invalid input? – DevSolar Nov 18 '19 at 16:33
  • @DevSolar We have to print to stderr: "Invalid input\n" and return EXIT_FAILURE value. – HelpMe Nov 18 '19 at 16:41
  • @DevSolar In addition, now the staff gives us other requirements: Duplicate vertex in one line may appear and it is considered as an invalid input. The whole handling of reading from files etc. has to fit in Windows and Linux. – HelpMe Nov 18 '19 at 16:48

2 Answers

2

The row can be invalid, for example "1.2.3". My strtol call gets the first value, 1, and then, moving on to the next one, treats it as 0, which is wrong.

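The value of endptr (the second parameter to strtol) tells you that there is a problem with your input. After strtol has parsed the 1, endptr points at the '.', which is neither a space nor the end of the string. If you call strtol again from there, no input is consumed and endptr is not advanced; that is where the 0 comes from. Check for both cases and issue an error message.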
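A minimal sketch of such a check, assuming the row is already in a string. The helper name `parse_unsigned` is hypothetical and not part of the answer above; it accepts a number only if strtol consumed at least one digit and stopped at a space or at the end of the string.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse one non-negative integer starting at s.
 * On success, stores the value in *out and returns a pointer to the
 * first character after the number. Returns NULL on invalid input. */
const char *parse_unsigned(const char *s, long *out)
{
    char *end;

    if (!isdigit((unsigned char)*s)) {   /* rejects sign, leading space, '.', empty string */
        return NULL;
    }
    errno = 0;
    long value = strtol(s, &end, 10);
    if (end == s || errno == ERANGE) {   /* nothing consumed, or value out of range */
        return NULL;
    }
    if (*end != ' ' && *end != '\0') {   /* e.g. the '.' in "1.2.3" or "2.5" */
        return NULL;
    }
    *out = value;
    return end;
}
```

For "1.2.3" and "2.5" the call returns NULL because strtol stops at the '.'. For "7 8" it stores 7 and returns a pointer to the space, so the caller can continue with the next field.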

Another issue: the number may not be an integer but a double, like 2.5. Again it is treated as 2 and 0, when it should be an invalid-input error.

Would a double be valid input? Then use strtod. If only integers are allowed, see first point -- check endptr and issue an error message.

DevSolar
2

Strategy:

  1. Remove trailing carriage return and or newline.
  2. Test for valid characters with strspn.
  3. Test for more than one space with strstr.
  4. Parse values with strtol.

Code:

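#include <stdio.h>
#include <stdlib.h>
#include <string.h>

//inside a function; row is a char array, file is an open FILE *
while ( fgets ( row, sizeof row, file)) {
    row[strcspn ( row, "\r\n")] = 0;//remove trailing carriage return and or newline
    size_t index = strspn ( row, " 0123456789");//index of first character that is not a space or digit
    if ( row[index]) {//not the terminating zero, so an invalid character was found
        fprintf ( stderr, "Invalid input\n");
        exit ( EXIT_FAILURE);
    }
    if ( strstr ( row, "  ")) {//only single spaces allowed
        fprintf ( stderr, "Invalid input\n");
        exit ( EXIT_FAILURE);
    }
    char *last = row;
    char *item = row;
    while ( *last) {//not the terminating zero
        long int value = strtol ( item, &last, 10);//last is set to the first unparsed character
        if ( item == last) {//nothing was consumed, e.g. only a trailing space is left
            break;
        }
        printf ( "%ld\n", value);
        item = last;
    }
}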
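
The question also asks for rejecting duplicates, and the loop above does not check for leading or trailing spaces or for out-of-range values. A possible variant that covers those cases is sketched below. It replaces the strspn/strstr pre-checks with a per-field check and is not the code from this answer; the name `parse_row` and its signature are hypothetical, and treating an empty row as "no children" is an assumption.

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: parse a row of non-negative integers separated by
 * exactly one space. Stores at most max_values numbers in values[].
 * Returns the number of values parsed, or -1 on invalid input
 * (bad character, extra space, out-of-range number, duplicate, too many). */
long parse_row(const char *row, long *values, size_t max_values)
{
    size_t count = 0;
    const char *p = row;

    if (*p == '\0') {
        return 0;                        /* assumption: an empty row means no children */
    }
    for (;;) {
        char *end;

        if (!isdigit((unsigned char)*p)) {
            return -1;                   /* leading, doubled or trailing space, sign, '.' */
        }
        errno = 0;
        long value = strtol(p, &end, 10);
        if (errno == ERANGE || count == max_values) {
            return -1;                   /* value too large, or no room left */
        }
        for (size_t i = 0; i < count; i++) {
            if (values[i] == value) {
                return -1;               /* duplicate vertex on the same row */
            }
        }
        values[count++] = value;
        if (*end == '\0') {
            return (long)count;          /* clean end of row */
        }
        if (*end != ' ') {
            return -1;                   /* e.g. the '.' in "2.5" */
        }
        p = end + 1;                     /* skip exactly one separator */
    }
}
```

The caller would strip the line ending as in the loop above, call `parse_row`, and on -1 print "Invalid input\n" to stderr and return EXIT_FAILURE.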
Roberto Caboni
user3121023