
I am trying to parse some simply formatted text and create a simple data structure of text/numeric records using numeric and short string values. I have debugged and located the problem; it's down to sscanf() not reading the values into my variables using a specific format string (other format strings in the program work well). I have created a simple text file to see what's happening.

Code is as follows:

char *idNumber = (char *)malloc(sizeof (char*));
  char *partNumber = (char *)malloc(sizeof (char*));
  int amountItems = 0;
  double unitPrice = 0;

  char *line1 = "Govan, Guthrie (N210) AX299 x 6 $149.94";
  char *line2 = "Mustaine, Dave (N106) AX350N x 2 $63.98";
  char *line3 = "Van Halen, Edward (N1402) AV2814 x 10 $34.90";


  sscanf(line1, "%*s, %*s (%s) %s x %d $%lf",  idNumber, partNumber,
     &amountItems, &unitPrice);

  printf("%s, %s, %d, %f\n", idNumber, partNumber, amountItems,     unitPrice);


  sscanf(line2, "%*s, %*s (%s) %s x %d $%lf", idNumber, partNumber,
     &amountItems, &unitPrice);
  printf("%s, %s, %d, %lf\n", idNumber, partNumber, amountItems,     unitPrice);

  sscanf(line3, "%*s, %*s (%s) %s x %d $%lf", idNumber, partNumber,
     &amountItems, &unitPrice);
  printf("%s, %s, %d, %lf\n", idNumber, partNumber, amountItems,     unitPrice);

I am interested in the following fields, with the rest being ignored. For instance, in record:

"Govan, Guthrie (N210) AX299 x 6 $149.94"

I want N210, AX299, 6, and 149.94 in my variables, in that order.

Result is as follows:

andrew@levin-Inspiron-3650:~/Desktop/schoolwork/project2$ ./a.out

, , 0, 0.000000
, , 0, 0.000000
, , 0, 0.000000

Expected output is:

N210, AX299, 6, 149.94
N106, AX350N, 2, 63.98
N1402, AV2814, 10, 34.90

Please share help!

This is not code directly from my program but a "helper" file I created on the side just to debug this one issue very simply without having to invoke the entire application!

The following similar code worked well for a different format: Record being:

N210 AX299 6 24.99

in following code:

struct record *current = malloc(sizeof(struct record *));
current->idNumber = (char *)malloc(sizeof (char *) * 8);
current->partNumber = (char *)malloc(sizeof (char *) * 10);
sscanf(line, "%s %s %d %lf", current->idNumber, current->partNumber,
 &(current->amountItems), &(current->unitPrice));

I do not expect this code to be a wealth of C beauty; I am a Java developer, and this is a C project for community college. But can you help me debug this one sscanf problem?

Thank you!

  • I thought you said “simple” but then I saw the input record and the bits wanted and not wanted and it got “not simple”. – Jonathan Leffler Dec 03 '17 at 02:21
  • You are creating all kinds of memory leaks with that code. If you `malloc` something, you have to `free` it. – David Hoelzer Dec 03 '17 at 02:25
  • They invented these things called functions which you use to avoid writing the same tricky code out three times. The common material goes in the function body; the variable stuff goes in the arguments. – Jonathan Leffler Dec 03 '17 at 02:25
  • duly noted david bowling, david hoelzer. can you help with the sscanf issue? – Andrew Levin Dec 03 '17 at 02:32
  • D. Bowling: I have fixed it and in gdb the values still show as not having been filled by sscanf(), even outside of the printf issue, which I think I have fixed anyway (replaced with new code in OP) – Andrew Levin Dec 03 '17 at 02:53
  • sorry D. bowling i now know, this was my first post... changed type to double still no warnings. fixed char pointer initialization to your code, still nothing – Andrew Levin Dec 03 '17 at 03:08

2 Answers

There is a problem with the dynamic allocation here. The line char *idNumber = (char *)malloc(sizeof (char*)); allocates space for a pointer to char, not for a char, or an array of chars. This should be something like:

char *idNumber = malloc(sizeof (char) * 256);

or:

char *idNumber = malloc(sizeof *idNumber * 256);

Note that there is no need to cast the result of malloc() in C. The second version is a very idiomatic way to do this in C. By avoiding use of an explicit type in the operand to sizeof, this is easier and less error-prone in the coding, and easier to maintain when types change during the life of the code. But, since sizeof (char) is always 1 in C, this might as well be:

char *idNumber = malloc(256);

No point in being stingy with allocations, and 256 gives plenty of space for input. And please remember to always check that allocation was successful before attempting to use allocated memory; and don't forget to free malloced memory when finished with it.

But, this is not causing the trouble. The problem is that the format string tells sscanf() to match a comma after the first string, yet in the input this comma has already been consumed by %*s, because %s reads up to the next whitespace character. The literal comma then fails to match, so sscanf() returns without storing anything. There is a further problem with (%s) consuming the ) at the end of the matched string, leaving no closing parenthesis to match in the format string. And because the %s conversion specifier reads strings up to a whitespace character, "Van Halen," would be consumed by %*s %*s, leaving "Edward" for an attempted match with (%s). These sorts of errors are detectable; one should always check the value returned by calls to scanf() family functions to be certain that input is as expected.

The scanset directive can be used to good effect here. This directive: %*[^(]( tells scanf() to match any characters until a ( is encountered, suppressing assignment, and then to match the ( itself before continuing. Then the %255[^)]) directive tells scanf() to match up to 255 characters, until a ) is encountered, storing the results in an array, and then to match the ) itself before continuing. Note the specification of a maximum width here to prevent buffer overflow, and note that room must be left for the \0 terminator which will always be added by scanf().

Here is a modified program which will work as expected:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *idNumber = malloc(256);
    if (idNumber == NULL) {
        perror("Allocation failure");
        exit(EXIT_FAILURE);
    }
    char *partNumber = malloc(256);
    if (partNumber == NULL) {
        perror("Allocation failure");
        exit(EXIT_FAILURE);
    }

    int amountItems = 0;
    double unitPrice = 0;

    char *line1 = "Govan, Guthrie (N210) AX299 x 6 $149.94";
    char *line2 = "Mustaine, Dave (N106) AX350N x 2 $63.98";
    char *line3 = "Van Halen, Edward (N1402) AV2814 x 10 $34.90";

    if (sscanf(line1, "%*[^(]( %255[^)]) %255s x %d $%lf",
               idNumber, partNumber, &amountItems, &unitPrice) < 4) {
        fprintf(stderr, "Input error in line1\n");
    } else {
        printf("%s, %s, %d, %f\n",
               idNumber, partNumber, amountItems, unitPrice);
    }

    if (sscanf(line2, "%*[^(]( %255[^)]) %255s x %d $%lf",
               idNumber, partNumber, &amountItems, &unitPrice) < 4) {
        fprintf(stderr, "Input error in line2\n");
    } else {
        printf("%s, %s, %d, %f\n",
               idNumber, partNumber, amountItems, unitPrice);
    }

    if (sscanf(line3, "%*[^(]( %255[^)]) %255s x %d $%lf",
               idNumber, partNumber, &amountItems, &unitPrice) < 4) {
        fprintf(stderr, "Input error in line3\n");
    } else {
        printf("%s, %s, %d, %f\n",
               idNumber, partNumber, amountItems, unitPrice);
    }

    free(idNumber);
    free(partNumber);

    return 0;
}

Program output:

N210, AX299, 6, 149.940000
N106, AX350N, 2, 63.980000
N1402, AV2814, 10, 34.900000
ad absurdum

Your format strings simply do not match the argument types. Rather than going one-by-one trying to point out every error here, the way to prevent this is to enable compiler warnings. For example if you're using GCC or Clang, add -Wall -Wextra -Werror to your compiler command. Then, the compiler will tell you everything you need to know about format string mismatches.

John Zwinck
  • Thank you! I have changed the code to accord with the warnings and the code is now edited and replaced in the original post... no warnings now except not using args to main(). Still I am getting unexpected output! Can you help? – Andrew Levin Dec 03 '17 at 02:50
  • @AndrewLevin: Fix ALL the errors! `sscanf(line1, "%*s, %*s (%s) %s x %d $%lf", idNumber, partNumber, &amountItems, &unitPrice);` will not compile if you use the options I told you. – John Zwinck Dec 03 '17 at 02:53
  • I edited the above code and it did compile & didnt give error. i used a simple printf for printing value of argc and one element of argv. compiles without warnings or errors, still same "empty" output – Andrew Levin Dec 03 '17 at 02:57
  • @AndrewLevin: OK, now your bug is that scanf() doesn't work the way you think it does. For how to make it stop at a delimiter, see: https://stackoverflow.com/questions/16014859/sscanf-until-it-reaches-a-comma - you'll want something like `%*[^,],`. – John Zwinck Dec 03 '17 at 03:23
  • `sscanf(line1, "%*[^(](%[^)])%s%*[^x]x%d%*[^$]$%f", idNumber, partNumber, &amountItems, &unitPrice);` Did the trick. All four fields parsed now. `N210, AX299, 6, 149.940002` is the output. Thank you!!! – Andrew Levin Dec 03 '17 at 03:47