Using this structure:

struct sProduct {
  int code;
  char description[40];
  int price;
};

I want to read a txt file with this format:

1,Vino Malbec,12

where the format is: code,description,price. But I'm having trouble reading the description when it contains a space.

I tried this:

fscanf(file,"%d,%[^\n],%d\n",&p.code,&p.description,&p.price);

The code is saved correctly, but description ends up containing Vino Malbec,12 when I only want it to contain Vino Malbec, because 12 is the price.

Any help? Thanks!

Mati Tucci

2 Answers

The main issue is "%[^\n],". The "%[^\n]" scans in every character except '\n', so description also scans in the ',' and everything after it. The code needs to stop scanning into description when a comma is encountered.

With line-oriented file data, first read one line at a time.

char buf[100];
if (fgets(buf, sizeof buf, file) == NULL) Handle_EOForIOError();

Then scan it. Use %39[^,] so the scan stops at ',' and stores at most 39 char, leaving room for the terminating null character in description[40].

int cnt = sscanf(buf,"%d , %39[^,],%d", &p.code, p.description, &p.price);
if (cnt != 3) Handle_IllFormattedData();

Another nifty trick is: use " %n" to record the end of parsing.

int n = 0;
sscanf(buf,"%d , %39[^,],%d %n", &p.code, p.description, &p.price, &n);
if (n == 0 || buf[n]) Handle_IllFormattedData_or_ExtraData();
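
The two steps above could be combined into a single helper, sketched below. The `Product` typedef and the `parse_product_line` name are assumptions made for illustration; they do not appear in the question's code.

```c
#include <stdio.h>

/* Hypothetical record type mirroring the question's struct. */
typedef struct {
    int code;
    char description[40];
    int price;
} Product;

/* Parse one "code,description,price" line that was read with fgets().
   Returns 1 on success, 0 if the line is malformed or has extra data. */
int parse_product_line(const char *line, Product *out)
{
    Product tmp = {0};
    int n = 0;

    /* " %n" skips trailing white-space (including '\n') and records how
       many characters were consumed. n stays 0 if parsing stops early. */
    sscanf(line, "%d , %39[^,],%d %n",
           &tmp.code, tmp.description, &tmp.price, &n);

    if (n == 0 || line[n] != '\0')
        return 0;

    *out = tmp;   /* only overwrite the caller's record on success */
    return 1;
}
```

A caller would typically loop on `fgets(buf, sizeof buf, file)` and pass each `buf` to this helper, skipping or reporting the lines for which it returns 0.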

[Edit]

Simplification: @user3386109

Correction: @cool-guy remove &

chux - Reinstate Monica
  • @user3386109 True, including `\n` in `"%39[^\n,]"` is really not needed: it could be `"%39[^,]"`. In any case, checking the return values of `fgets()` and `sscanf()` will ensure all items exist, including the 2nd `','`. – chux - Reinstate Monica Feb 25 '15 at 05:16
  • @Sridhar Omitting the 39 in `"%39[^,]"` leads to a problem when the `description` input exceeds 39 `char`. Liberal use of `" "` in a `sscanf()` format allows for usually acceptable additional spacing, hence the spaces in my suggested format, though they are mostly optional. – chux - Reinstate Monica Feb 25 '15 at 05:23
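
To illustrate the two points in these comments, here is a small sketch. The helper name `scan_with_width` is hypothetical. With the 39 in place, an over-long description cannot overflow the buffer and instead makes the conversion count fall short of 3, and the spaces in the format let the scan tolerate extra blanks around the first comma.

```c
#include <stdio.h>

/* Returns sscanf()'s conversion count for one line; 3 means every field
   was parsed. description must point to at least 40 chars. */
int scan_with_width(const char *line, char description[40])
{
    int code, price;

    /* %39[^,] stores at most 39 chars plus the terminating null.
       If the description is longer, the next input char is not ','
       and scanning stops after two conversions. */
    return sscanf(line, "%d , %39[^,],%d", &code, description, &price);
}
```

Checking the result against 3, as the answer does, is what turns the over-long case into a detectable error rather than silent truncation.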
@chux has already posted a good answer on this, but if you need to use fscanf, use

if(fscanf(file,"%d,%39[^,],%d",&p.code,p.description,&p.price)!=3)
    Handle_Bad_Data();

Here, fscanf first scans a number (%d) and stores it in p.code. It then matches and discards a comma, scans characters until it reaches a comma or has read 39 of them, and stores them in p.description. Finally, it matches and discards another comma and scans a number (%d) into p.price.
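
As a rough sketch, the same call could drive a loop that reads a whole file. The `Product` typedef and the `read_products` name are assumptions for illustration. The loop relies on `%d` skipping leading white-space, which is what consumes the `'\n'` left over from the previous record.

```c
#include <stdio.h>

/* Hypothetical record type mirroring the question's struct. */
typedef struct {
    int code;
    char description[40];
    int price;
} Product;

/* Reads up to max records from file into items.
   Stops at end of file or at the first record that does not match.
   Returns the number of records stored. */
size_t read_products(FILE *file, Product *items, size_t max)
{
    size_t count = 0;

    while (count < max &&
           fscanf(file, "%d,%39[^,],%d",
                  &items[count].code,
                  items[count].description,
                  &items[count].price) == 3) {
        count++;
    }
    return count;
}
```

Because the loop stops at the first bad record, a caller that needs to skip bad lines and continue would probably be better served by the fgets()/sscanf() approach.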

Spikatrix
  • Nice approach to using `fscanf()` directly. There are subtle differences between `fscanf()` and `fgets()/sscanf()`: 1) Should file data contain an embedded null character `'\0'`, `fscanf(..."%39[^,]"...)` will consume the `'\0'` and continue. On the other hand, so will `fgets()`, but it is better detected with `sscanf()`. 2) When the input is wrong (bad syntax), the `fgets()/sscanf()` approach is easier to re-sync and continue on than `fscanf()`. Yet on the `fscanf()` plus side, it is simpler for quick coding. – chux - Reinstate Monica Feb 25 '15 at 05:37
  • I didn't know that! What does an embedded `\0` mean? – Spikatrix Feb 25 '15 at 06:57
  • Say a file has the unusual line `"1,abc\0xyz,12\n"`. `fscanf( ..."%d,%39[^,],%d"...)` scans 7 `char` into `description` and then appends a terminating null character `'\0'`, so it has the contents `"abc"`, a null character, `"xyz"` and a final null character. Upon printing `description`, only `"abc"` appears. With `fgets()`, the same `"1,abc\0xyz,12\n"` will be read into a buffer, but `sscanf()` will stop scanning at the first `'\0'`, so it only sees `"1,abc"` and the `sscanf()` parsing fails. – chux - Reinstate Monica Feb 25 '15 at 15:15
  • Embedded `'\0'` are _not_ common, but they sporadically occur and are also a potentially nefarious way for hackers to break code. – chux - Reinstate Monica Feb 25 '15 at 15:26
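
The `sscanf()` side of that comparison can be checked with a short sketch. The helper name `count_parsed_fields` is hypothetical, and the buffer stands in for what `fgets()` would have stored.

```c
#include <stdio.h>

/* Returns how many fields sscanf() converts from one line.
   3 means the line is well-formed; anything else signals bad data. */
int count_parsed_fields(const char *line)
{
    int code, price;
    char description[40];

    /* sscanf() treats the first '\0' as end of input, so a line with an
       embedded null looks truncated and the final ",%d" never matches. */
    return sscanf(line, "%d,%39[^,],%d", &code, description, &price);
}
```

Passing the array `"1,abc\0xyz,12\n"` to this helper yields fewer than 3 conversions, so the usual `cnt != 3` check flags the line instead of silently accepting a shortened description.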