I am doing some simple file I/O in C++ and noticed the behaviour below. I am not sure whether I am forgetting something about the extraction operator and chars.

Note that the file uses Unix line endings.

ifstream infile("test.txt");
string line;
while(getline(infile, line)){
  istringstream iss(line);
  
  <type> a;
  
  for(...){
     iss >> a;
  }

  if(iss.eof()) 
    cout << "FAIL" << endl;

}

Say that the input file test.txt looks like this, and the <type> of a is int:

($ marks the newline character, as displayed by :set list.)

100 100 100$
100 100 100$

What I notice is that after the first line is read, the EOF flag of iss is set.

If the input file is like so, and the <type> of a is char:

a b c$
a b c$

Then the code behaves as expected: the EOF flag is not set.

From what I understand about file I/O and the extraction operator, leading whitespace is skipped, and the read position ends up on the character just after the extracted input in the stringstream iss. So in both cases I would expect the read position to end up on the newline character at the end of each stringstream, and EOF should not be set.

Changing the <type> of a to string fails in the same way as <type> = int.

BTW, failbit is not set. At the end of the line:
good = 0
fail = 0
eof = 1

DR. Palson_PH.d
  • I strongly recommend crafting a minimal reproducible example. What you seem to be describing is pretty out there, so we'll need a concrete example of the behaviour so we can reproduce and experiment. – user4581301 Nov 11 '21 at 00:14
  • On closer look, `if(iss.eof())` isn't looking at the EOF bit of the file stream, it's looking at the string stream that the `for(...)` may have read to the end and legitimately be EOF at the end of the first line. Why it wouldn't report EOF for the `char` version, I don't yet know. We definitely need a complete example to be able to see exactly what you've run into. – user4581301 Nov 11 '21 at 00:20
  • The for loop is set to read all 3 space delimited values in the line, will get back to you with a reproducible example, thanks. – DR. Palson_PH.d Nov 11 '21 at 00:24
  • `100 100 100` fed to code looking for 3 space-delimited tokens will produce an EOF. `getline` provides a `string` containing the line without the newline, so the first `>>` will read up to the space, the second reads to the next space, and the third plows into the end of the `stringstream`. The `char` version extracts exactly one `char` per `>>`, so it doesn't hit the end of the stream. It reads a `char`, skips the whitespace and reads the next `char`, skips the whitespace and reads the next `char`, and then stops, satisfied without trying to read past the end. – user4581301 Nov 11 '21 at 00:36
  • Interesting, I see. So iss >> a (a is int) reads 100 >> 100 >> 100 and runs into the EOF, whereas iss >> a (a is char) reads a >> b >> c and does not touch the EOF. This is interesting behaviour that threw me for a loop. I wasn't able to find documentation on this anywhere, do you have a source? I think you should make an official answer, so that this can be better documented here. I understand the mechanics now (the int/string version reaches the EOF because the >> operator doesn't know when to stop looking down the stringstream, as there is nothing after the last item except for the EOF). – DR. Palson_PH.d Nov 11 '21 at 00:46
  • [Character case](https://en.cppreference.com/w/cpp/io/basic_istream/operator_gtgt2), [`int` case](https://en.cppreference.com/w/cpp/io/basic_istream/operator_gtgt), [`string` case](https://en.cppreference.com/w/cpp/string/basic_string/operator_ltltgtgt) – user4581301 Nov 11 '21 at 00:54
  • These are ok, but don't make this nuanced behaviour crystal clear. I think this question is good to document this behaviour for future programmers, please feel free to change the title to make it more searchable, I wasn't able to think of anything better :/ – DR. Palson_PH.d Nov 11 '21 at 01:02
  • That's just documentation links since you asked for a source. I left them out of the answer because they read like Martian. – user4581301 Nov 11 '21 at 01:04

1 Answer

getline has extracted and discarded the newline, so line contains 100 100 100, not 100 100 100$, where $ represents the newline. This means that reading all three tokens from the line with a stringstream and the >> operator may reach the end of the stream, set the EOF flag, and produce the FAIL message.

iss >> a; when a is an int or a string skips leading whitespace and then keeps extracting until it reaches whitespace, a character that can't be part of the value (for an int), or the end of the stream. On the third >> nothing follows the last token, so the end of the stream is what stops the extraction, and the stream's EOF flag is set. The extraction itself succeeds, which is why failbit is not set.

iss >> a; when a is a char skips leading whitespace and then extracts exactly one character. In this case the third >> extracts the final character and stops before seeing the end of the stream, so the EOF flag is not set.

user4581301