The following code is not behaving the way I would expect. Please help me understand how it works.

#include <algorithm>
#include <iterator>
#include <fstream>
#include <vector>
#include <string>
using namespace std;

struct user
{
        string name;
        string age;
        string id;
};

istream& operator>>(istream& is, user& s)
{
        getline(is, s.name, ':');
        getline(is, s.age, ':');
        getline(is, s.id);

        return is;
}

int main(int argc, char* argv[])
{
        ifstream file("file.txt");
        vector<user> vec;
        copy(istream_iterator<user>(file), istream_iterator<user>(), back_inserter(vec));

        return 0;
}

My custom operator>> is called twice but I would expect it to be called only once because the contents are:

John:forty:21-5821-0

flumpb
  • 1
    How do you know it's called twice? Checked in debugger? You get two entries in the vector? If the last, are both entries the same? – Some programmer dude Mar 30 '12 at 22:49
  • 1
    +1, had the same problem recently … for some reason, the iterator increment in the `copy` code was causing the read, rather than the dereferencing, so it performs one read too many. That said, your `operator >>` needs to check the status after the first two `getline` operations! – Konrad Rudolph Mar 30 '12 at 22:50
  • It's pointless to worry about I/O code that does not check return values. You must *always* check the return values of I/O operations. – Kerrek SB Mar 30 '12 at 23:00
  • This seems so weird to me. vec is of size 1, but operator>> is called twice. Thanks for the comments. – flumpb Mar 30 '12 at 23:15
  • 4
    @kisplit: The second time it's called is so it knows it reached the end of the file. How can it know it read the entire file unless it reads until it fails to read? The second time it fails, so it knows there's no more data. – Mooing Duck Mar 30 '12 at 23:17
  • @MooingDuck Ahhhh that makes sense. So when operator>> returns, copy checks the status of istream and if good it vec.push_back on the element? – flumpb Mar 30 '12 at 23:21
  • @MooingDuck actually not only is it not needed but it won't work, since `&&` will turn them into a `bool`, no? – Seth Carnegie Mar 30 '12 at 23:51

1 Answer

In general, to read an entire file, you read until a read fails. At that point either something went wrong or you have read everything. Either way, you can't know you've reached the end of the file until a read fails. Since the first read succeeds, the code must try a second time to find out whether there is a second element. The pseudocode for this is

while(in_stream >> object) {
   myvector.push_back(object);
}

Also note that this is the "idiomatic" way to read in an entire file of values. If you're checking for eof, fail, or bad, your code is probably wrong.
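As a rough, runnable sketch of that loop: the helper name read_all and the attempts counter are made up for illustration, and a string stream of ints stands in for the file. The loop is written with an explicit break only so the attempts can be counted; it does the same thing as while (in >> value).

```cpp
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical helper: reads whitespace-separated ints until an extraction fails.
// `attempts` reports how many extractions were tried, including the failing one.
std::vector<int> read_all(std::istream& in, int& attempts)
{
    std::vector<int> values;
    int value = 0;
    attempts = 0;
    for (;;) {
        ++attempts;
        if (!(in >> value)) {
            break;              // the failed read is what signals "no more data"
        }
        values.push_back(value);
    }
    return values;
}
```

Called on a std::istringstream holding "1 2 3", this should return three values while reporting four attempts: the extra attempt is the failed read that ends the loop.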

That said, your istream& operator>>(istream& is, user& s) function is just fine. The second time it is called, the first getline will fail because there is nothing left to read, which puts the stream into a failed state (eofbit and failbit). The next two getlines will fail as well, the function will return the stream, and everything works as intended. Just remember that after a failed read the members of s may be empty or hold stale data, so don't use them.
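To see the two calls for yourself, here is a minimal sketch, not a definitive implementation. It assumes a global counter named extraction_calls and a helper read_users (both hypothetical, added only for observation), and it reads from any istream so a string stream can stand in for file.txt. It also chains the getline calls with &&, as suggested in the comments, so the caller's object is only overwritten after a complete read.

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

struct user
{
    std::string name;
    std::string age;
    std::string id;
};

// Hypothetical counter, added only to observe how often operator>> runs.
int extraction_calls = 0;

std::istream& operator>>(std::istream& is, user& s)
{
    ++extraction_calls;
    user tmp;
    // Each getline returns the stream, so && stops at the first failure.
    if (std::getline(is, tmp.name, ':') &&
        std::getline(is, tmp.age, ':') &&
        std::getline(is, tmp.id))
    {
        s = tmp;  // overwrite the caller's object only after a complete read
    }
    return is;
}

// Hypothetical helper: same copy call as in the question, on any input stream.
std::vector<user> read_users(std::istream& in)
{
    std::vector<user> vec;
    std::copy(std::istream_iterator<user>(in), std::istream_iterator<user>(),
              std::back_inserter(vec));
    return vec;
}
```

Feeding it a std::istringstream containing "John:forty:21-5821-0\n" should give a vector of size 1 with extraction_calls equal to 2: one successful read, plus the failed read that tells the iterator it has reached the end.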

Mooing Duck