In C++, when you use getline() with a delimiter on a stringstream, there are a few behaviours that I couldn't find documented. They are handy and don't raise an error when:

  • delimiter is not found => then simply whole string/rest of it is returned
  • there is delimiter but nothing before it => empty string is returned
  • getting something that isn't really there => returns the last thing that could be read with it

Some test code (simplified):

#include <iostream>
#include <string>
#include <sstream>
using namespace std;

string test(const string &s, char delim, int parseIndex ){
    stringstream ss(s);
    string parsedStr = "";
    
    for( int i = 0; i < (parseIndex+1); i++ ) getline(ss, parsedStr, delim);
    
    return parsedStr;
}

int main() {
    stringstream ss("something without delimiter");
    string s1;
    getline(ss,s1,';');
    cout << "'" << s1  << "'" << endl; //no delim
    cout << endl;
    
    string s2 = "321;;123";
    cout << "'" << test(s2,';',0) << "'" << endl; //classic
    cout << "'" << test(s2,';',1) << "'" << endl; //nothing before
    cout << "'" << test(s2,';',2) << "'" << endl; //no delim at the end
    cout << "'" << test(s2,';',3) << "'" << endl; //this shouldn't be there
    cout << endl;
    
    return 0;
}

Test code output:

'something without delimiter'

'321'
''
'123'
'123'

Test code fiddle: http://ideone.com/ZAuydR

The Question

Can this behaviour be relied on? If so, where is it documented, if at all?

Thanks for answers and clarifying :)

jave.web
  • What documentation *did* you find? Yes, all the behavior is completely documented in the C++ standard. – Potatoswatter Aug 30 '15 at 15:39
  • It's documented approximately the same [here](http://en.cppreference.com/w/cpp/string/basic_string/getline), too. – chris Aug 30 '15 at 15:52

2 Answers

The behavior of getline is explicitly documented in the standard (C++11 §21.4.8.9 ¶7-10), which is the only normative document about C++.

The behavior you asked about in the first two points is guaranteed, while the third one is a consequence of how your test rig is written.

template<class charT, class traits, class Allocator>
  basic_istream<charT,traits>&
    getline(basic_istream<charT,traits>& is,
            basic_string<charT,traits,Allocator>& str,
            charT delim);
template<class charT, class traits, class Allocator>
   basic_istream<charT,traits>&
   getline(basic_istream<charT,traits>&& is,
           basic_string<charT,traits,Allocator>& str,
           charT delim);

Effects: Behaves as an unformatted input function (27.7.2.3), except that it does not affect the value returned by subsequent calls to basic_istream<>::gcount(). After constructing a sentry object, if the sentry converts to true, calls str.erase() and then extracts characters from is and appends them to str as if by calling str.append(1, c) until any of the following occurs:

  • end-of-file occurs on the input sequence (in which case, the getline function calls is.setstate(ios_base::eofbit)).
  • traits::eq(c, delim) for the next available input character c (in which case, c is extracted but not appended) (27.5.5.4)
  • str.max_size() characters are stored (in which case, the function calls is.setstate(ios_base::failbit)) (27.5.5.4)

The conditions are tested in the order shown. In any case, after the last character is extracted, the sentry object k is destroyed.

If the function extracts no characters, it calls is.setstate(ios_base::failbit) which may throw ios_base::failure (27.5.5.4).

Returns: is.

Coming to your questions:

delimiter is not found => then simply whole string/rest of it is returned

That's a consequence of the first exit condition: when the input string is exhausted, the string stream hits end-of-file, so the extraction terminates (after all the preceding characters have been appended to the output string).

there is delimiter but nothing before it => empty string is returned

That's just a special case of the second exit condition: the extraction terminates when the delimiter is found (traits::eq(c, delim) normally boils down to c==delim), even if no other character has been extracted before it.
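
A minimal sketch of the two guaranteed cases above, plus the empty-input case covered by the last quoted paragraph of the standard. The `ReadResult` struct and the `readOnce` helper are hypothetical names introduced only for this illustration; they are not part of the standard library or of the question's code.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper type: what one getline call produced, plus the
// stream state right after the call.
struct ReadResult {
    std::string text;
    bool eof;   // eof() is true: the input ran out during the read
    bool fail;  // fail() is true: here, the call extracted no characters
};

// Performs a single delimited read on a fresh stream over `input`.
ReadResult readOnce(const std::string& input, char delim) {
    std::istringstream ss(input);
    ReadResult r;
    std::getline(ss, r.text, delim);
    r.eof = ss.eof();
    r.fail = ss.fail();
    return r;
}
```

With this sketch, `readOnce("something without delimiter", ';')` yields the whole text with `eof` set and `fail` clear, `readOnce(";123", ';')` yields an empty string with both flags clear, and `readOnce("", ';')` yields an empty string with both flags set.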

getting something that isn't really there => returns the last thing that could be read with it

It doesn't go exactly like this. If the stream is not in a good state when getline is called (the sentry object does not convert to true, in the description above), getline leaves your string alone and returns; in your case the previous read had already hit EOF. In your test code you see the last data read just because test() reuses the same string across successive getline calls without clearing it.
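
One way to tell "field not present" apart from stale data is to check the stream returned by each getline call. The following is only a sketch under that assumption: `nthField` is a hypothetical replacement for the question's `test()` function, not something from the standard library.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: stores the field at position `index` in `out` and
// returns true, or returns false (leaving `out` untouched) if the input
// has fewer fields than that.
bool nthField(const std::string& s, char delim, std::size_t index, std::string& out) {
    std::istringstream ss(s);
    std::string field;
    for (std::size_t i = 0; i <= index; ++i) {
        // getline returns the stream, which converts to false once a call
        // fails, e.g. because eofbit was already set by the previous read.
        if (!std::getline(ss, field, delim)) {
            return false;
        }
    }
    out = field;
    return true;
}
```

For the question's input "321;;123", indices 0 to 2 succeed with "321", "" and "123", while index 3 returns false instead of repeating "123".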

Matteo Italia
  • Ok, regarding the 3rd: so it constructs a sentry object that finds EOF, so it converts to false and str.erase() is not called? (so the string remains the same) – jave.web Aug 30 '15 at 16:06
  • @jave.web: mostly - the `sentry` object is there also to do several other things to the stream, but the point is that if the stream is in an error condition it converts to `false`, and `getline` knows it has to quit immediately. – Matteo Italia Aug 30 '15 at 16:12

The behavior of C++ facilities is described by the ISO C++ standard. But, it's not the most readable resource. In this case, cppreference.com has good coverage.

Here's what they have to say. The quote blocks are copy-pasted; I've interspersed explanations to your questions.

Behaves as UnformattedInputFunction, except that input.gcount() is not affected. After constructing and checking the sentry object, performs the following:

"Constructing and checking the sentry" means that if an error condition has been detected on the stream, the function will return without doing anything. This is why in #3 you observe the last valid input when "nothing should be there."

1) Calls str.erase()

So, if nothing is subsequently found before the delimiter, you'll get an empty string.

2) Extracts characters from input and appends them to str until one of the following occurs (checked in the order listed)

a) end-of-file condition on input, in which case, getline sets eofbit.

Once eofbit is set, the stream is no longer good(), so the sentry check fails on subsequent getline calls and your local string variable is left unchanged.

It also allows you to observe the last segment of input before the end, so you may treat the end-of-file as a delimiter if you wish.

b) the next available input character is delim, as tested by Traits::eq(c, delim), in which case the delimiter character is extracted from input, but is not appended to str.

c) str.max_size() characters have been stored, in which case getline sets failbit and returns.

3) If no characters were extracted for whatever reason (not even the discarded delimiter), getline sets failbit and returns.
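
The end-of-file case (a) and the failbit rule (3) together make the usual splitting loop work. This is a sketch only, and `splitFields` is a hypothetical name chosen for the illustration: the loop keeps the last segment even though no delimiter follows it, and stops on the next call because nothing can be extracted.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `s` on `delim`, treating end-of-file as a
// final delimiter. Empty fields between two delimiters are kept.
std::vector<std::string> splitFields(const std::string& s, char delim) {
    std::vector<std::string> fields;
    std::istringstream ss(s);
    std::string field;
    // The loop ends when a getline call extracts nothing and sets failbit.
    while (std::getline(ss, field, delim)) {
        fields.push_back(field);
    }
    return fields;
}
```

Note that with this approach a trailing delimiter does not produce a trailing empty field: "a;" gives a single field "a", because the call after the delimiter extracts no characters and fails.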

Potatoswatter