
I've been going through Stroustrup's Programming: Principles and Practice Using C++ to teach myself C++11. In chapter 11, he describes a program that removes unwanted characters from an input stream by turning them into whitespace. For example, I could set a string to hold the characters '!' and '.', then input dog! food and receive the output dog food.

However, I don't understand how the string word in main

int main()
{
    Punct_stream ps {cin};
    ps.whitespace(";:,.?!()\"{}<>/&$@#%^*|~");
    ps.case_sensitive(false);

    cout<<"Please input words."<<"\n";
    vector<string> vs;
    for (string word; ps>>word;) { // how does word get assigned a string?
        vs.push_back(word);
    }

    sort(vs.begin(), vs.end());

    for (size_t i = 0; i<vs.size(); ++i) {
        if (i==0 || vs[i]!=vs[i-1]) cout<<vs[i]<<"\n";
    }
}

is assigned a value through the overloaded definition of >>.

Punct_stream& Punct_stream::operator>>(string& s)
{
    while (!(buffer>>s)) {
        if (buffer.bad() || !source.good()) return *this;
        buffer.clear();

        string line;
        getline(source,line); // get a line from source

        for (char& ch : line)
            if (is_whitespace(ch))
                ch = ' ';
            else if (!sensitive)
                ch = tolower(ch);
        buffer.str(line); // how does word become this value?
    }

    return *this;
}

Obviously, pointer this will be the result of >>, but I don't understand how that result includes assigning word the string of istringstream buffer. I only know the basics of pointers, so maybe that's my problem?

#include<algorithm>
#include<cctype>
#include<iostream>
#include<sstream>
#include<string>
#include<vector>

using namespace std;

class Punct_stream {
public:
   Punct_stream(istream& is)
    : source{is}, sensitive{true} { }
   void whitespace(const string& s) { white = s; }
   void add_white(char c) { white += c; }
   bool is_whitespace(char c);
   void case_sensitive(bool b) { sensitive = b; }
   bool is_case_sensitive() { return sensitive; }
   Punct_stream& operator>>(string& s);

   operator bool();
private:
  istream& source;
  istringstream buffer;
  string white;
  bool sensitive;
};

Punct_stream& Punct_stream::operator>>(string& s)
{
    while (!(buffer>>s)) {
        if (buffer.bad() || !source.good()) return *this;
        buffer.clear();

        string line;
        getline(source,line); // get a line from source

        for (char& ch : line)
            if (is_whitespace(ch))
                ch = ' ';
            else if (!sensitive)
                ch = tolower(ch);
        buffer.str(line); // how does word become this value?
    }

    return *this;
}

Punct_stream::operator bool()
{
    return !(source.fail() || source.bad()) && source.good();
}

bool Punct_stream::is_whitespace(char c)
{
    for (char w : white)
        if (c==w) return true;
    return false;
}

int main()
{
    Punct_stream ps {cin};
    ps.whitespace(";:,.?!()\"{}<>/&$@#%^*|~");
    ps.case_sensitive(false);

    cout<<"Please input words."<<"\n";
    vector<string> vs;
    for (string word; ps>>word;) { // how does word get assigned a string?
        vs.push_back(word);
    }

    sort(vs.begin(), vs.end());

    for (size_t i = 0; i<vs.size(); ++i) {
        if (i==0 || vs[i]!=vs[i-1]) cout<<vs[i]<<"\n";
    }
}
  • The value is assigned to your out parameter `s` within statement `buffer>>s` in `operator >>`. `return *this;` is done, because the author wanted to allow read chaining, such as: `ps >> word1 >> word2 >> word3`. Without returning, you would be forced to use one read, per one statement. – Algirdas Preidžius Feb 24 '16 at 11:45
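
To illustrate the comment above, here is a minimal sketch of the two mechanisms involved. `Echo_stream` and `read_two` are hypothetical names invented for this example and are not part of the book's code. The parameter `s` is a reference, so writing to it writes to the caller's variable; and `*this` is the stream object itself (not the pointer `this`), returned by reference so that reads can be chained.

```cpp
#include <sstream>
#include <string>

// Hypothetical minimal wrapper, used only to show the two mechanisms.
class Echo_stream {
public:
    explicit Echo_stream(const std::string& text) : buffer(text) {}

    // s is a reference: it is another name for the caller's string.
    Echo_stream& operator>>(std::string& s)
    {
        buffer >> s;      // this is the line that assigns the caller's variable
        return *this;     // hand back the stream itself so calls can be chained
    }

private:
    std::istringstream buffer;
};

// Reads two words with a single chained expression.
std::string read_two(const std::string& text)
{
    Echo_stream es{text};
    std::string first, second;
    es >> first >> second;   // (es >> first) yields es, then es >> second runs
    return first + "|" + second;
}
```

With this sketch, `read_two("dog food")` returns `dog|food`: both variables were filled by the `buffer >> s` line, and the return value only made the second `>>` possible.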


The trick is that the while loop inside operator >> has opposite logic to what you normally do when reading from a stream. Normally, you'd do something like this (and main does it, in fact):

while (stream >> aString)

Notice, however, that the while in the extractor has a negation:

while (!(buffer>>s))

In other words: try extracting s from buffer. If that fails, do one iteration of the loop and try again.

At start, buffer is empty so extracting s will fail and the loop body will be entered. What the loop body does is read a line from source (the stream being wrapped), transform selected characters of that line into whitespace, and set this line as the content of buffer (via the buffer.str(line); call).

So, after the line was transformed, it is queued into buffer. Then the next iteration of the loop comes, and it again tries to extract s from buffer. If the line had any non-whitespace, the first word will be extracted (and the rest will remain in buffer for further readings). If the line had whitespace only, the loop body is entered again.
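
The fail, clear, refill, retry sequence can be tried in isolation. The following is a small sketch using only `std::istringstream`; the function name `first_word_after_refill` is made up for this example, and it performs a single refill rather than looping as the real extractor does.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: tries to read one word from an empty buffer,
// refills the buffer with `line` when that fails, then tries again.
std::string first_word_after_refill(const std::string& line)
{
    std::istringstream buffer;   // starts empty, like Punct_stream's buffer
    std::string s;

    if (!(buffer >> s)) {        // fails: there is nothing to read yet
        buffer.clear();          // reset the error state so reads can work again
        buffer.str(line);        // replace the buffer's content with the new line
    }

    buffer >> s;                 // retry: extracts the first word, if there is one
    return s;                    // stays empty if the line was whitespace only
}
```

Here `first_word_after_refill("dog  food")` returns `dog`, while a line made only of spaces leaves the result empty, which is the case where the real loop goes around once more.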

Once s is successfully extracted, the loop terminates and the function exits.

On next call, it will work with whatever was left in buffer, re-filling buffer from source as necessary (by the process I've explained above).
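
To see that later calls drain the leftover buffer before touching the source again, here is a stripped-down sketch. `Word_source`, `next`, `lines_read`, and `collect_words` are hypothetical names for this example; the class omits the punctuation and case handling and keeps only the refill logic, plus a counter that is not in the original.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for Punct_stream that keeps only the refill logic.
class Word_source {
public:
    explicit Word_source(std::istream& is) : source(is) {}

    // Stores the next word in s and returns true, or returns false at the end.
    bool next(std::string& s)
    {
        while (!(buffer >> s)) {                  // leftover buffer exhausted?
            if (buffer.bad() || !source.good()) return false;
            buffer.clear();
            std::string line;
            if (std::getline(source, line)) ++lines_read;  // count real refills
            buffer.str(line);
        }
        return true;
    }

    int lines_read = 0;   // number of lines pulled from the wrapped stream

private:
    std::istream& source;
    std::istringstream buffer;
};

// Collects every word and reports how many lines had to be fetched.
std::vector<std::string> collect_words(std::istream& in, int& lines_read)
{
    Word_source ws{in};
    std::vector<std::string> words;
    for (std::string w; ws.next(w);) words.push_back(w);
    lines_read = ws.lines_read;
    return words;
}
```

Feeding this the two lines `dog food` and `cat` yields three words from only two line reads: the call that returns `food` never touches the source, because that word was still sitting in the buffer.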

Angew is no longer proud of SO
  • [This](http://stackoverflow.com/a/28710080/3410396) answers the OP's other question: what `return *this` means. – Revolver_Ocelot Feb 24 '16 at 11:48
  • Lol, thanks. I got so caught up in trying to figure out what *this did that, when I tried to read through the code, I didn't pay enough attention to the while loop. – Benton Girdler Feb 24 '16 at 12:04