
Yes, again, because I recently asked a very similar question (how to read a comma-separated list of integers), but this time I'm stuck on reading lines of strings that consist of comma-separated data. Surely it must be trivial to convert my previous code that handled integers so that it handles strings of data chunks instead, right?

Ok, so I read data from a file or stdin that has many lines containing words that are separated by commas, for example:

hello,this,is,firstrow,sdf763  
this,is,2nd,row  
and,so,on314  

So, my idea is simply to read lines of data from an istream using ranges::getlines (or ranges::istream_view), pipe each line to the split view adaptor splitting on commas in order to get the words (as a range of ranges, which I then join) and finally transform/decode each word that is then put into a vector. IMHO it should be super simple, just like:

std::string decode(const std::string& word);

int main()
{
    using namespace ranges;
    auto lines = getlines(std::cin);           // ["hello,this,is,firstrow,sdf763" "this,is,2nd,row" "and,so,on314" ...]
    auto words = lines | view::split(",");     // [["hello" "this" "is" "firstrow" "sdf763"] ["this" "is" "2nd" "row"] [...]]
    auto words_flattened = words | view::join; // ["hello" "this" "is" "firstrow" "sdf763" "this" "is" "2nd" "row" ...]
    auto decoded_words = words_flattened | view::transform([](const auto& word){
        return decode(word);
    }) | to_vector;

    for (auto word : decoded_words) {
        std::cout << word << "\n";
    }
    std::cout << std::endl;
}

But no, this does not work and I cannot figure out why! The split view adaptor seems to not split the lines at all, because the whole line is passed as an argument to transform. Why is that? I'm obviously still learning ranges and still missing some basic concepts, it seems. I would appreciate it if someone could explain what is going on. Thanks in advance!

The link to my previous SO-question: Using range-v3 to read comma separated list of numbers

Barry
bamse

1 Answer


The split view adaptor seems to not split the lines at all, because the whole line is passed as an argument to transform. Why is that?

Because that's exactly what you're accidentally asking for.

split is an adaptor that takes a range of T and yields a range of ranges of T, split on a delimiter that is either a single T or itself a range of Ts.

When you write:

lines | views::split(",");

lines is a range of strings (not a single string), so you're asking to split that range of strings wherever an element equals the string that is a single comma. For example, given a range of strings like ["A", ",", "B", "C", "D", ",", "E"] (that is, 7 strings of which the 2nd and 6th are commas), you would get back [["A"], ["B", "C", "D"], ["E"]].
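To make that element-wise behaviour concrete, here is a minimal eager sketch in plain C++ that mimics what splitting a range of strings on a delimiter element does. The helper name split_on_element is hypothetical and is not part of range-v3; the real adaptor is lazy, and its handling of edge cases (empty input, leading or trailing delimiters) may differ from this illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not a range-v3 function): groups the elements of
// `items` into runs separated by elements equal to `delimiter`.
// Eager and simplified; it only illustrates the element-wise semantics.
std::vector<std::vector<std::string>> split_on_element(
    const std::vector<std::string>& items, const std::string& delimiter) {
    std::vector<std::vector<std::string>> groups(1);  // start with one empty group
    for (const std::string& item : items) {
        if (item == delimiter) {
            groups.emplace_back();          // a delimiter element closes the current group
        } else {
            groups.back().push_back(item);  // any other element joins the current group
        }
    }
    return groups;
}
```

Note that each input string is compared as a whole against the delimiter; nothing ever looks inside a string, which is why a line such as "hello,this,is" passes through unsplit.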

But that's not what you want.

What you want is to split each string on a comma. That's:

lines | views::transform([](auto const& s) { return s | views::split(','); })

This takes your RangeOf<string> and turns it into a RangeOf<RangeOf<RangeOf<char>>>. That adds only one layer of range-ness, since string is already a RangeOf<char>, but we lose string-ness: each piece is now just a range of chars, not a std::string.

You can then join those together:

lines | views::transform([](auto const& s) { return s | views::split(','); })
      | views::join;

And now we're back to a RangeOf<RangeOf<char>>. If what we actually want is a RangeOf<string>, we need to collect each element back into one:

lines | views::transform([](auto const& s) { return s | views::split(','); })
      | views::join
      | views::transform([](auto const& rc) { return rc | to<std::string>; });

Alternatively, you can move the second transform inside of the first so that you collect into strings before you join.

Barry
  • It would indeed be oddly specific if `views::split` were a *string*-specific operation. – Davis Herring Dec 29 '19 at 17:12
  • "lines is a range of strings (not a single string) and you're asking to split that range of strings by the string that is a single comma." Yes, that was my main misconception. "What you want is to split each string on a comma. That's: `lines | views::transform([](auto const& s) { return s | views::split(','); })`" Yes, that makes perfect sense now that you explained it to me! :-) Thanks for your great explanation @Barry – bamse Dec 29 '19 at 20:35
  • Is there an obvious reason why we can't replace `views::transform([](auto const& rc) { return rc | to<std::string>; })` with `views::transform(to<std::string>)`? – Ernest_Galbrun Dec 04 '21 at 06:16