Is there an idiomatic, efficient C++ equivalent to Haskell's groupBy?

Question

I'm trying to process an input sequence with Boost.Range. The library leaves quite a lot to be desired, so I have to write some additional range adaptors on my own. Most of them are straightforward, but I ran into some difficulties when I tried to implement an equivalent of Haskell's groupBy (or range-v3's group_by_view). It's a transformation that takes an input range and returns a range of ranges, each containing a maximal run of adjacent elements from the input for which a given binary predicate holds. For example, if the binary predicate is simply std::equal_to<int>(), the sequence

{1, 1, 2, 3, 5, 5, 5, 4, 1}

would be mapped to

{{1, 1}, {2}, {3}, {5, 5, 5}, {4}, {1}}
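
To pin down the intended result, here is a minimal eager sketch of that mapping. It is not the lazy adaptor being asked about. The name group_runs is hypothetical, and the sketch assumes forward iterators and that the predicate is applied to neighbouring elements (Haskell's groupBy compares against the first element of each group instead; for an equality predicate the two give the same result).

```cpp
#include <algorithm>
#include <functional>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Boost.Range): eagerly splits [first, last)
// into maximal runs in which every pair of neighbouring elements satisfies pred.
// Each run is returned as a half-open iterator pair into the original sequence.
template <typename It, typename Pred>
std::vector<std::pair<It, It>> group_runs(It first, It last, Pred pred) {
    std::vector<std::pair<It, It>> groups;
    while (first != last) {
        // Locate the first neighbouring pair that breaks the predicate.
        It brk = std::adjacent_find(
            first, last,
            [&](const auto& a, const auto& b) { return !pred(a, b); });
        // adjacent_find points at the left element of that pair, so the
        // run ends one past it (or at last if no such pair exists).
        It run_end = (brk == last) ? last : std::next(brk);
        groups.emplace_back(first, run_end);
        first = run_end;
    }
    return groups;
}
```

Called as group_runs(v.begin(), v.end(), std::equal_to<int>()) on the sequence above, this should yield six iterator pairs matching the six groups shown.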

My problem is with the interface for this adaptor. Suppose

auto i = (input | grouped_by(std::equal_to<int>())).begin();

If i is incremented, it would have to scan the underlying sequence until it finds 2. If, however, I first scan *i (which is the range {1, 1}), I have essentially already found the end of the first group, so the traversal caused by ++i would be redundant. It's possible to have some feedback path from the inner iterator to the outer one, i.e. have i start the scan from the last element reached by the inner iterator, but that would cause a lot of overhead and risk creating dangling iterators.

I'm wondering if there is some idiomatic way to deal with this problem, ideally some redefinition of the grouped_by interface that sidesteps it altogether. Obviously the input range has to be scanned to find the beginning of each group, but I'd like to have a robust way to do that without rescanning elements for no reason. (By robust I mean not invalidating iterators as long as the underlying input range's iterators are valid, and certainly not during the scan itself.)
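
One possible shape for such an interface, as a rough sketch only: the outer iterator finds the end of the current group once, when it is constructed or incremented, and caches it. Dereferencing then just returns the cached iterator pair, and ++ jumps straight to the cached end, so no feedback path from the inner iterator is needed. The class name group_cursor and its done() member are hypothetical, the sketch assumes forward iterators and a predicate on neighbouring elements, and it is not a full Boost.Range adaptor.

```cpp
#include <cassert>
#include <functional>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical outer iterator: holds the current group as [first_, mid_) and
// computes mid_ exactly once per group.
template <typename It, typename Pred>
class group_cursor {
public:
    group_cursor(It first, It last, Pred pred)
        : first_(first), mid_(first), last_(last), pred_(pred) {
        find_group_end();
    }

    // True once every group has been visited.
    bool done() const { return first_ == last_; }

    // The current group as a half-open iterator pair; no scanning happens here.
    std::pair<It, It> operator*() const {
        assert(!done());
        return {first_, mid_};
    }

    // Jump to the cached end of the current group, then locate the next end.
    group_cursor& operator++() {
        assert(!done());
        first_ = mid_;
        find_group_end();
        return *this;
    }

private:
    // Advance mid_ from first_ while neighbouring elements satisfy pred_.
    void find_group_end() {
        mid_ = first_;
        if (mid_ == last_) return;
        It prev = mid_++;
        while (mid_ != last_ && pred_(*prev, *mid_)) {
            prev = mid_++;
        }
    }

    It first_, mid_, last_;
    Pred pred_;
};
```

With this layout the predicate is evaluated once per neighbouring pair regardless of whether the caller walks each group. The caller still traverses a group's elements a second time when reading them, and the cursor stores only copies of the underlying iterators, so it stays valid exactly as long as they do.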

So, is there some known/proven/elegant solution to this?

Not really, it doesn't provide lazy evaluation, so after calling it I would have to iterate over the range again to do further processing (the integers here are just an example; in reality I have more complex values with non-trivial operations on them). Unless I'm missing something? — Yaron Tausky, Jul 20 '16 at 21:31
Why the downvotes? It's a clearly stated, well researched question. — Jesper Juhl, Jul 20 '16 at 21:33
std::equal_range can be used to build a sequence of ranges. From there you can loop through and do whatever processing you like. It can take a predicate, so it doesn't have to test for literal equality. — Richard Hodges, Jul 20 '16 at 21:33
But that's exactly what I don't want to do -- I want to combine the initial segmentation and the processing into a single loop, instead of first creating the sequence of ranges and then iterating over it and processing each one. I just want to do it with the nice abstractions that lazy evaluation provides, instead of handwriting such a loop. — Yaron Tausky, Jul 20 '16 at 22:01

0 Answers