I'm trying to process an input sequence with Boost.Range. The library leaves quite a lot to be desired, so I have to write some additional range adaptors on my own. Most of them are straightforward, but I ran into difficulties when I tried to implement an equivalent of Haskell's groupBy (or range-v3's group_by_view). It's a transformation that takes an input range and returns a range of ranges, each containing a run of adjacent elements from the input that satisfy some given binary predicate. For example, if the binary predicate is simply std::equal_to<int>(), the sequence
{1, 1, 2, 3, 5, 5, 5, 4, 1}
would be mapped to
{{1, 1}, {2}, {3}, {5, 5, 5}, {4}, {1}}
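To make the intended semantics concrete, here is a minimal eager sketch that works on plain iterator pairs rather than Boost.Range. The name group_bounds is hypothetical (it is not something Boost.Range provides), and the sketch assumes Haskell's convention of comparing each element against the first element of its group; for std::equal_to the result is the same as comparing neighbours.

```cpp
#include <cassert>
#include <functional>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Boost.Range): returns the half-open
// [begin, end) bounds of each group of adjacent elements. An element joins
// the current group while pred(first element of group, element) holds.
template <typename It, typename Pred>
std::vector<std::pair<It, It>> group_bounds(It first, It last, Pred pred)
{
    std::vector<std::pair<It, It>> groups;
    while (first != last) {
        It group_end = std::next(first);
        while (group_end != last && pred(*first, *group_end))
            ++group_end;
        groups.emplace_back(first, group_end);
        first = group_end;  // the next group starts where this one ended
    }
    return groups;
}

// Self-check on the sequence from the question.
inline void example()
{
    const std::vector<int> input{1, 1, 2, 3, 5, 5, 5, 4, 1};
    const auto groups =
        group_bounds(input.begin(), input.end(), std::equal_to<int>());
    assert(groups.size() == 6);  // {1,1} {2} {3} {5,5,5} {4} {1}
    assert(std::distance(groups[3].first, groups[3].second) == 3);  // {5,5,5}
}
```

This version is eager and stores all group bounds up front, which is exactly what a lazy adaptor would want to avoid; it is only meant to pin down the expected output.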
My problem is with the interface for this adaptor. Suppose I have
auto i = (input | grouped_by(std::equal_to<int>())).begin();
If i is incremented, it has to scan the underlying sequence until it finds 2. If, however, I first traverse *i (which is the range {1, 1}), I have essentially already found the end of the first group, so the traversal caused by ++i would be redundant. It's possible to add a feedback path from the inner iterator to the outer one, i.e. have i start its scan from the last element reached by the inner iterator, but that would add a lot of overhead and risk creating dangling iterators.
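The redundancy can be shown with a small sketch. It assumes a deliberately naive outer cursor that remembers only where the current group starts, so both inspecting the group and stepping past it have to search for the group's end. All names here (counting_equal, find_group_end, naive_group_cursor) are made up for illustration and are not Boost.Range APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Equality predicate that counts how often it is evaluated.
struct counting_equal {
    std::size_t* calls;
    bool operator()(int a, int b) const { ++*calls; return a == b; }
};

// End of the group starting at first (precondition: first != last).
template <typename It, typename Pred>
It find_group_end(It first, It last, Pred pred)
{
    assert(first != last);
    It it = std::next(first);
    while (it != last && pred(*first, *it)) ++it;
    return it;
}

// Naive outer cursor: stores only the group's start, so group() and next()
// each perform their own scan for the group's end.
template <typename It, typename Pred>
struct naive_group_cursor {
    It cur;
    It last;
    Pred pred;

    bool done() const { return cur == last; }
    std::pair<It, It> group() const { return {cur, find_group_end(cur, last, pred)}; }
    void next() { cur = find_group_end(cur, last, pred); }
};

// Predicate evaluations when every group is inspected and then skipped.
inline std::size_t calls_with_naive_cursor(const std::vector<int>& input)
{
    std::size_t calls = 0;
    naive_group_cursor<std::vector<int>::const_iterator, counting_equal> c{
        input.begin(), input.end(), counting_equal{&calls}};
    while (!c.done()) {
        c.group();  // first scan: locate the end of the current group
        c.next();   // second scan over the same elements
    }
    return calls;
}

// Predicate evaluations when each group's end is located exactly once.
inline std::size_t calls_with_single_scan(const std::vector<int>& input)
{
    std::size_t calls = 0;
    auto it = input.begin();
    while (it != input.end())
        it = find_group_end(it, input.end(), counting_equal{&calls});
    return calls;
}
```

On the example sequence above, the naive cursor evaluates the predicate twice as often as the single scan, which is the overhead I would like the interface to avoid.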
I'm wondering if there is some idiomatic way to deal with this problem, ideally some redefinition of the grouped_by interface that sidesteps it altogether. Obviously the input range has to be scanned to find the beginning of each group, but I'd like a robust way to do that without rescanning elements for no reason. (By robust I mean not invalidating iterators as long as the underlying input range's iterators are valid, and certainly not during the scan itself.)
So, is there some known/proven/elegant solution to this?