I am trying to create a data structure for arrays of dynamic-sized arrays. Multiple choices are possible, the simplest one being std::vector<std::vector<T>>
. However it is often not efficient and we would like to compress the data of all the inner vectors into one big vector
, and have a vector
of offsets to tell where each element begins.
Example:
// encoding of : | 4.,5.,1. | 7.,8.,9.,2 |
std::vector<double> v = {4.,5.,1., 7.,8.,9.,2};
std::vector<int> offsets = {0 , 3 , 7};
Let's encapsulate it ! Consider the following data structure:
(note: the code is neither complete, general or precise, at this point this is just to give an idea of what is going on):
class vblock_vector {
private:
std::vector<double> v;
std::vector<int> offsets;
public:
using iterator = vblock_iterator;
auto begin() -> iterator {
return {v.data(),offsets.data()};
}
auto end() -> iterator {
return {v.data(),offsets.data()+offsets.size()};
}
};
An basic implementation of the iterator type is the following:
struct vblock_iterator {
private:
double* ptr;
int* offsets_ptr;
public:
using reference = span_ref<double>; // see notes (0) and (1)
// using value_type = ???; // See below
auto operator++() {
++offsets_ptr;
return *this;
}
auto operator*() const {
return span_ref<double,int>(ptr+offsets_ptr[0],ptr+offsets_ptr[1]);
}
auto operator<=>(const vblock_iterator&) const = default;
// ... other iterator interface stuff that is trivial
};
This iterator works with e.g. std::copy
. (4)
Now let's say that I want to replace my old std::copy
calls with std::ranges::copy
. For that, vblock_iterator
needs to satisfy the std::input_iterator
concept. In order to do that, vblock_iterator
needs to have an associated value_type
(required by the intermediate std::indirectly_readable
concept).
An obvious choice would be using value_type = std::vector<double>
(2), but I surely don't want to give std::ranges::copy
the freedom to use this type at its discretion in its implementation: it would be inefficient.
My question is the following : why does std::input_iterator<In>
requires In
to have a value_type
? At least for copying it is not needed (the fact that I can use std::copy
and that it does the right thing proves it). Of course, one can say : "define value_type
to be anything, it won't be used by std::range::copy
implementations anyway", but then why require it?
I am currently under the impression that value_type
is mandatory for e.g. std::swappable
, but not for std::input_iterator
(nor even std::random_access_iterator
dare I say). But the standard committee decided otherwise: what is the reason behind this choice? (3)
Notes:
(0) span_ref
is just like a std::span
with reference semantics (its operator=
is "assign-through" and not "rebind to new array").
(1) In reality, the reference type needs to be a tad more complex to account for offsets, but it is not the subject here. Suffice to say, that it is possible to have an efficient reference type for this structure.
(2) And I think this is the only reasonable choice. At least a container is needed (vector
, deque
...). E.g. a std::span
won't do because if we bother to save the value pointed to by the iterator, it is because we will modify the original memory, and std::span
won't help us with that.
(3) In the presentation of the std::indirectly_readable
concept (then called Readable
), Eric Niebler goes into some detail of why we need value_type
to be related in some form to reference
to work well with proxy references, but I still don't see why we would would even need value_type
for algorithms that don't need to swap elements (or store them somewhere). Yes, there is mathematically a value_type
for vblock_iterator
, but why require it if it is not meant to be used? (similarly, there is also mathematical operator+=
for forward ranges : but since it is inefficient, it is simply not required).
(4) And other algorithms: std::move
, std::find
, std::find_if
, std::any_of
, std::partition_point
, std::lower_bound
, std::unique
... So I think that there is something more fundamental going on than: "we are just lucky with std::copy
".