
Somehow, the native std::copy() algorithm on VC++ (Dinkumware) figures out that it can use memcpy() on data that is trivially copyable. Is it possible for a mere mortal to do that, assuming each element is_trivially_copyable?

Does random_access_iterator imply contiguous memory? The standard is not clear to me.

So, if all you have in a template is an iterator or two, is it possible to deduce at compile-time that the underlying array can be copied with memcpy(), and if so how?

EDIT - Here's my motivation. I have a routine that moves a block of data rather than copying it. To get speed when the data are memmove-able, I've made it call std::copy sometimes. I was wondering if that is the only way. (As a kid, I would always try this at home.)

// Move a range of values
template<class Ptr, class Ptr2>
inline Ptr2 move(Ptr src, Ptr end, Ptr2 dest) {
    using value_type = typename std::iterator_traits<Ptr>::value_type;
    if constexpr (std::is_trivially_copyable_v<value_type>) {
        return std::copy(src, end, dest);
    } else {
        while (src != end) {
            *dest = std::move(*src);
            ++src; ++dest;
        }
    }
    return dest;
}

EDIT: Kudos to zett42 for finding this related question: "Contiguous iterator detection". I can't explain how I missed it.

EDIT More: After going down many twisty little passages all the same, I discovered that Dinkum uses secret tags for iterators that are in with the in-crowd, e.g. _Really_trivial_ptr_iterator_tag. So prospects look dim.

My $0.02 worth: It was a beginner's mistake to make iterator_category an ad-hoc type rather than busting out the various traits, like "points_into_contiguous_memory," etc... Since random_access_iterator is an ad-hoc type denoted only with a tag, it cannot be subtyped without breaking legacy applications. So the committee is now kind of stuck. Time to start over, I say.

Oh well.

max66
Jive Dadson
  • The iterators of std::vector are both random access and represent contiguous elements. This is by definition. Note that a call to memmove isn't necessarily the fastest copy. If you compile std::copy with -O2 and -march=native on gcc, you get even better code. – Richard Hodges Dec 04 '17 at 21:29
  • There are just a few [contiguous iterator](http://en.cppreference.com/w/cpp/concept/ContiguousIterator) types. The compiler knows them well and will be able to recognize when one is used and do the optimizations needed. – Some programmer dude Dec 04 '17 at 21:31
  • Suppose you do not know that the iterators are std::vector. – Jive Dadson Dec 04 '17 at 21:31
  • You and your program might not know, but the compiler will know. – Some programmer dude Dec 04 '17 at 21:35
  • Forget the compiler. Suppose the array is user-defined. Is there a way? – Jive Dadson Dec 04 '17 at 21:36
  • What do you mean by "user-defined" array? If you make a class similar to `std::vector` or `std::array`? Then no it's not possible, and the optimizations you see for e.g. `std::vector` are not possible. – Some programmer dude Dec 04 '17 at 21:41
  • What a bummer. I didn't know stl cheated like that. How do third parties like STL-port manage? – Jive Dadson Dec 04 '17 at 21:43
  • From a quick look of MSVC's implementation of `std::copy()` there is no compiler magic to decide if `memmove` can be used. Contiguous iterators are detected by template metaprogramming. But this is non-standardized stuff, see also [Contiguous iterator detection](https://stackoverflow.com/questions/42851957/contiguous-iterator-detection). – zett42 Dec 04 '17 at 21:57
  • @Someprogrammerdude "_There are just a few contiguous iterator types. The compiler knows them well_" - Are you just guessing or have you actually checked your std lib implementation? – zett42 Dec 04 '17 at 21:59
  • There might be internal and implementation-specific types, traits and tags available that algorithm functions can check. Unfortunately [`std::iterator_traits`](http://en.cppreference.com/w/cpp/iterator/iterator_traits) doesn't have a tag for contiguous iterators. – Some programmer dude Dec 05 '17 at 07:02

2 Answers


Does random_access_iterator imply contiguous memory?

Short Answer: No.

Long Answer: depends on the type.

Example: for std::vector (which has random access iterators), it is guaranteed that all values are stored in one contiguous block of memory, which can be accessed through the data() method.

For std::deque (also with random access iterators), the storage is typically divided into chunks (std::deque is designed to make insertion and removal at both ends efficient), so the memory is not guaranteed to be contiguous.

max66
  • What a bummer. So, getting back to the question, if only certain stl containers are guaranteed to be contiguous, how to specialize for them, given only iterators? – Jive Dadson Dec 04 '17 at 21:38
  • @JiveDadson - you can study the standard, identify the few containers that are guaranteed to be contiguous, create a custom type trait that returns a compile-time true or false value according to the container, and decide using that type trait (not a great solution, I know, but it is the best that comes to my mind) – max66 Dec 04 '17 at 21:59
  • The template only sees iterators, not containers. Iterators are secretive about the type of container they point into. Perhaps they don't even know. – Jive Dadson Dec 04 '17 at 22:16
  • @JiveDadson - I see... I think that, unfortunately, you're right: starting from iterators there isn't a way to detect the corresponding container; so I don't see a solution. – max66 Dec 04 '17 at 22:39
  • And yet, somehow Dinkumware does, for pointers into std::vector, at least. – Jive Dadson Dec 04 '17 at 22:56
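
One way to read the custom-trait idea from the comments above without relying on C++20 or vendor internals: treat raw pointers as the only iterator type that is portably known to be contiguous, and let callers who know their container pass `data()` pointers. This is a rough C++17 sketch under that assumption; `is_raw_pointer_iterator`, `move_range` and `check_move_range` are hypothetical names, and this is not how any particular standard library does it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <iterator>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical trait: true only for raw pointers.
template <class It>
struct is_raw_pointer_iterator : std::false_type {};
template <class T>
struct is_raw_pointer_iterator<T*> : std::true_type {};

// Hypothetical helper: move [first, last) to dest, using memmove when both
// ends are raw pointers to the same trivially copyable type.
template <class In, class Out>
Out move_range(In first, In last, Out dest) {
    using T = typename std::iterator_traits<In>::value_type;
    using U = typename std::iterator_traits<Out>::value_type;
    if constexpr (is_raw_pointer_iterator<In>::value &&
                  is_raw_pointer_iterator<Out>::value &&
                  std::is_same_v<T, U> && std::is_trivially_copyable_v<T>) {
        const std::ptrdiff_t n = last - first;
        if (n > 0) {
            std::memmove(dest, first, static_cast<std::size_t>(n) * sizeof(T));
        }
        return dest + n;
    } else {
        // Fallback: element-wise move assignment.
        for (; first != last; ++first, ++dest) {
            *dest = std::move(*first);
        }
        return dest;
    }
}

// Small self-check exercising both branches.
inline void check_move_range() {
    // The caller knows it holds a vector, so it hands over data() pointers.
    std::vector<int> src{1, 2, 3, 4};
    std::vector<int> dst(4);
    int* out = move_range(src.data(), src.data() + src.size(), dst.data());
    assert(out == dst.data() + 4);
    assert(dst == src);

    // Non-trivial element type through ordinary iterators: element-wise branch.
    std::vector<std::string> a{"x", "y"};
    std::vector<std::string> b(2);
    move_range(a.begin(), a.end(), b.begin());
    assert(b[0] == "x" && b[1] == "y");
}
```

The trade-off is that the template alone still cannot tell whether a class-type iterator is contiguous; the knowledge has to come from the call site.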

A) No. Random access only means that `iterator + n` and `iterator - n` are valid and reach the same element as stepping forward or backward n times, within the bounds of the range. This does not imply contiguous allocation at all.

B) No. It depends on the implementation. For instance, if there are two levels of indirection, you cannot know what kind of data structure lies behind the iterator.

The good news? You need not bother at all. Between caching and branch prediction, you should not need to optimize this. At run time, the cache lines will be filled with the block of contiguous memory you intend to copy, so the copy is going to be very fast, and memmove or memcpy will not help much.

You cannot guarantee much on modern processors, which pipeline your instructions at run time and will effectively discover whether the memory is contiguous or not.

Pranay
  • The compiler may generate acceptable code that the CPU can run reasonably, but one can't get close to the limits of memory bandwidth attainable even by a single thread with that code. Have a look at the assembly code of C libraries' implementations of `memcpy` to understand to just what lengths they'll go to overlap multiple operations and pack bits in and out of core efficiently. – Phil Miller Dec 11 '17 at 20:14