99

I'm using a external library which at some point gives me a raw pointer to an array of integers and a size.

Now I'd like to use std::vector to access and modify these values in place, rather than accessing them with raw pointers.

Here is an articifial example that explains the point:

size_t size = 0;
int * data = get_data_from_library(size);   // raw data from library {5,3,2,1,4}, size gets filled in

std::vector<int> v = ????;                  // pseudo vector to be used to access the raw data

std::sort(v.begin(), v.end());              // sort raw data in place

for (int i = 0; i < 5; i++)
{
  std::cout << data[i] << "\n";             // display sorted raw data 
}

Expected output:

1
2
3
4
5

The reason is that I need to apply algorithms from <algorithm> (sorting, swaping elements etc.) on that data.

On the other hand changing the size of that vector would never be changed, so push_back, erase, insert are not required to work on that vector.

I could construct a vector based on the data from the library, use modify that vector and copying the data back to to library, but that would be two complete copies that I'd like to avoid as the data set could be really big.

NutCracker
  • 11,485
  • 4
  • 44
  • 68
Jabberwocky
  • 48,281
  • 17
  • 65
  • 115

11 Answers11

91

C++20's std::span

If you are able to use C++20, you could use std::span which is a pointer - length pair that gives the user a view into a contiguous sequence of elements. It is some sort of a std::string_view, and while both std::span and std::string_view are non-owning views, std::string_view is a read-only view.

From the docs:

The class template span describes an object that can refer to a contiguous sequence of objects with the first element of the sequence at position zero. A span can either have a static extent, in which case the number of elements in the sequence is known and encoded in the type, or a dynamic extent.

So the following would work:

#include <span>
#include <iostream>
#include <algorithm>

int main() {
    int data[] = { 5, 3, 2, 1, 4 };
    std::span<int> s{data, 5};

    std::sort(s.begin(), s.end());

    for (auto const i : s) {
        std::cout << i << "\n";
    }

    return 0;
}

Check it out live

Since std::span is basically pointer - length pair, you can use in a following manner too:

size_t size = 0;
int *data = get_data_from_library(size);
std::span<int> s{data, size};

Note: Not all compilers support std::span. Check compiler support here.

UPDATE

If you are not able to use C++20, you could use gsl::span which is basically the base version of the C++ standard's std::span.

C++11 solution

If you are limited to C++11 standard, you can try implementing your own simple span class:

template<typename T>
class span {
   T* ptr_;
   std::size_t len_;

public:
    span(T* ptr, std::size_t len) noexcept
        : ptr_{ptr}, len_{len}
    {}

    T& operator[](int i) noexcept {
        return *ptr_[i];
    }

    T const& operator[](int i) const noexcept {
        return *ptr_[i];
    }

    std::size_t size() const noexcept {
        return len_;
    }

    T* begin() noexcept {
        return ptr_;
    }

    T* end() noexcept {
        return ptr_ + len_;
    }
};

Check out C++11 version live

NutCracker
  • 11,485
  • 4
  • 44
  • 68
  • 5
    You could use [`gsl::span`](https://github.com/microsoft/GSL) for C++14 and above if your compiler doesn't implement `std::span` – Artyer Feb 10 '20 at 14:31
  • @NutCracker Should not the boundary be checked in the implementation of operator[]? – John Jun 23 '21 at 00:58
  • @John if you take a look at spec for [`std::span::operator[]`](https://en.cppreference.com/w/cpp/container/span/operator_at) no bounds checking is performed in case of `operator[]` – NutCracker Jun 23 '21 at 06:10
  • @NutCracker What a surprise! **No** check, **no** exception. Is it right to do so? – John Jun 23 '21 at 06:15
  • @John yes because some developers don't prefer using exceptions in their code (for performance reasons). If you want to perform bounds checking, in case of `std::vector` you can use [`std::vector::at`](https://en.cppreference.com/w/cpp/container/vector/at) – NutCracker Jun 23 '21 at 06:18
  • @NutCracker In general, ***who*** is responsible for such checking? The developers of the standard library or the users of the library? – John Jun 23 '21 at 06:22
  • @John IMHO, this is the responsibility of a developer, to do some checks before indexing. – NutCracker Jun 23 '21 at 06:24
  • @NutCracker You mean developers of the standard library? – John Jun 23 '21 at 06:30
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/234097/discussion-between-nutcracker-and-john). – NutCracker Jun 23 '21 at 06:30
66

The problem is that std::vector has to make a copy of the elements from the array you initialize it with as it has the ownership of the objects it contains.

To avoid this, you can use a slice object for an array (i.e., similar to what std::string_view is to std::string). You could write your own array_view class template implementation whose instances are constructed by taking a raw pointer to an array's first element and the array length:

#include <cstdint>

template<typename T>
class array_view {
   T* ptr_;
   std::size_t len_;
public:
   array_view(T* ptr, std::size_t len) noexcept: ptr_{ptr}, len_{len} {}

   T& operator[](int i) noexcept { return ptr_[i]; }
   T const& operator[](int i) const noexcept { return ptr_[i]; }
   auto size() const noexcept { return len_; }

   auto begin() noexcept { return ptr_; }
   auto end() noexcept { return ptr_ + len_; }
};

array_view doesn't store an array; it just holds a pointer to the beginning of the array and the length of that array. Therefore, array_view objects are cheap to construct and to copy.

Since array_view provides the begin() and end() member functions, you can use the standard library algorithms (e.g., std::sort, std::find, std::lower_bound, etc.) on it:

#define LEN 5

auto main() -> int {
   int arr[LEN] = {4, 5, 1, 2, 3};

   array_view<int> av(arr, LEN);

   std::sort(av.begin(), av.end());

   for (auto const& val: av)
      std::cout << val << ' ';
   std::cout << '\n';
}

Output:

1 2 3 4 5

Use std::span (or gsl::span) instead

The implementation above exposes the concept behind slice objects. However, since C++20 you can directly use std::span instead. In any case, you can use gsl::span since C++14.

Useless
  • 64,155
  • 6
  • 88
  • 132
JFMR
  • 23,265
  • 4
  • 52
  • 76
  • Why did you mark the methods as noexcept? You cannot guarantee at all that no exception is being thrown, can you? – SonneXo Feb 11 '20 at 06:37
  • 10
    @SonneXo https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Re-noexcept – moooeeeep Feb 11 '20 at 08:10
  • 2
    @moooeeeep It is better to leave some explanation than just a link. The link could be expired in the future while I've seen this happened a lot. – Jason Liu Mar 06 '20 at 23:52
29

Since the algorithm-library works with iterators you can keep the array.

For pointers and known array length

Here you can use raw pointers as iterators. They support all the opertations an iterator supports (increment, comparison for equality, value of, etc...):

#include <iostream>
#include <algorithm>

int *get_data_from_library(int &size) {
    static int data[] = {5,3,2,1,4}; 

    size = 5;

    return data;
}


int main()
{
    int size;
    int *data = get_data_from_library(size);

    std::sort(data, data + size);

    for (int i = 0; i < size; i++)
    {
        std::cout << data[i] << "\n";
    }
}

data points to the dirst array member like an iterator returned by begin() and data + size points to the element after the last element of the array like an iterator returned by end().

For arrays

Here you can use std::begin() and std::end()

#include <iostream>
#include <algorithm>

int main()
{
    int data[] = {5,3,2,1,4};         // raw data from library

    std::sort(std::begin(data), std::end(data));    // sort raw data in place

    for (int i = 0; i < 5; i++)
    {
        std::cout << data[i] << "\n";   // display sorted raw data 
    }
}

But keep in mind that this only works, if data does not decay to a pointer, because then length information goes missing.

Community
  • 1
  • 1
Lukas-T
  • 11,133
  • 3
  • 20
  • 30
  • 8
    This is the right answer. Algorithms apply to **ranges**. Containers (e.g., std::vector) are one way of managing ranges, but they are not the only way. – Pete Becker Feb 11 '20 at 15:03
14

You can get iterators on raw arrays and use them in algorithms:

    int data[] = {5,3,2,1,4};
    std::sort(std::begin(data), std::end(data));
    for (auto i : data) {
        std::cout << i << std::endl;
    }

If you are working with raw pointers (ptr + size), then you can use the following technique:

    size_t size = 0;
    int * data = get_data_from_library(size);
    auto b = data;
    auto e = b + size;
    std::sort(b, e);
    for (auto it = b; it != e; ++it) {
        cout << *it << endl;
    }

UPD: However, the above example is of bad design. The library returns us a raw pointer and we don't know where the underlying buffer is allocated and who is supposed to free it.

Usually, the caller provides a buffered for the function to fill the data. In that case, we can preallocate the vector and use its underlying buffer:

    std::vector<int> v;
    v.resize(256); // allocate a buffer for 256 integers
    size_t size = get_data_from_library(v.data(), v.size());
    // shrink down to actual data. Note that no memory realocations or copy is done here.
    v.resize(size);
    std::sort(v.begin(), v.end());
    for (auto i : v) {
        cout << i << endl;
    }

When using C++11 or above we can even make get_data_from_library() to return a vector. Thanks to move operations, there will be no memory copy.

PooSH
  • 601
  • 4
  • 11
  • 2
    Then you can use regular pointers as iterators: `auto begin = data;` `auto end = data + size;` – PooSH Feb 10 '20 at 13:50
  • However, the question is where data returned by `get_data_from_library()` is allocated? Maybe we are not supposed to change it at all. If we need to pass a buffer to the library, then we can allocate vector and pass `v.data()` – PooSH Feb 10 '20 at 14:00
  • 1
    @PooSH the data is owned by the library, but it can be changed with no restriction (that's actually the point of the whole question). Only the size of the data cannot be changed. – Jabberwocky Feb 10 '20 at 14:03
  • 1
    @Jabberwocky added a better example of how to use the vector's underlying buffer to fill the data in. – PooSH Feb 10 '20 at 14:12
11

Now I'd like to use std::vector to access and modify these values in place

You cannot. That's not what std::vector is for. std::vector manages its own buffer, which is always acquired from an allocator. It never takes ownership of another buffer (except from another vector of same type).

On the other hand, you also don't need to because ...

The reason is that I need to apply algorithms from (sorting, swaping elements etc.) on that data.

Those algorithms work on iterators. A pointer is an iterator to an array. You don't need a vector:

std::sort(data, data + size);

Unlike function templates in <algorithm>, some tools such as range-for,std::begin/std::end and C++20 ranges do not work with just a pair of iterators though, while they do work with containers such as vectors. It is possible to create a wrapper class for iterator + size that behaves as a range, and works with these tools. C++20 will introduce such wrapper into the standard library: std::span.

eerorika
  • 232,697
  • 12
  • 197
  • 326
9

You can't do this with a std::vector without making a copy. std::vector owns the pointer it has under the hood and allocates space through the allocator that is provided.

If you have acess to a compiler that has support for C++20 you could use std::span which was built for exactly this purpose. It wraps a pointer and size into a "container" that has the C++ container interface.

If not, you can use gsl::span which is what the standard version was based off of.

If you don't want to import another library you could trivially implement this yourself depending on what all functionality you want to have.

NathanOliver
  • 171,901
  • 28
  • 288
  • 402
7

Besides the other good suggestion about std::span coming in and gsl:span, including your own (lightweight) span class until then is easy enough already (feel free to copy):

template<class T>
struct span {
    T* first;
    size_t length;
    span(T* first_, size_t length_) : first(first_), length(length_) {};
    using value_type = std::remove_cv_t<T>;//primarily needed if used with templates
    bool empty() const { return length == 0; }
    auto begin() const { return first; }
    auto end() const { return first + length; }
};

static_assert(_MSVC_LANG <= 201703L, "remember to switch to std::span");

Of special note is also the boost range library if you are interested in the more generic range concept: https://www.boost.org/doc/libs/1_60_0/libs/range/doc/html/range/reference/utilities/iterator_range.html.

Range concepts will also arrive in

darune
  • 10,480
  • 2
  • 24
  • 62
  • 1
    What's the `using value_type = std::remove_cv_t;` for? – Jabberwocky Feb 10 '20 at 14:56
  • 1
    ... and you forgot the constructor: `span(T* first_, size_t length) : first(first), length(length) {};`. I edited your answer. – Jabberwocky Feb 10 '20 at 14:59
  • @Jabberwocky I just used aggregate initialization. But constructor is fine. – darune Feb 11 '20 at 08:55
  • 1
    @eerorika i guess you are right, I removed the non-const versions – darune Feb 11 '20 at 08:59
  • 1
    The `using value_type = std::remove_cv_t;` is mainly needed if used with template programming (for getting the value_type of a 'range'). If you just want to use the iterators you can skip/remove that. – darune Feb 11 '20 at 09:04
  • `__cplusplus <= 201703L` is [portable ISO C++](https://en.cppreference.com/w/cpp/preprocessor/replace), no need for `_MSVC_LANG`. – Peter Cordes Feb 13 '20 at 07:15
  • @PeterCordes yes it is the portable way, but unfortunately MSVC does not enable this by default in my tests - hence making the check not particularly usefull since it can be (easily) skipped depending on configuration. – darune Jun 15 '20 at 08:28
6

You actually could almost use std::vector for this, by abusing the custom allocator functionality to return a pointer to the memory you want to view. That wouldn't be guaranteed by the standard to work (padding, alignment, initialization of the returned values; you'd have to take pains when assigning the initial size, and for non-primitives you'd also need to hack up your constructors), but in practice I would expect it to given enough tweaks.

Never ever ever do that. It's ugly, surprising, hacky, and unnecessary. The standard library's algorithms are already designed to work as well with raw arrays as with vectors. See the other answers for details of that.

Sneftel
  • 40,271
  • 12
  • 71
  • 104
  • 1
    Hmm, yes that could work with the [`vector` constructors](https://en.cppreference.com/w/cpp/container/vector/vector) that take a custom Allocator reference as a constructor arg (not just a template param). I guess you'd need an allocator object that had the runtime pointer value in it, not as a template parameter otherwise it could only work for constexpr addresses. You'd have to be careful not to let `vector` default-construct objects on `.resize()` and overwrite existing data; the mismatch between an owning container like vector vs. non-owning span is huge if you start using .push_back etc – Peter Cordes Feb 13 '20 at 07:29
  • 1
    @PeterCordes I mean, let's not bury the lede -- you'd *also* have to be crazy. To my mind, the strangest thing about the idea is that the allocator interface includes the `construct` method which would be required... I cannot think what non-hacky use cases would require that over placement-new. – Sneftel Feb 13 '20 at 08:25
  • 1
    The obvious use-case is to avoid wasting time constructing elements you're about to write some other way, e.g. in `resize()` before you pass a reference to something that wants to use it as a pure output (e.g. a read system call). In practice compilers often don't optimize away that memset or whatever. Or if you had an allocator that uses calloc to get pre-zeroed memory, you could also avoid dirtying it the way stupid `std::vector` does by default when default-constructing objects that have an all-zero bit pattern. See Notes in https://en.cppreference.com/w/cpp/container/vector/vector – Peter Cordes Feb 13 '20 at 08:35
4

As others have pointed out, std::vector must own the underlying memory (short of messing with a custom allocator) so can't be used.

Others have also recommended c++20's span, however obviously that requires c++20.

I would recommend the span-lite span. To quote it's subtitle:

span lite - A C++20-like span for C++98, C++11 and later in a single-file header-only library

It provides a non-owning and mutable view (as in you can mutate elements and their order but not insert them) and as the quote says has no dependencies and works on most compilers.

Your example:

#include <algorithm>
#include <cstddef>
#include <iostream>

#include <nonstd/span.hpp>

static int data[] = {5, 1, 2, 4, 3};

// For example
int* get_data_from_library()
{
  return data;
}

int main ()
{
  const std::size_t size = 5;

  nonstd::span<int> v{get_data_from_library(), size};

  std::sort(v.begin(), v.end());

  for (auto i = 0UL; i < v.size(); ++i)
  {
    std::cout << v[i] << "\n";
  }
}

Prints

1
2
3
4
5

This also has the added upside if one day you do switch to c++20, you should just be able to replace this nonstd::span with std::span.

Arghnews
  • 69
  • 2
  • 7
3

You could use a std::reference_wrapper available since C++11:

#include <iostream>
#include <iterator>
#include <vector>
#include <algorithm>

int main()
{
    int src_table[] = {5, 4, 3, 2, 1, 0};

    std::vector< std::reference_wrapper< int > > dest_vector;

    std::copy(std::begin(src_table), std::end(src_table), std::back_inserter(dest_vector));
    // if you don't have the array defined just a pointer and size then:
    // std::copy(src_table_ptr, src_table_ptr + size, std::back_inserter(dest_vector));

    std::sort(std::begin(dest_vector), std::end(dest_vector));

    std::for_each(std::begin(src_table), std::end(src_table), [](int x) { std::cout << x << '\n'; });
    std::for_each(std::begin(dest_vector), std::end(dest_vector), [](int x) { std::cout << x << '\n'; });
}
Jabberwocky
  • 48,281
  • 17
  • 65
  • 115
Robert Andrzejuk
  • 5,076
  • 2
  • 22
  • 31
2

A full implementation of ArrayView written in C++:

template<typename T>
class ArrayView {
public:
    using value_type = T;
    using const_iterator = const T*;

    ArrayView(T* ptr, size_t size) noexcept : ptr_(ptr), size_(size) {}
  
    template <typename U, size_t N>
    ArrayView(U (&buffer)[N]) noexcept : ArrayView(buffer, N) {}

    // ArrayView<T> to ArraryView<const T>
    // std::vector<T> to ArraryView<const T> or ArraryView<T>
    template <
        typename U,
        // Container has data and size
        typename std::enable_if<
            std::is_convertible<decltype(std::declval<U>().data()), T*>::value &&
            std::is_convertible<decltype(std::declval<U>().size()), std::size_t>::value
        >::type* = nullptr
    >
    ArrayView(const U& u) noexcept : ArrayView(u.data(), u.size()) {}

    T& operator[](int i) noexcept { return ptr_[i]; }
    T const& operator[](int i) const noexcept { return  ptr_[i]; }
    T* data() const noexcept { return ptr_; }
    size_t size() const noexcept { return size_; };

    T* begin() const noexcept { return this->data(); }
    T* end() const noexcept { return this->data() + this->size(); }
    const T* cbegin() const { return this->data(); }
    const T* cend() const { return this->data() + this->size(); }

    std::reverse_iterator<T*> rbegin() const {
        return std::make_reverse_iterator(end());
    }
    std::reverse_iterator<T*> rend() const {
        return std::make_reverse_iterator(begin());
    }
    std::reverse_iterator<const T*> crbegin() const {
        return std::make_reverse_iterator(cend());
    }
    std::reverse_iterator<const T*> crend() const {
        return std::make_reverse_iterator(cbegin());
    }

    ArrayView<T> subview(size_t offset, size_t size) const noexcept { 
        return offset < this->size() ? ArrayView<T>(this->data() + offset, std::min(size, this->size() - offset))
                                     : ArrayView<T>(nullptr, 0);
    }

    ArrayView<T> subview(size_t offset) const noexcept { 
        return subview(offset, this->size());
    }

private:
    T* ptr_;
    size_t size_;
};
Mark Cao
  • 156
  • 7