8

I have an array char* source and a vector std::vector<char> target. I'd like to make the vector target point to source in O(1), without copying the data.

Something along these lines:

#include <vector>

char* source = new char[3] { 1, 2, 3 };
std::vector<char> target;
target.resize(3);

target.setData(source); // <- Doesn't exist
// OR
std::swap(target.data(), source); // <- swap() does not support char*
delete[] source;

Why is it not possible to manually change where a vector points to? Is there some specific, unmanageable problem that would arise if this was possible?

NutCracker
Antti_M
  • 9
    What would happen to size and capacity if you just swapped in some random pointer? – Retired Ninja Dec 30 '20 at 10:31
  • 4
    Rust has this as [`Vec::from_raw_parts`](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.from_raw_parts), C++ has no equivalent. Why do you want this? Is there a reason you cannot fill a vector directly with the data? Otherwise, unique_ptr is closer to what you have (and, if C++ ever supported this, would be the thing it accepts during construction). – GManNickG Dec 30 '20 at 10:33
  • @RetiredNinja That's definitely something to worry about. I would say, however, that C++ is no stranger to having to deal with potentially destructive memory stuff. In my example, I first resized the vector. – Antti_M Dec 30 '20 at 10:34
  • 3
    Not only size and capacity, how should `std::vector` delete that pointer? With `delete` or `delete[]`? And what if a custom allocator was specified for `std::vector`? – Evg Dec 30 '20 at 10:35
  • 1
    `std::vector_view` (hypothetical)? – Paul Sanders Dec 30 '20 at 10:37
  • 9
    There is [`span`](https://en.cppreference.com/w/cpp/container/span); I never used it because it's in C++20, but it looks like the correct class to use. – anatolyg Dec 30 '20 at 10:41
  • @Antti_M -- Create your own `std::span` class. As the previous comment suggested, that is basically what you want to accomplish -- you want all of the functionality of `std::vector` without resizing or memory management. – PaulMcKenzie Dec 30 '20 at 10:52
  • 3
    frame challenge : As long as you control the allocation (incl. re-allocation and de-allocation) for `source` in your original code, typically, this can be done the other way around : `std::vector v(N, '\0'); char* source = v.data(); /* fill source */`. – Sander De Dycker Dec 30 '20 at 10:57
  • For one thing, `target.setData(source);` uses camelCase for the function name, and the standard never does that. – Pete Becker Dec 30 '20 at 15:23
  • @Antti_M *Why can't we change data pointer of std::vector?* -- Assume you could do this -- what happens here? `int *ptr = (int *)GlobalAlloc(50);` -- The `GlobalAlloc` function is a Windows API function that allocates memory. What do you think will happen if you gave `ptr` to `std::vector` to handle? Sparks will be flying out your computer. – PaulMcKenzie Dec 30 '20 at 16:23

3 Answers

7

The C++ vector class supports adding and deleting elements while guaranteeing that they stay contiguous in memory. If you could initialize a vector with an existing memory buffer and then add enough elements to it, it would either overflow that buffer or require reallocation.

The interface of vector assumes that it manages its internal buffer, that is, it can allocate, deallocate, resize it whenever it wants (within spec, of course). If you need something that is not allowed to manage its buffer, you cannot use vector - use a different data structure or write one yourself.
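As a minimal sketch of that buffer management (the function name `grows_past_capacity` is made up for illustration), the snippet below fills a vector up to its capacity and then adds one more element. At that point the vector has to obtain a larger buffer on its own, which is exactly the step an externally supplied buffer could not survive:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true if pushing one element past the current
// capacity made the vector acquire a larger buffer.
bool grows_past_capacity() {
    std::vector<char> v;
    v.reserve(3);                                  // capacity is now at least 3
    const std::size_t old_capacity = v.capacity();
    assert(old_capacity >= 3);

    // Fill the vector exactly up to its capacity; no reallocation happens yet.
    while (v.size() < old_capacity) {
        v.push_back('x');
    }

    v.push_back('y');                              // exceeds old_capacity, so the vector reallocates

    return v.capacity() > old_capacity;
}
```

After the last `push_back`, any pointer previously obtained from `v.data()` is invalidated, and the old buffer has been released through the vector's allocator.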

You can create a vector object by copying your data (using a constructor with two pointers or assign), but this is obviously not what you want.
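For illustration, a small sketch of the copying route using the two-pointer (range) constructor; `copy_into_vector` is a made-up helper name, not a standard function. The vector ends up with its own storage, so the work is O(n) and the caller still has to free `source`:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies `len` chars from `source` into a new vector.
// The vector owns a separate buffer; `source` stays the caller's responsibility.
std::vector<char> copy_into_vector(const char* source, std::size_t len) {
    std::vector<char> target(source, source + len);   // O(len) copy
    assert(len == 0 || target.data() != source);      // distinct storage
    return target;
}
```

A caller would typically write `auto target = copy_into_vector(source, 3); delete[] source;` and keep using `target` afterwards, since it no longer depends on the original array.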

Alternatively, you can use std::string_view, which looks like almost, or maybe exactly, what you need.

anatolyg
  • That's what a vector has to do anyways, right? In this use case, you would have to of course resize the vector first, but after that I don't see a problem. – Antti_M Dec 30 '20 at 10:43
  • 1
    Not sure what you mean by "that"; anyway, you are trying to take away from `vector` something that it wants to support (reallocation); in your use-case this may make sense, but in general this is not supported by the language. – anatolyg Dec 30 '20 at 10:49
  • 2
    @Antti_M If you resize the vector, it internally allocates memory. It does not make sense then to throw this memory away and replace it with an externally provided buffer. – Daniel Langr Dec 30 '20 at 10:49
3

std::vector is considered to be the owner of its underlying buffer. You can replace the buffer's contents, but that means copying the source data into the vector's own storage, which you don't want (as stated in the question).

You could do the following:


#include <vector>

int main() {
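    char* source = new char[3] { 1, 2, 3 };
    std::vector<char> target;
    // assign() sizes the vector itself, so no prior resize() is needed.
    target.assign(source, source + 3);  // copies the three chars
    delete[] source;                    // safe: target holds its own copy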
    return 0;
}

but again std::vector::assign:

Replaces the contents with copies of those in the range [first, last).

So copy is performed again. You can't get away from it while using std::vector.

If you don't want to copy data, then you should use std::span from C++20 (or create your own span) or use std::string_view (which looks suitable for you since you have an array of chars).

1st option: Using std::string_view

Since you are limited to C++17, std::string_view might be perfect for you. It constructs a read-only view of the first 3 characters of the character array, starting with the element pointed to by source.

#include <iostream>
#include <string_view>

int main() {
    char* source = new char[3] { 1, 2, 3 };

    std::string_view strv( source, 3 );

    delete[] source;

    return 0;
}
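To make the lifetime rule explicit, here is a hedged sketch that actually reads through the view; `sum_bytes` and `sum_source` are made-up names for illustration. The view only stores a pointer and a length, so it must be used before the array it refers to is freed:

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: sums the bytes visible through a view.
// No characters are copied; the view is just a pointer plus a length.
int sum_bytes(std::string_view view) {
    int total = 0;
    for (char c : view) {
        total += c;
    }
    return total;
}

// Hypothetical driver mirroring the example above.
int sum_source() {
    char* source = new char[3]{1, 2, 3};
    std::string_view view(source, 3);
    assert(view.data() == source);      // the view aliases source, O(1) to create
    const int total = sum_bytes(view);  // use the view BEFORE freeing source
    delete[] source;                    // from here on, view is dangling
    return total;
}
```

Note that std::string_view gives read-only access; if the elements need to be modified through the view, a span-like type is the better fit.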

2nd option: Using std::span from C++20

std::span comes from C++20, so it might not be an option for you, but you might be interested in what it is and how it works. You can think of std::span as a slightly generalized version of std::string_view: it refers to a contiguous sequence of objects of any type, not just characters. The usage is similar to that of std::string_view:

#include <span>
#include <iostream>

int main() {
    char* source = new char[3] { 1, 2, 3 };

    std::span s( source, 3 );

    delete[] source;

    return 0;
}

3rd option: Your own span

If you are limited to C++17, you can consider writing your own span class. It might still be overkill, but let me show you (by the way, take a look at this more elaborate answer):

#include <cstddef>

template<typename T>
class span {
   T* ptr_;
   std::size_t len_;

public:
    span(T* ptr, std::size_t len) noexcept
        : ptr_{ptr}, len_{len}
    {}

    // ptr_[i] already yields a T&; no extra dereference is needed.
    T& operator[](std::size_t i) noexcept {
        return ptr_[i];
    }

    T const& operator[](std::size_t i) const noexcept {
        return ptr_[i];
    }

    std::size_t size() const noexcept {
        return len_;
    }

    T* begin() noexcept {
        return ptr_;
    }

    T* end() noexcept {
        return ptr_ + len_;
    }
};

int main() {
    char* source = new char[3] { 1, 2, 3 };

    span s( source, 3 );

    delete[] source;

    return 0;
}

So the usage is the same as with C++20's std::span.

NutCracker
  • 1
    Thanks for the answer! However, I was not really struggling to work around this. Instead, I was interested in precisely what I asked: Why can we not change the underlying pointer? Upvoted, but not quite the accepted answer I was looking for. :) – Antti_M Dec 30 '20 at 13:18
-1

std::string_view and std::span are good things to have (if your compiler version supports them). Rolling your own equivalent is OK too.

But some people miss the whole point of why one would want to do exactly this with a vector:

  • Because you have an API that gives you a Struct[] plus a size_t and hands you ownership
  • and you also have an API that accepts std::vector<Struct>

Ownership could then easily be transferred into the vector, with no copies made!

You might say: but what about custom allocators, memory-mapped file pointers, or ROM memory that I could then set as the pointer?

  • If you are already about to set vector internals, you should know what you are doing.
  • In those cases you can try to supply a "correct" allocator to your vector.

Maybe compiling such code should produce a warning, yes, but it would be nice if it were possible.

I would do it this way:

  • std::vector would get a constructor that takes a std::vector::raw_source
  • std::vector::raw_source is a struct of uint8_t*, size_t, and bool (for now)
  • bool takeOwnership: tells whether we are taking ownership (false => copy)
  • size_t size: the size of the raw data
  • uint8_t* ptr: the pointer to the raw data
  • When taking ownership, vector resizing and the like use the vector's allocation strategy, as provided through your template parameters anyway; nothing new here. If that does not fit the data, you are doing something wrong!

Yes, the API I describe looks more complicated than a single "set_data(..)" or "own_memory(...)" function, but I tried to design it so that anyone who ever uses it pretty much automatically understands its implications. I would like this API to exist, but maybe I am still overlooking other sources of problems?
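For comparison only, and not as the proposed vector API: the closest thing standard C++ offers today for adopting a `new[]`-allocated buffer in O(1) is std::unique_ptr<T[]>, as mentioned in the comments on the question. The sketch below uses made-up names (`OwnedBuffer`, `adopt`) and assumes the buffer really was allocated with `new[]`; it does not help with an API that insists on std::vector.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical pairing of an owned buffer and its length.
struct OwnedBuffer {
    std::unique_ptr<char[]> data;   // releases the buffer with delete[]
    std::size_t size;
};

// Hypothetical helper: takes ownership of a buffer allocated with new[].
// O(1), no copy. Assumption: `raw` came from new char[...].
OwnedBuffer adopt(char* raw, std::size_t size) {
    assert(raw != nullptr || size == 0);
    return OwnedBuffer{std::unique_ptr<char[]>(raw), size};
}
```

The deallocation question raised in the comments shows up here too: the deleter is fixed by the type, so a buffer obtained any other way would need a unique_ptr with a matching custom deleter.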

prenex