
I'm introducing myself to C++, and sadly it's starting to seem like the support for dynamically created arrays of fixed size (but with the size known only at run time) is very poor in C++, as new[] can't call an arbitrary user-specified constructor with user-set arguments.

Consider a class template A that has a number of constructors, each with some parameters. Assume that a constructor without parameters would be useless (I don't want to have to write one if I essentially don't need it). I guess the following doesn't matter, but, just in case: as far as data members go, assume that A contains only a possibly large std::vector<Internal> and an integer counter (Internal is a private class; T and S are the template parameters of A).

Assume we want n instances of A stored contiguously in memory as an array, where n is determined at run time and constant afterwards. We want to be able to create and initialize the structure with a single call that passes arguments to a constructor of A, or something similar. So each instance in the array gets the same, but programmatic, initialization. EDIT: sorry, I didn't mean to say I want O(1) initialization, as that's impossible; I just wanted O(n) initialization, but so that I can create the array in one statement, i.e., so that I don't have to write an initialization loop for every array I create.

A possible, but suboptimal, solution is std::vector<A<T,S>>, but assume we can't live with the inefficiency. (Remember that std::vector supports resizing.)

How to implement and/or use an efficient solution with a nice API?

I would prefer a solution that doesn't reimplement half of the standard library, i.e. consider C++20 features and the standard library available for the implementation. Also, don't make me violate the C++ aliasing rules.

A possibly related question is why is such a "fixed_size_vector" class missing from the standard library?

(BTW: not that it matters, but please don't say "just use vector", because in this case I'm indeed going to go with the mentioned suboptimal solution, as the performance is not significant for my toy program, but in the real world the performance will matter one day and I want to be prepared. EDIT: I did not mean I want to optimize my toy program, rather I was referring to the fact that one day I will have to optimize some other program.)

EDIT: answering to some commenters: wrapping std::vector could provide the right abstraction, but it would be unnecessarily inefficient. A comment linked a question whose top answer explains this nicely:

dynarray is smaller and simpler than vector, because it doesn't need to manage separate size and capacity values, and it doesn't need to store an allocator

(dynarray here was a proposed addition to stdlib that seems to be what I wanted, except that it was also supposed to rely on special compiler support for some of its semantics). Of course, this difference compared to std::vector won't matter most of the time, but it would still be good if I was able to simply use the right tool for the job.

user2373145
  • Given that there are `n` instances, it is logically impossible for them to be constructed "with a single call". By definition there will be exactly `n` calls to the class's constructor. Or, construct one object, and use the overloaded `std::vector` constructor to copy-construct `n` instances of the object. Mission accomplished. – Sam Varshavchik Nov 06 '20 at 19:01
  • Just use `std::vector` (even though you said not to - it works). – n. m. could be an AI Nov 06 '20 at 19:04
  • *please don't say "just use vector"* But that is the solution. It has all the interface that you want/need. If your only concern is someone could expand the vector, then wrap `vector` in your own class and don't provide any of the functions that could increase the size of the vector. You could also just create a `const vector` and construct it with all of the objects you want in it. The objects in the vector will not be `const`, but since the vector is you'll get an error if you try and change the state of the vector itself. – NathanOliver Nov 06 '20 at 19:05
  • _"in the real world the performance will matter one day and I want to be prepared"_ You are absolutely going to want to drop that mentality. Real world code is about maintainability and scalability _first_. You don't program for computers, you program for people. Optimizations matter for bottlenecks and obvious cleanups; but if you try to micro-optimize everything you will not have a very long career as a C++ developer – Human-Compiler Nov 06 '20 at 19:16
  • Also it's generally **not** desirable to pass all the same arguments to N instances of `T`'s constructor. It might work if you want to _copy_ everything, but this would be horribly broken with move semantics. For example, passing `unique_ptr` would pass a possibly valid pointer only to the first instance, and `nullptr` to all later instances. This feature doesn't exist for a good reason – Human-Compiler Nov 06 '20 at 19:19
  • **Regarding your edit:** You appear to not understand what efficiency actually is and how it relates to `std::vector`. You can reserve space in a vector up-front which becomes equivalent to allocating storage for `n` objects _before you call the constructors_. I suggest you actually learn what `std::vector` is capable of doing and how it works rather than discounting **the correct answer** – Human-Compiler Nov 06 '20 at 19:32
  • There once was [`std::dynarray`](https://stackoverflow.com/q/19111028/8586227), which was more or less what you want (albeit with additional magic that both justified its distinction from `std::vector` and helped get it killed). – Davis Herring Nov 06 '20 at 19:54
  • _"but it would still be good if I was able to simply use the right tool for the job."_ `std::vector` has been, and continues to be, the right tool for the job. I'm not sure why you appear to wilfully misunderstand this point. Requiring storage for a capacity and size doesn't make `std::vector` inefficient. Most hardware out there these days does _not_ need to save the 8 bytes that this would cost, and at worst `std::pmr::vector` can be used to control _where_ the allocation lives. – Human-Compiler Nov 06 '20 at 20:26
  • "but assume we can't live with the inefficiency. (Remember that std::vector supports resizing.)"; if you don't use it, you don't pay for it. – Jose Nov 06 '20 at 20:32
  • @Jose see the quote about dynarray at the end of the question. `std::vector`'s ability to resize does carry a cost with it. Of course, that cost will not be significant for most applications, but that's not really relevant here. – user2373145 Nov 06 '20 at 20:38
  • @user2373145 Actually that is **exactly** what's relevant here. A `std::dynarray`-like object _cannot exist_ in C++ without changing the language, and all it buys you is saving a few bytes used for the internal pointers of `vector` on the stack that you then immediately trade off by allocating all the storage on the stack instead. This makes it _much easier_ to produce a stack overflow. If you want it on the stack, use an allocator -- and this can be done without language changes at the cost of a few bytes. If those few bytes are important, you're probably not using `std::vector` anyway – Human-Compiler Nov 06 '20 at 20:42
  • Yeah, I know, I didn't want stack allocation; rather, I just used `dynarray` as an example of how fewer data members would be necessary in the "fixed_size_vector" class compared to `std::vector` – user2373145 Nov 06 '20 at 20:52
  • At _most_ you would have 1-less pointer member in a `fixed_size_vector` -- and that's only if you restrict the flexibility of the class such that you must call the same constructor on all entries during the class's construction, _or_ pass in the entries you want constructed such as through an initializer list. It's possible, but it's a horribly restrictive design -- all at the cost of 1 internal pointer. You seem to be conflating the concept of efficiency with storage size. – Human-Compiler Nov 06 '20 at 21:17

2 Answers


There is a proposal to add a fixed capacity vector to the standard.
Note that this proposal requires the capacity to be known at compile-time, so it's not applicable in your case.

There are also some open source libraries that implement one, e.g., Boost's static_vector. If you really want a fixed-capacity vector, you can use one of the open source implementations that exist out there.
If you really know what you're doing, you could write one on your own, but that's not the case for >99% of C++ users.

However, it should be noted that reserve()ing space on a vector will probably have the effect you want, and there's probably no need for an actual fixed capacity vector.
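
As a rough illustration of the reserve() point, the sketch below reserves once and then checks that the vector's buffer never moves while elements are appended. The helper name fill_without_reallocating is made up for this example and is not part of any library.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: reserve once, append n strings, and report whether
// the vector's buffer stayed at the same address the whole time.
inline bool fill_without_reallocating(std::vector<std::string>& out, std::size_t n)
{
    out.clear();
    out.reserve(n);                                // capacity for n elements up front
    const std::string* const start = out.data();   // address of the buffer after reserve
    for (std::size_t i = 0; i < n; ++i) {
        out.emplace_back("item");                  // constructed in place, capacity already suffices
    }
    return out.data() == start && out.size() == n;
}
```

After reserve(n), the standard guarantees that no reallocation happens until the size would exceed the capacity, so the function is expected to return true for any n.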

Omry
  • The proposal you linked wouldn't be applicable, as OP states that the size is only known at *runtime*. The only real solution for OP is to `reserve` the space up-front – Human-Compiler Nov 06 '20 at 19:37

Since you mention that the size is only known at runtime this is exactly what std::vector is meant to be used for.

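#include <cstddef>
#include <vector>

template <typename T, typename... Args>
auto make_vector(std::size_t size, const Args&... args) -> std::vector<T>
{
    auto result = std::vector<T>{};
    result.reserve(size); // whatever the known size is; one allocation up front
    // std::size_t index avoids a signed/unsigned comparison with size
    for (std::size_t i = 0; i < size; ++i) {
        result.emplace_back(args...); // construct each element in place
    }
    return result;
}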

// Use like:

auto vec = make_vector<std::string>(20, "hello world");

This will pre-allocate enough room for size entries of type T, and the loop will call T's constructor with whatever arguments you pass it.

Be aware that:

  • No additional constructors are called.
  • No extra memory is used.
  • No copies or relocations are performed.
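  • The returned vector is not copied: with C++17 or above, compilers elide the copy in practice (NRVO), and even where they don't, the vector is moved, which hands over the buffer without touching the elements. (Strictly speaking, C++17's guaranteed copy elision covers only prvalues, not a named local like result.)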
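
One way to check the claims in the list above is to count constructor calls. The sketch below is only an illustration: Probe and counters() are hypothetical names invented for it, and make_vector is repeated so that the block compiles on its own.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical bookkeeping for the experiment.
struct Counters { int constructed = 0; int copied = 0; int moved = 0; };
inline Counters& counters() { static Counters c; return c; }

// Hypothetical probe type that records how each instance came to exist.
struct Probe {
    explicit Probe(int v) : value(v) { ++counters().constructed; }
    Probe(const Probe& other) : value(other.value) { ++counters().copied; }
    Probe(Probe&& other) noexcept : value(other.value) { ++counters().moved; }
    int value;
};

// Same shape as make_vector in the answer.
template <typename T, typename... Args>
std::vector<T> make_vector(std::size_t size, const Args&... args)
{
    std::vector<T> result;
    result.reserve(size);
    for (std::size_t i = 0; i < size; ++i) {
        result.emplace_back(args...);   // one constructor call per element
    }
    return result;
}
```

Calling make_vector<Probe>(5, 42) should leave the counters at 5 constructions, 0 copies and 0 moves. Even if the compiler does not elide the return, moving a vector hands over its buffer and leaves the elements alone.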

Doing this is as optimal as you can get whether you use a specialized container or otherwise. This is why every experienced C++ developer will tell you the same thing: std::vector is the solution.[2]

Note: The above function uses const Args&... for propagation and not proper forwarding references, since rvalue references could result in use-after-move bugs.[1]


A specialized container like a fixed_size_vector that you mention will either be one of two things:

  • Fixed at compile-time on the max size, in which case it wouldn't work for you since you mentioned the size is only known at runtime
  • Fixed at runtime on the max size, in which case it will do exactly what I suggested above, since it will reserve the storage space up-front.
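
For the second case, a minimal sketch of such a container could simply wrap std::vector and leave out every operation that changes the size. The class below is hypothetical (fixed_size_vector is the name used in the question, not an existing library type), and it still pays for the capacity field debated in the comments.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical wrapper: the size is chosen at run time in the constructor,
// and the interface omits push_back, resize, and similar operations.
template <typename T>
class fixed_size_vector {
public:
    // Constructs `size` elements, each from the same (copied) arguments.
    template <typename... Args>
    explicit fixed_size_vector(std::size_t size, const Args&... args)
    {
        data_.reserve(size);
        for (std::size_t i = 0; i < size; ++i) {
            data_.emplace_back(args...);
        }
    }

    std::size_t size() const noexcept { return data_.size(); }

    T&       operator[](std::size_t i)       { return data_[i]; }
    const T& operator[](std::size_t i) const { return data_[i]; }

    auto begin()       noexcept { return data_.begin(); }
    auto end()         noexcept { return data_.end(); }
    auto begin() const noexcept { return data_.begin(); }
    auto end()   const noexcept { return data_.end(); }

private:
    std::vector<T> data_;   // contiguous storage, allocated once in the constructor
};
```

With this, something like fixed_size_vector<std::string> names(20, "hello world"); creates and initializes the array in one statement. Note that whole-object copy assignment can still replace the contents; a stricter design would delete it.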

It is not possible at the language level to dynamically construct N objects only known at runtime using a custom constructor. Full stop. This could be done if the sequence is known at compile-time, but not runtime.

C++ is statically compiled, so we cannot variadically expand a runtime n value into a pack of T{...} constructor calls; it's simply not possible. This means there will be a loop every time. Thus the most optimal thing you can do is allocate n objects once, and call T's constructor n times.
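
For contrast, here is roughly what the compile-time case looks like. When N is a constant, an index sequence can be expanded into N constructor calls with no loop. The names make_array and make_array_impl are assumptions made for this sketch, not standard library functions.

```cpp
#include <array>
#include <cstddef>
#include <utility>

namespace detail {
// Expands into T(args...), T(args...), ... once per index in Is.
template <typename T, std::size_t... Is, typename... Args>
std::array<T, sizeof...(Is)> make_array_impl(std::index_sequence<Is...>, const Args&... args)
{
    // The cast to void discards the index; only the number of indices matters.
    return {{ (static_cast<void>(Is), T(args...))... }};
}
} // namespace detail

// Hypothetical helper: N must be a compile-time constant.
template <typename T, std::size_t N, typename... Args>
std::array<T, N> make_array(const Args&... args)
{
    return detail::make_array_impl<T>(std::make_index_sequence<N>{}, args...);
}
```

A call such as make_array<int, 3>(5) yields an array of three 5s. There is no equivalent of std::make_index_sequence for a value that only exists at run time, which is why the run-time case always ends up with a loop somewhere.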


[1] A short-hand syntax for passing a list of arguments to all of a sequence's constructors is not a good general solution in C++. In fact, it would be suboptimal. This would either force copies via const lvalue references, or it would allow for rvalues -- in which case only the first object constructed will get a valid value, and everything after will receive a moved-from object! Just imagine passing a unique_ptr to a sequence of T's. Only the first instance will get a valid pointer, and everything else will receive nullptr.
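
The unique_ptr problem from footnote [1] can be reproduced with a deliberately flawed variant of make_vector that uses forwarding references. Holder and make_vector_forwarding are hypothetical names made up for this sketch.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical type that takes ownership of a pointer.
struct Holder {
    explicit Holder(std::unique_ptr<int> p) : ptr(std::move(p)) {}
    std::unique_ptr<int> ptr;
};

// Deliberately flawed: the SAME arguments are forwarded on every iteration,
// so an rvalue argument is consumed by the first element.
template <typename T, typename... Args>
std::vector<T> make_vector_forwarding(std::size_t size, Args&&... args)
{
    std::vector<T> result;
    result.reserve(size);
    for (std::size_t i = 0; i < size; ++i) {
        result.emplace_back(std::forward<Args>(args)...);   // moved from after i == 0
    }
    return result;
}
```

Calling make_vector_forwarding<Holder>(3, std::make_unique<int>(7)) gives the first Holder the pointer and the other two a null unique_ptr, because a moved-from unique_ptr is guaranteed to be empty. For most other types the moved-from state is unspecified, which is arguably worse.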

[2] Honestly, about the only real optimization you might be able to make on this solution would be to use a custom allocator, such as a std::pmr::vector with a stack-allocated memory buffer resource.
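
A minimal sketch of the allocator idea in footnote [2], assuming C++17 and a buffer size of 1024 bytes picked arbitrarily for the example. The function name sum_first_n_on_stack is hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <memory_resource>
#include <vector>

// Sums 0..n-1 using a vector whose storage comes from a stack buffer.
// Assumption: n * sizeof(int) fits in the buffer; if it does not, the
// resource falls back to its upstream (by default, the heap).
inline int sum_first_n_on_stack(std::size_t n)
{
    std::array<std::byte, 1024> buffer;   // stack storage for the elements
    std::pmr::monotonic_buffer_resource pool(buffer.data(), buffer.size());

    std::pmr::vector<int> values(&pool);  // allocates from `pool`
    values.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        values.emplace_back(static_cast<int>(i));
    }

    int total = 0;
    for (int v : values) total += v;
    return total;
}
```

The vector must not outlive the buffer or the resource, so this pattern suits short-lived, function-local arrays rather than data that gets returned to the caller.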


Footnote

I strongly advise you to get over the "efficiency first" mentality. Most developers' intuition on what is and is not efficient is wrong; this is why profilers are so important. Things like speculative execution, cache locality, and pipelining play a huge role in performance -- and these things are far more complex than simply constructing a dynamic array of objects.

Real software is written for other developers, not for the machine. It's better to have code that is maintainable and scalable, and optimized in places where bottlenecks have been identified through proper tooling.

Human-Compiler
  • "This would either force copies via const lvalue references" - I think this is what I want, as the constructors take arguments that are cheap to copy? – user2373145 Nov 06 '20 at 20:05
  • Regarding the footnote: a lot of you here are telling me I should get over the "efficiency first" mentality, or the like, but note that I *did* ask for a solution with a nice API. The `reserve` call, e.g., is one extra statement that is not in essence necessary. I'm not just putting efficiency over everything else, I just want the proper tool for the job (when the job arises, at least). – user2373145 Nov 06 '20 at 20:10
  • You made several edits regarding "efficiency", not clean APIs. I've seen a lot of developers fall victim to the trap of "efficiency first". It leads to brittle over-engineered unmaintainable solutions that were added to solve a specific problem, but actually produced more problems. It's a bad mentality to have. At any rate, I've updated my answer to contain a simple API that satisfies your request. – Human-Compiler Nov 06 '20 at 20:17