Suppose I have the following sample code just to instantiate the insert method of std::vector for at least two trivial types:

#include <vector>

void insert(std::vector<int>& v, int const* a, int const* b)
    { v.insert(v.end(), a, b); }

void insert(std::vector<short>& v, short const* a, short const* b)
    { v.insert(v.end(), a, b); }

If you compile this code, the compiler emits two nearly identical copies of the same machine code, one per instantiation (see how it compiles here).

Is it possible to achieve more compact code without rolling my own specialized std::vector that assumes T is trivial? Such a container could discard the type in the actual implementation, so that the heavy lifting is done by calling memcpy and realloc directly. I suspect, though, that this approach is UB according to the standard.

As an example, the compiler has not merged the two instantiations of uninitialized_copy. I realize that it is not allowed to map two different symbols to the same address, but couldn't it at least replace the second copy with a jump to the first? This does not happen even with -Os.
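
To make the type-erasing idea above concrete, here is a minimal sketch of the kind of hand-rolled wrapper I would rather avoid. The names raw_buffer, trivial_vector, append, insert_back and get are hypothetical and exist only for this illustration. The sketch assumes T is trivial, keeps all growth logic in one non-template class, and sidesteps the UB concern by never forming a T* into the byte storage: elements go in and out through byte copies and memcpy only.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical non-template core: the only place where the
// std::vector insert code gets instantiated (for unsigned char).
class raw_buffer {
public:
    // Appends n_bytes raw bytes starting at first.
    void append(void const* first, std::size_t n_bytes) {
        auto const* p = static_cast<unsigned char const*>(first);
        bytes_.insert(bytes_.end(), p, p + n_bytes);
    }
    std::size_t size_bytes() const { return bytes_.size(); }
    unsigned char const* data() const { return bytes_.data(); }

private:
    std::vector<unsigned char> bytes_;
};

// Hypothetical thin typed wrapper: every member is a one-line forwarder,
// so each instantiation should add very little code of its own.
template <class T>
class trivial_vector {
    static_assert(std::is_trivial<T>::value,
                  "this sketch assumes T is a trivial type");

public:
    // Appends the range [a, b) by copying its object representation.
    void insert_back(T const* a, T const* b) {
        buf_.append(a, static_cast<std::size_t>(b - a) * sizeof(T));
    }

    std::size_t size() const { return buf_.size_bytes() / sizeof(T); }

    // Returns element i by value; memcpy avoids reading the byte
    // storage through a T lvalue.
    T get(std::size_t i) const {
        T out;
        std::memcpy(&out, buf_.data() + i * sizeof(T), sizeof(T));
        return out;
    }

private:
    raw_buffer buf_;
};
```

The obvious cost is that this is no longer a std::vector: there is no operator[] returning a reference and no typed iterators, which is exactly why I am asking whether the same effect can be had without it.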

yacc
user877329
  • Would proxying the vectors own `emplace_back` do the trick? – Ted Lyngmo May 31 '19 at 20:33
  • BTW a compiler could simply generate a special entry point for the sole purpose of taking the address of a function while the real entry for calling would be merged. A function code could start with many NOP to have as many entry points as needed. – curiousguy May 31 '19 at 22:14
  • `std::vector<int>` and `std::vector<short>` are two completely independent classes that have nothing to do with each other. It just so happens that the compiler will generate identical code for the same-named methods of each class. Unfortunately, the state of the compiler technology hasn't reached the stage where the compiler realizes and optimizes it away. So, the onus is on you to only use one or the other type, and there are some things that can be done where you mostly think you have two different classes, but really have only one. But that's not what you're asking about. – Sam Varshavchik May 31 '19 at 22:22
  • @SamVarshavchik But the linker could. – curiousguy May 31 '19 at 22:28
  • Only if the linker can prove that it's byte-identical code, and any function calls are also to byte-identical code. That last requirement will be a deal-breaker, here. – Sam Varshavchik May 31 '19 at 22:31
  • @SamVarshavchik Wouldn't it be possible to merge identical functions right before any inlining takes place? Put each function body in a set, and dump the set afterwards. It would be easy to do if each function is compiled in its own context, so that jump labels do not affect the result. Maybe the latter would make other optimizations harder, would it? – user877329 Jun 01 '19 at 05:50
  • @SamVarshavchik "So, the onus is on you to only use one or the other type". How do I do this in practice? My code has instantiations for the complete set of primitive types: 8 integral types and 2 floating point types. In addition to that, there are vec4s, increasing the number further. – user877329 Jun 01 '19 at 05:54
  • Hint: what is `size_t`? It's not an integer type. It's an alias for whatever integer type best represents an object's size, on a particular platform. Instead of using "8 integer type", define meaningful labels for what these "8 integer types" are supposed to represent. If, for example, they are aliases for only four actual integer types, now you only end up with four instances of each template functions. C++11 even provides these aliases for you: `int8_t`, `int16_t`, `int32_t`, and `int64_t`. If you only use those, you will never instantiate two templates for a 16-bit integer. – Sam Varshavchik Jun 01 '19 at 12:54
  • @SamVarshavchik Sort of true, but *any* builtin type will produce exactly the same code. Any .int[0-9]*_t produces the same code, so there are 8 identical instantiations. Add two or three more if you add a trivial instantiation of half (ILM's implementation is currently not trivial, but should be), and you have 11. Then add vec4_t, and possibly also std::byte. 13 instantiations of exactly the same code! – user877329 Jun 01 '19 at 16:09

0 Answers