
I am analyzing columns in one or more large data tables loaded from binary files. Every column has one of several predefined types and is essentially a vector, so I defined the column type as a std::variant of several vector types. In the toy example below, I use only two possible types (in my project, there are 6 or 7). Basically, I need a function that loads a single column from a given file. To avoid calling the copy constructor of a vector, I'd like to keep every loaded column on the heap, that is, hold it in a unique_ptr.

I tried implementing such a function, but the C++ compiler cannot assign the loaded vector pointer to my variant pointer variable. I am giving an example code snippet below:

#include <cstdint>
#include <iostream>
#include <memory>
#include <string>
#include <variant>
#include <vector>

using column_t = std::variant<
    std::vector<std::uint32_t>,
    std::vector<double>>;

std::unique_ptr<column_t> load(const std::string& file_name) {
    column_t* result;
    // The column type is encoded in the file contents, but
    // in this toy example, we are not reading from the file
    switch (file_name.size() % 2) {
        case 0:
            result = new std::vector<std::uint32_t>();
            (std::get<std::vector<std::uint32_t>>(*result)).push_back(5);
            break;
        default: // 1
            result = new std::vector<double>();
            (std::get<std::vector<double>>(*result)).push_back(3.14);
    }
    return std::unique_ptr<column_t>(result);
}

inline std::size_t get_length(const column_t& v) {
    switch (v.index()) {
        case 0:
            return std::get<0>(v).size();
        default: // 1
            return std::get<1>(v).size();
    }
}

int main()
{
    std::unique_ptr<column_t> values = load("myfile");
    std::cout << "Loaded a vector of length "
              << get_length(*values.get()) << std::endl;
    return 0;
}

I know that I can rewrite the function to simply return a column_t object. However, using a smart pointer would be more intuitive in the downstream analysis tasks. Could anyone point me to the correct construct for assigning a pointer (or directly a unique_ptr instance) to the variable result?

Yassen
  • *In order to avoid calling the copy constructor of a vector, I'd like to keep every loaded column in the heap memory, that is, have it as a unique_ptr* FYI, `vector` can be moved, which is pretty much equivalent to using a `unique_ptr` in that sense. – super Mar 12 '23 at 14:37
  • In your `get_length` function, I'd recommend using std::visit. – Coral Kashri Apr 01 '23 at 13:18
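
The std::visit suggestion in the comment above could look roughly like the following. This is only a sketch, assuming every alternative in column_t has a size() member, which holds for the vector types in the question:

```cpp
#include <cstddef>
#include <cstdint>
#include <variant>
#include <vector>

using column_t = std::variant<
    std::vector<std::uint32_t>,
    std::vector<double>>;

// The generic lambda is instantiated once per alternative, so no
// switch on index() is needed and new column types are picked up
// automatically as long as they provide size().
inline std::size_t get_length(const column_t& v) {
    return std::visit(
        [](const auto& vec) -> std::size_t { return vec.size(); },
        v);
}
```

With a unique_ptr<column_t> named values, as in the question's main, the call would be get_length(*values).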

1 Answer


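This works as long as the first alternative in the variant is default-constructible, because std::make_unique<column_t>() value-initializes the variant, which default-constructs its first alternative.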

std::unique_ptr<column_t> load(const std::string& file_name) {
    std::unique_ptr<column_t> result = std::make_unique<column_t>();
    switch (file_name.size() % 2) {
        case 0:
            result->emplace<std::vector<std::uint32_t>>();
            (std::get<std::vector<std::uint32_t>>(*result)).push_back(5);
            break;
        default: // 1
            result->emplace<std::vector<double>>();
            (std::get<std::vector<double>>(*result)).push_back(3.14);
    }
    return result;
}

Alternatively you could write

std::unique_ptr<column_t> result;
...
result = std::make_unique<column_t>(
      std::in_place_type_t<std::vector<std::uint32_t>>{});
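
Spelled out as a complete function, this alternative might look like the sketch below. It assumes the same toy type selection as in the question. The name load_in_place is hypothetical, and std::in_place_type<T> is the standard variable-template shorthand for std::in_place_type_t<T>{}:

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <variant>
#include <vector>

using column_t = std::variant<
    std::vector<std::uint32_t>,
    std::vector<double>>;

// Hypothetical helper: constructs the chosen alternative directly
// inside the heap-allocated variant, so no default-constructed first
// alternative is created and then replaced.
std::unique_ptr<column_t> load_in_place(const std::string& file_name) {
    std::unique_ptr<column_t> result;
    if (file_name.size() % 2 == 0) {
        result = std::make_unique<column_t>(
            std::in_place_type<std::vector<std::uint32_t>>);
        std::get<std::vector<std::uint32_t>>(*result).push_back(5);
    } else {
        result = std::make_unique<column_t>(
            std::in_place_type<std::vector<double>>);
        std::get<std::vector<double>>(*result).push_back(3.14);
    }
    return result;
}
```

Unlike the make_unique<column_t>() version, this form does not require the first alternative to be default-constructible.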

As noted in the comments, your concerns about copy constructors are somewhat unfounded because vectors have cheap move constructors. The move constructor of the variant itself is somewhat more expensive, so I understand trying to avoid that. However, moving a vector into the variant is cheap and makes the code more readable.

std::unique_ptr<column_t> load(const std::string& file_name) {
    switch (file_name.size() % 2) {
        case 0: {
            std::vector<std::uint32_t> result;
            result.push_back(5);
            return std::make_unique<column_t>(std::move(result));
        }
        default: { // 1
            std::vector<double> result;
            result.push_back(3.14);
            return std::make_unique<column_t>(std::move(result));
        }
    }
}
Homer512