
I have a std::vector<std::vector<double>> that I want to convert into a torch::Tensor in libtorch. However, it seems that torch::tensor() and torch::from_blob() can't be used for this purpose!

I tried to use c10::ArrayRef and then convert the data into a torch::Tensor with c10::ArrayRef<std::vector<std::vector<double>>> res(myvecs), but this also seems useless, as I can't find a way to turn it into a torch::Tensor.

How should I go about this conversion in libtorch? What are my options other than, e.g.:

auto tensor = torch::zeros({ 46,85 });
for (size_t i = 0; i < 46; i++)
{
   for (size_t j = 0; j < 85; j++)
   {
       tensor[i][j] = probs[i][j];
   }
}
halfer
Hossein

2 Answers


The easiest way would be to use a plain std::vector<double> instead of a vector of vectors. You would have contiguous memory, and torch::from_blob would work (as mentioned in the other answer).

If that is not possible or convenient, I suggest the following workaround. I assume that your vector is an (n, m) matrix (i.e., all n inner vectors have the same size m):

int n = 5, m = 4;
// Just creating some dummy data for example
std::vector<std::vector<double>> vect(n, std::vector<double>(m, 0)); 
for (int i = 0; i < n; i++)
    for (int j = 0; j < m; j++)
        vect[i][j] = i+j;

// Copying into a tensor
auto options = torch::TensorOptions().dtype(at::kDouble);
auto tensor = torch::zeros({n,m}, options);
for (int i = 0; i < n; i++)
    tensor.slice(0, i,i+1) = torch::from_blob(vect[i].data(), {m}, options);

Edit: you may need to add a call to clone() in case you cannot ensure that the vector will outlive the tensor (because from_blob does not take ownership, so its data becomes invalid when the vector is destroyed).

trialNerror
  • Thanks a lot. How does this compare to the simple for loop? Is there any parallelization/optimization going on here, or are they the same performance-wise? – Hossein Aug 18 '20 at 11:34
  • Thanks, got it. I completely forgot about `from_blob`, apparently. Greatly appreciate it – Hossein Aug 18 '20 at 11:35
  • 1
    Well, you would need a benchmark to be sure of that, but I believe it should be faster than a manually made for loop. Most torch operations rely on BLAS, which are much more efficient than anything you would do by hand (however this is a very simple operation : one vector copied in another, so BLAS optimization is not that big here). In addition to this, if you first copied the 2D vector in a 1D vector to then used `from_blob`, each element would be copied twice instead of only once in my answer. – trialNerror Aug 18 '20 at 11:39
  • Thanks, really appreciate it. By the way, if I were to generalize your method to vectors of higher dimensions, e.g. shape = (1,2,4,6), how should I go about it? Should I first flatten them into 1D and then use from_blob? – Hossein Aug 18 '20 at 12:03
  • 1
    In such situation, the easy generalization would be to loop over the first 3 dimensions (instead the 1 for loop I have here) and slice as I did here. However, I believe using vectors of vectors of vectors ... would get really ugly really fast, so you'd probably better think of using another data stucture, with contiguous memory. Like a flat 1D vector for example – trialNerror Aug 18 '20 at 12:21

I have not used any of the libraries you mention, but if I had to guess, the libraries probably expect one contiguous array, not small segments of memory scattered around the heap.

So convert the std::vector<std::vector<double>> to a std::vector<double> and pass the vec.data() pointer to torch:

std::vector<double> linearize(const std::vector<std::vector<double>>& vec_vec) {
    std::vector<double> vec;
    for (const auto& v : vec_vec) {
        vec.insert(vec.end(), v.begin(), v.end());
    }
    return vec;
}
Mestkon