I'm new to mshadow and can't understand why I get this output from the following code snippet:

TensorContainer<cpu, 2> lhs(Shape2(2, 3));
lhs = 1.0;
printf("%u %u\n", lhs.size(0), lhs.size(1));
printf("%u %u\n", lhs[0].shape_[0], lhs[0].shape_[1]);
printf("%u %u\n", lhs[0].size(0), lhs[0].size(1));

The output is:

2 3
3 4
3 3

Why are the second and third outputs those numbers? Since lhs[0] is one-dimensional, I would expect both lines to be identical, i.e. 3 0. Could anyone tell me where I went wrong? Thanks in advance!

ROBOT AI
1 Answer

You are right that lhs[0] is a one-dimensional Tensor, but to answer your question, let me first show what is going on under the hood. TensorContainer does not override operator[]; it uses the one from its parent class, Tensor. More precisely, the following one is called:

  MSHADOW_XINLINE Tensor<Device, kSubdim, DType> operator[](index_t idx) const {
    return Tensor<Device, kSubdim, DType>(dptr_ + this->MemSize<1>() * idx,
                                          shape_.SubShape(), stride_, stream_);
  }

As can be seen, it creates a new Tensor on the stack. In most cases the result is a generic N-dimensional Tensor, but when the result is 1-dimensional, as it is here, it creates the special 1-dimensional Tensor.

Now that we have established what exactly operator[] returns, let's look at the fields of that class:

  DType *dptr_;
  Shape<1> shape_;
  index_t stride_;

As can be seen, shape_ here has only one dimension, so there is no shape_[1]. Reading shape_[1] goes past the end of the shape and returns stride_ (or part of it) instead. Here is a modification of the Tensor constructor that you can run to see what is actually going on:

  MSHADOW_XINLINE Tensor(DType *dptr, Shape<1> shape,
                         index_t stride, Stream<Device> *stream)
      : dptr_(dptr), shape_(shape), stride_(stride), stream_(stream) {
     std::cout << "shape[0]: " << shape[0] << std::endl; // 3
     std::cout << "shape[1]: " << shape[1] << std::endl; // 0, as expected
     std::cout << "_shape[0]: " << shape_[0] << std::endl; // 3, as expected
     std::cout << "_shape[1]: " << shape_[1] << std::endl; // garbage (4)
     std::cout << "address of _shape[1]: " << &(shape_[1]) << std::endl;
     std::cout << "address of stride: " << &(stride_) << std::endl;
  }

and the output:

shape[0]: 3
shape[1]: 0
_shape[0]: 3
_shape[1]: 4
address of _shape[1]: 0x7fffa28ec44c
address of stride: 0x7fffa28ec44c

shape_[1] (printed as _shape[1]) and stride_ have the same address (0x7fffa28ec44c), so the 4 printed for lhs[0].shape_[1] is really the value of stride_.
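
The overlap can also be checked without performing the out-of-bounds read itself, which is undefined behavior in C++. The following is a minimal sketch, not mshadow code: Shape1 and Tensor1D are hypothetical stand-ins that only mirror the three fields listed above, and index_t is assumed to be unsigned (consistent with the %u format used in the question). It compares the byte offset where a second shape element would sit with the offset of stride_; on typical ABIs the two are equal because no padding separates them.

```cpp
#include <cstddef>

// Assumption: index_t is unsigned, matching the %u format in the question.
typedef unsigned index_t;

// Hypothetical stand-in for Shape<1>: storage for exactly one dimension.
struct Shape1 {
  index_t shape_[1];
};

// Hypothetical stand-in for the 1-dimensional Tensor, with the same
// field order as shown above: dptr_, shape_, stride_.
struct Tensor1D {
  float *dptr_;
  Shape1 shape_;
  index_t stride_;
};

// Byte offset at which shape_[1] would live if it existed:
// one element past the start of the shape storage.
std::size_t OffsetOfShapeSlot1() {
  return offsetof(Tensor1D, shape_) + sizeof(index_t);
}

// Byte offset of the stride_ member.
std::size_t OffsetOfStride() {
  return offsetof(Tensor1D, stride_);
}
```

If OffsetOfShapeSlot1() equals OffsetOfStride(), an out-of-range index of 1 into shape_ lands exactly on stride_, which matches the identical addresses printed above. To get the length of a 1-dimensional Tensor, index only dimension 0.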