Suppose I have a PyTorch Variable on the GPU:

import torch
from torch.autograd import Variable

var = Variable(torch.rand((100, 100, 100))).cuda()

What's the best way to copy (not bridge) this variable to a NumPy array?

var.clone().data.cpu().numpy()

or

var.data.cpu().numpy().copy()

In a quick benchmark, .clone() was slightly faster than .copy(). However, .clone() + .numpy() will create a PyTorch Variable plus a NumPy bridge, while .copy() will create a NumPy bridge plus a NumPy array.
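
A benchmark along these lines can be reproduced with timeit (a minimal sketch; the loop count is arbitrary and absolute numbers will vary by GPU and PyTorch version):

import timeit

import torch
from torch.autograd import Variable

var = Variable(torch.rand((100, 100, 100))).cuda()

# .cpu() blocks until the device-to-host copy finishes, so no explicit
# torch.cuda.synchronize() is needed around these timings.
t_clone = timeit.timeit(lambda: var.clone().data.cpu().numpy(), number=100)
t_copy = timeit.timeit(lambda: var.data.cpu().numpy().copy(), number=100)
print('clone: %.4f s, copy: %.4f s' % (t_clone, t_copy))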

Fábio Perez

2 Answers

This is a very interesting question. In my view it is a little bit opinion-based, and I would like to share my opinion on it.

Of the two approaches, I would prefer the first one (using clone()). Since your goal is to copy information, you essentially need to invest extra memory either way. clone() and copy() should take a similar amount of storage, since creating a NumPy bridge doesn't allocate extra memory. Also, I didn't understand what you meant by copy() creating a NumPy bridge plus a NumPy array. And since, as you mentioned, clone() is faster than copy(), I don't see any other problem with using clone().
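
As a quick illustration of why the bridge itself costs no extra memory (a minimal sketch on CPU tensors; the variable names bridge and independent are mine):

import torch

t = torch.rand(3)

bridge = t.numpy()              # NumPy bridge: shares t's storage
t[0] = 42.0
print(bridge[0])                # 42.0 -- the bridge sees the change

independent = t.numpy().copy()  # explicit copy: its own storage
t[1] = -1.0
print(independent[1])           # old value -- the copy is unaffected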

I would be happy to reconsider this if anyone can provide counter-arguments.

Wasi Ahmad

Because clone() is recorded by autograd (AD), the second option is less intensive. There are a few other options you may also consider, for instance the sketch below.
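
On the modern tensor API, detach() sidesteps the autograd recording entirely (a minimal sketch; the variable names are mine):

import torch

var = torch.rand(100, 100, 100, device='cuda', requires_grad=True)

# detach() leaves the autograd graph (nothing is recorded), and
# .cpu() copies device memory into a fresh host tensor, so the
# final .numpy() bridge never aliases var's GPU storage.
arr = var.detach().cpu().numpy()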

prosti