Is the use of 'givens' really necessary in the deeplearning tutorials?

Question

In the deep learning tutorials, all training data is stored in a shared array and only an index into that array is passed to the training function to slice out a minibatch. I understand that this allows the data to be left in GPU memory, as opposed to passing small chunks of data as a parameter to the training function for each minibatch. In some previous questions, this was given as an answer as to why the givens mechanism is used in the tutorials.

I don't yet see the connection between these two concepts, so I'm probably missing out on something essential. As far as I understand, the givens mechanism swaps out a variable in the graph with a given symbolic expression (i.e., some given subgraph is inserted in place of that variable). Then why not define the computational graph the way we need it in the first place?

Here is a minimal example. I define a shared variable X and an integer index, and I either create a graph that already contains the slicing operation, or I create one where the slicing operation is inserted post-hoc via givens. By all appearances, the two resulting functions get_nogivens and get_tutorial are identical (see the debugprints at the end).

But then why do the tutorials use the givens pattern?

import numpy as np
import theano
import theano.tensor as T

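# All data lives in one shared variable; only a scalar index is passed per call.
X = theano.shared(np.arange(100), borrow=True, name='X')
index = T.scalar(dtype='int32', name='index')
X_slice = X[index:index + 5]

# Tutorial style: output X itself and let givens substitute the slice for X.
get_tutorial = theano.function([index], X, givens={X: X[index:index + 5]}, mode='DebugMode')
# Direct style: the slicing operation is already part of the graph.
get_nogivens = theano.function([index], X_slice, mode='DebugMode')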



> theano.printing.debugprint(get_tutorial)
DeepCopyOp [@A] ''   4
 |Subtensor{int32:int32:} [@B] ''   3
   |X [@C]
   |ScalarFromTensor [@D] ''   0
   | |index [@E]
   |ScalarFromTensor [@F] ''   2
     |Elemwise{add,no_inplace} [@G] ''   1
       |TensorConstant{5} [@H]
       |index [@E]

> theano.printing.debugprint(get_nogivens)
DeepCopyOp [@A] ''   4
 |Subtensor{int32:int32:} [@B] ''   3
   |X [@C]
   |ScalarFromTensor [@D] ''   0
   | |index [@E]
   |ScalarFromTensor [@F] ''   2
     |Elemwise{add,no_inplace} [@G] ''   1
       |TensorConstant{5} [@H]
       |index [@E]

score 0 · Answer 1 · answered Sep 25 '16 at 11:44

They use givens here only to decouple the actual data that is passed to the graph from the symbolic input variable. You could instead build the graph directly on X[index * batch_size: (index + 1) * batch_size], but that is just a little messier.

justanothercoder

