
I was reading the code for the logistic regression tutorial at http://deeplearning.net/tutorial/logreg.html. I am confused about the difference between the inputs and givens arguments of a Theano function. The functions that compute the mistakes made by the model on a minibatch are:

test_model = theano.function(inputs=[index],
        outputs=classifier.errors(y),
        givens={
            x: test_set_x[index * batch_size: (index + 1) * batch_size],
            y: test_set_y[index * batch_size: (index + 1) * batch_size]})

validate_model = theano.function(inputs=[index],
        outputs=classifier.errors(y),
        givens={
            x: valid_set_x[index * batch_size: (index + 1) * batch_size],
            y: valid_set_y[index * batch_size: (index + 1) * batch_size]})

Why couldn't/wouldn't one just make x and y shared input variables and let them be defined when an actual model instance is created?

Franck Dernoncourt
user1245262

2 Answers


The givens parameter allows you to separate the description of the model from the exact definition of its inputs. This follows from what givens does: it modifies the graph before it is compiled. In other words, each key in givens is substituted in the graph with its associated value.

In the deep learning tutorial, we use a normal Theano variable to build the model, and we use givens to speed up computation on the GPU. If we kept the dataset on the CPU, we would transfer a mini-batch to the GPU at each function call. As we do many iterations over the dataset, we would end up transferring the dataset to the GPU many times. Since the dataset is small enough to fit on the GPU, we put it in a shared variable so that it is transferred to the GPU once if one is available (or stays on the CPU if the GPU is disabled). Then, when compiling the function, we swap the input with a slice corresponding to the mini-batch of the dataset to use, so the input of the Theano function is just the index of the mini-batch we want.

Franck Dernoncourt
nouiz
  • this is very helpful and clarifies the ideas i had of these concepts, thanks! – eickenberg Nov 13 '14 at 07:27
  • @nouiz - Thanks, let me see if I understand: 1. Use of 'given' is to improve Theano's memory management between CPU & GPU(s). 2. A 'given' will create(?) a shared variable *only if* it can be put into a GPU. 3. Unlike shared variables, given variables will not maintain state - I cannot create updates for them within the function & if they change outside of the function, there will be no effect on subsequent calls to that function....... Have I committed any errors of commission or omission?... Thx. (Sorry - accidentally entered 1st comment & then spent too long editing) – user1245262 Nov 13 '14 at 14:33
  • 1) No. It is the shared variable that improves GPU usage, by doing fewer transfers. 2) No. givens does not create shared variables. The user must create them himself. Givens are not Theano variables and do not create Theano variables. They are there to modify the Theano graph defined by the inputs and outputs parameters passed to theano.function(). This comes from a functional programming principle: a program is just data like other data. You can change your program; givens allows you to do this. – nouiz Nov 13 '14 at 14:57
  • @nouiz - OK - I was just noticing that the data sets were created as shared variables, so I was obviously wrong. Where can I read about how 'given' modifies the graph (I'll be searching the docs)? Right now, my only thought is that it signals Theano to build the graph so that the given variables are pushed to GPU(s) as possible. – user1245262 Nov 13 '14 at 15:02
  • @nouiz - OK. I think I found the answer here: https://groups.google.com/forum/#!topic/theano-users/wTYk_F-skW0 ... Shared variables are stored/used on the GPU. By declaring shared variables to be 'given' they will also be kept there. This way each call to the function will not require that the variables be restored to the GPU – user1245262 Nov 13 '14 at 15:09
  • You are closer, but not completely right. You do not need to use givens to store data on the GPU. The CPU vs GPU question is independent of givens. What you describe only applies to shared variables. givens is a way to modify the graph. The original graph wasn't built on the shared variables; if it had been, givens wouldn't be needed. So why do we use givens? Just to separate the building of the graph from the handling of the dataset. – nouiz Nov 13 '14 at 18:06

I don't think anything is stopping you from doing it that way (I didn't try the updates= dictionary with an input variable directly, but why not). Note, however, that to push data to a GPU efficiently, it needs to live in a shared variable (from which x and y are sliced in this example).

eickenberg