
theano.scan returns two variables: a values variable and an updates variable. For example,

a = theano.shared(1)

values, updates = theano.scan(fn=lambda a:a+1, outputs_info=a,  n_steps=10)

However, I notice that in most of the examples I work with, the updates variable is empty. It seems we only get updates when the function passed to theano.scan is written in a certain way. For example,

a = theano.shared(1)

values, updates = theano.scan(lambda: {a: a+1}, n_steps=10)

Can someone explain why updates is empty in the first example but not in the second? More generally, how does the updates variable in theano.scan work? Thanks.

marmeladze
user5016984

2 Answers


Consider the following four variations (the code can be executed to observe the differences) and the analysis below.

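from __future__ import print_function  # print() then behaves the same on Python 2 and 3

import theano


def v1a():
    # Step returns a symbolic output; updates are NOT passed to theano.function.
    a = theano.shared(1)
    outputs, updates = theano.scan(lambda x: x + 1, outputs_info=a, n_steps=10)
    f = theano.function([], outputs=outputs)
    print(f(), a.get_value())


def v1b():
    # Step returns a symbolic output; the (empty) updates ARE passed to theano.function.
    a = theano.shared(1)
    outputs, updates = theano.scan(lambda x: x + 1, outputs_info=a, n_steps=10)
    f = theano.function([], outputs=outputs, updates=updates)
    print(f(), a.get_value())


def v2a():
    # Step returns an update dictionary; updates are NOT passed to theano.function.
    a = theano.shared(1)
    outputs, updates = theano.scan(lambda: {a: a + 1}, n_steps=10)
    f = theano.function([], outputs=outputs)
    print(f(), a.get_value())


def v2b():
    # Step returns an update dictionary; updates ARE passed to theano.function.
    a = theano.shared(1)
    outputs, updates = theano.scan(lambda: {a: a + 1}, n_steps=10)
    f = theano.function([], outputs=outputs, updates=updates)
    print(f(), a.get_value())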


def main():
    v1a()
    v1b()
    v2a()
    v2b()


main()

The output of this code is

[ 2  3  4  5  6  7  8  9 10 11] 1
[ 2  3  4  5  6  7  8  9 10 11] 1
[] 1
[] 11

The v1x variations use lambda x: x + 1. The result of the lambda function is a symbolic variable whose value is 1 greater than the input. The name of the lambda function's parameter has been changed to avoid shadowing the shared variable name. In these variations the scan does not use or manipulate the shared variable in any way, other than taking it as the initial value of the recurrent symbolic variable that the step function increments.

The v2x variations use lambda: {a: a + 1}. The result of the lambda function is a dictionary that describes how to update the shared variable a.

The updates from the v1x variations are empty because the step function does not return a dictionary defining any shared variable updates. The outputs from the v2x variations are empty because the step function does not return any symbolic output. updates is only useful if the step function returns a dictionary of shared variable update expressions (as in v2x), and outputs is only useful if the step function returns a symbolic output variable (as in v1x).

A returned dictionary has no effect unless it is passed to theano.function through its updates argument. Note that the shared variable has not been updated in v2a but it has been updated in v2b.
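To make the split between outputs and updates concrete without needing Theano installed, here is a rough pure-Python sketch. It is not Theano's implementation: Shared, toy_scan, toy_function and the read helper are hypothetical names invented for this illustration, and everything is computed eagerly rather than symbolically (so, unlike Theano, calling the toy function a second time would not increment the value again). It only mirrors the behaviour seen in the four variations above.

```python
class Shared(object):
    """Hypothetical stand-in for a Theano shared variable (just holds a value)."""

    def __init__(self, value):
        self.value = value


def toy_scan(step, n_steps, init=None):
    """Toy loop returning (outputs, updates); it never writes to a Shared itself."""
    outputs = []
    updates = {}  # Shared -> pending new value, not applied yet
    carry = init

    def read(shared):
        # Later steps see the pending value produced by earlier steps.
        return updates.get(shared, shared.value)

    for _ in range(n_steps):
        result = step(read) if init is None else step(read, carry)
        if isinstance(result, dict):
            updates.update(result)   # step described updates only
        else:
            carry = result           # step produced an output, fed to the next step
            outputs.append(result)
    return outputs, updates


def toy_function(outputs, updates=None):
    """Toy compiled function: pending updates are applied only if they were passed in."""
    def call():
        for shared, new_value in (updates or {}).items():
            shared.value = new_value
        return outputs
    return call


# v1-style step: returns an output, so updates stays empty.
a = Shared(1)
v1_outputs, v1_updates = toy_scan(lambda read, x: x + 1, n_steps=10, init=a.value)

# v2-style step: returns a dictionary, so outputs stays empty.
b = Shared(1)
v2_outputs, v2_updates = toy_scan(lambda read: {b: read(b) + 1}, n_steps=10)

toy_function(v2_outputs)()              # like v2a: updates not passed
value_without_updates = b.value
toy_function(v2_outputs, v2_updates)()  # like v2b: updates passed
value_with_updates = b.value

print(v1_outputs, v1_updates)
print(v2_outputs, value_without_updates, value_with_updates)
```

The point of the sketch is that the loop only collects a description of the updates; nothing changes the shared value until that description is handed to the function that gets called.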

Daniel Renshaw
  • Thank you very much for the clear explanation. I just have one more thing to clarify: is it correct to say that in order for `theano.scan` to return non-empty updates, the lambda function has to have a dictionary of shared variable updates in the output? Thanks. – user5016984 Oct 07 '15 at 17:32
  • Is there a way to update multiple shared variables but in a defined order? – Stefan Falk Oct 23 '16 at 10:01
  • What I mean is I want to update e.g. three shared variables in one `fn`. Is this possible? I'm struggling for days now to get this running .. – Stefan Falk Oct 23 '16 at 10:04
  • @displayname I've not worked with Theano for a while so I may be rusty. You might be best asking a new question on SO. However, if I remember correctly, the order of the updates does not matter. If you want one update to affect another update then the second update needs to include the first update. They can then be passed to the function updates in any order. – Daniel Renshaw Oct 23 '16 at 10:56
  • @DanielRenshaw Well, I think I got the answer below by Guillaume. However, what I'm not so sure about is how to minimize the amount of memory write operations here ^^ I am implementing an RBM - or at least I try to - but I have a hard time understanding what Theano wants from me ^^ – Stefan Falk Oct 23 '16 at 11:03
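
The point made in the comments about update order can be sketched in plain Python. This is only a model of the behaviour described there, under the assumption that every update expression is evaluated against the pre-update values and all results are then written together; apply_updates and the dictionary-based state are hypothetical and are not part of Theano.

```python
def apply_updates(state, updates):
    """Evaluate every update against the OLD state, then write all results at once."""
    new_values = dict((name, expr(state)) for name, expr in updates)
    result = dict(state)
    result.update(new_values)
    return result


state = {"a": 1, "b": 0}


def new_a(s):
    # Update expression for a, written in terms of the old state.
    return s["a"] + 1


# b is meant to depend on the UPDATED a, so its expression includes new_a.
updates = [("a", new_a), ("b", lambda s: new_a(s) * 2)]

forward = apply_updates(state, updates)
backward = apply_updates(state, list(reversed(updates)))

# If b's expression reads a directly, it sees the old value of a instead.
naive = apply_updates(state, [("a", new_a), ("b", lambda s: s["a"] * 2)])

print(forward, backward, naive)
```

Under that assumption, forward and backward are identical, which is why the order in which the pairs are passed does not matter, while naive shows what happens when the second update does not include the first.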

To complement Daniel's answer, if you want to compute outputs and updates in theano scan at the same time, look at this example.

This code loops over a sequence, computing the running sum of its elements and updating a shared variable t, which ends up holding the length of the sequence.

import theano
import numpy as np

t = theano.shared(0)
s = theano.tensor.vector('v')

def rec(s, first, t):
    first = s + first
    second = s
    return (first, second), {t: t+1}

first = np.float32(0)

(firsts, seconds), updates = theano.scan(
    fn=rec,
    sequences=s,
    outputs_info=[first, None],
    non_sequences=t)

f = theano.function([s], [firsts, seconds], updates=updates, allow_input_downcast=True)

v = np.arange(10)

print(f(v))
print(t.get_value())

The output of this code is

[array([  0.,   1.,   3.,   6.,  10.,  15.,  21.,  28.,  36.,  45.], dtype=float32), 
array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.], dtype=float32)]
10

The rec function returns a tuple and a dictionary. Scanning over a sequence both computes the outputs and adds the dictionary to the updates, so you can create a function that updates t and computes firsts and seconds at the same time.

LeCodeDuGui