Scanning without sequences in theano? (emulating range())

Question

Theano newbie here. I am doing some experiments in order to generate variable-length sequences. I started with the simplest thing that came to mind: emulating range(). Here is the simple code I wrote:

from theano import scan
from theano import function
from theano import Param
from theano import tensor as T
from theano.scan_module import until

X = T.iscalar('X')
STEP = T.iscalar('STEP')
MAX_LENGTH = 1024  # or any other very large value

def fstep(i, x, t):
    n = i * t
    return n, until(n >= x)

t_fwd_range, _ = scan(
    fn=fstep,
    sequences=T.arange(MAX_LENGTH),
    non_sequences=[X, STEP]
)

getRange = function(
    inputs=[X, Param(STEP, 1, 'step')],
    outputs=t_fwd_range
)

x, step = 10, 3  # example values
f = getRange(x, step)
print(list(f))
assert list(f[:-1]) == list(range(0, x, step))

So I had to use MAX_LENGTH as the length of a range that serves as the input sequence of the scan over fstep. My main question is this: is there any way to use scan without an input sequence? And, as I suppose the answer is no, the next question is: is this the correct (most efficient, etc.) way to do what I am trying to do?

score 1 · Accepted Answer · edited Sep 30 '15 at 15:13

There is no need to provide an input sequence to scan. You can instead specify the number of iterations via scan's n_steps parameter. Optionally, you can also specify a condition under which the scan should stop early via theano.scan_module.until.

So Python's range function can be emulated using Theano's scan without specifying an input sequence by figuring out how many iterations would be required to construct the requested sequence.
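As a rough illustration of that step count (a plain-Python sketch, not part of Theano; the helper name range_length is made up for this example), the length of range(start, stop, step) is the ceiling of (stop - start) / step, clamped at zero:

```python
def range_length(start, stop, step):
    """Number of elements in range(start, stop, step).

    Hypothetical helper for illustration only; it is not a Theano function.
    """
    if step == 0:
        raise ValueError("step must not be zero")
    span = stop - start
    # -(-a // b) is ceiling division in pure integer arithmetic; it gives the
    # right answer for positive and negative steps alike.
    n = -(-span // step)
    # A negative count means the step points away from stop: empty range.
    return max(0, n)


count_up = range_length(0, 10, 3)      # elements 0, 3, 6, 9
count_down = range_length(10, 0, -3)   # elements 10, 7, 4, 1
count_empty = range_length(0, 10, -1)  # step points the wrong way
```

The scan-based implementation in this answer asks for one step fewer than this count, because it prepends the start value itself.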

Here's an implementation of the range function based on Theano's scan. The only complicated part is figuring out how many steps are required.

import numpy
import theano
import theano.tensor as tt
import theano.ifelse


def scan_range_step(x_tm1, step):
    return x_tm1 + step


def compile_theano_range():
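    symbolic_start = tt.lscalar()
    symbolic_stop = tt.lscalar()
    symbolic_step = tt.lscalar()
    # scan produces every element after the first, so it needs one step
    # fewer than the length of the range: ceil(|stop - start| / |step|) - 1.
    distance = tt.abs_(symbolic_stop - symbolic_start)
    stride = tt.cast(tt.abs_(symbolic_step), theano.config.floatX)
    n_steps = tt.cast(tt.ceil(distance / stride), 'int64') - 1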
    outputs, _ = theano.scan(scan_range_step, outputs_info=[symbolic_start], n_steps=n_steps,
                             non_sequences=[symbolic_step], strict=True)
    outputs = theano.ifelse.ifelse(tt.eq(n_steps, 0), tt.stack(symbolic_start), outputs)
    f = theano.function([symbolic_start, symbolic_stop, symbolic_step],
                        outputs=tt.concatenate([[symbolic_start], outputs]))

    def theano_range(start, stop=None, step=1):
        assert isinstance(start, int)
        assert isinstance(step, int)
        if step == 0:
            raise ValueError()
        if stop is None:
            stop = start
            start = 0
        else:
            assert isinstance(stop, int)
        if start == stop:
            return []
        if stop < start and step > 0:
            return []
        if stop > start and step < 0:
            return []
        return f(start, stop, step)

    return theano_range


def main():
    theano_range = compile_theano_range()
    python_range = range

    for start in [-10, -5, -1, 0, 1, 5, 10]:
        for stop in [-10, -5, -1, 0, 1, 5, 10]:
            for step in [-3, -2, -1, 1, 2, 3]:
                a = theano_range(start, stop, step)
                b = python_range(start, stop, step)
                assert numpy.all(numpy.equal(a, b)), (start, stop, step, a, b)


main()

Clearly this is a daft thing to do/use for real since Theano already provides a symbolic version of Python's range function, i.e. theano.tensor.arange. The built-in implementation is also far more efficient than our scan version because it uses a custom operation instead of scan.

As a rule of thumb: you have to set a maximum number of iteration steps, either via an input sequence (such as a range) or via the n_steps argument. You can set it to a very large number and then use theano.scan_module.until to stop the iteration earlier if your stop condition is met.
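To make the rule of thumb concrete, here is a plain-Python model of that control flow. It does not use Theano, and bounded_scan and make_step are hypothetical names for this sketch: loop at most n_steps times and stop as soon as the step function reports that the condition is met. As in the question's code, where the last element had to be dropped, the value produced by the stopping step is kept in the output.

```python
def bounded_scan(step_fn, initial, n_steps):
    """Plain-Python model of a loop with an iteration cap and an early stop.

    step_fn(prev) must return (value, stop_flag). This only imitates the
    control flow described above; it is not how Theano implements scan.
    """
    outputs = []
    prev = initial
    for _ in range(n_steps):
        prev, stop = step_fn(prev)
        outputs.append(prev)  # the value from the stopping step is kept
        if stop:
            break
    return outputs


def make_step(limit, step):
    """Build a step function that adds `step` and stops once `limit` is reached."""
    def step_fn(prev):
        nxt = prev + step
        return nxt, nxt >= limit
    return step_fn


# A generous cap; the stop condition ends the loop long before 1024 steps.
stopped_early = bounded_scan(make_step(10, 3), 0, 1024)
# → [3, 6, 9, 12]

# A small cap; the loop ends because the cap is hit first.
hit_cap = bounded_scan(make_step(10, 3), 0, 2)
# → [3, 6]
```

The cap only bounds the worst case: when the stop condition fires early, the remaining iterations are never run.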

Hi Daniel, thanks for the reply. Even without explicitly declaring a range, I still have to fix a certain number of steps, so I assume there is no way to implement a sort of `while` loop. I assume that this is for some sort of optimization: am I right? Thanks. — petrux, Sep 30 '15 at 15:00
Yes, it does seem as if one must specify either an input sequence or a number of steps. It doesn't seem possible to rely on only the `until` condition. However, there may be no disadvantage to specifying a very large `n_steps` and using an `until` condition to stop early. — Daniel Renshaw, Sep 30 '15 at 15:01
