Let's say I create a Theano function. How do I run element-wise operations in parallel on Theano tensors, such as matrices?
# This runs inside a Theano function. Instead of the for loop, I'd like to run this in parallel.
import numpy as np
c = np.zeros((2, 200))
for n in range(0, 20):
    # An arbitrary example loop body; the exact values don't matter
    c[0][n] = n % 20
    c[1][n] = n // 20  # integer division
# In CUDA, we would normally use an if statement:
# if (threadIdx.x == some_index) { c[0][n] = some_value; }
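To make the goal concrete, here is a minimal NumPy sketch (not Theano code) of the same loop written as whole-array operations, which is the form I assume an element-wise parallel version would have to take. The helper names `fill_loop` and `fill_vectorized` are made up for illustration, and the integer division is my assumption about the intended behavior.

```python
import numpy as np

def fill_loop(num_cols=200, count=20):
    # Reference version: fill the first `count` columns one at a time.
    c = np.zeros((2, num_cols))
    for n in range(count):
        c[0][n] = n % 20
        c[1][n] = n // 20
    return c

def fill_vectorized(num_cols=200, count=20):
    # Same result, but each row is filled by one element-wise expression
    # over an index vector instead of a Python-level loop.
    c = np.zeros((2, num_cols))
    n = np.arange(count)
    c[0, :count] = n % 20
    c[1, :count] = n // 20
    return c

# Both versions should produce identical arrays.
print(np.array_equal(fill_loop(), fill_vectorized()))  # prints True
```

What I am after is the Theano equivalent of `fill_vectorized`, where each element is computed independently and so could, in principle, be handled by a separate thread.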
To rephrase the question: how do I perform parallel operations inside a Theano function? I've looked at http://deeplearning.net/software/theano/tutorial/multi_cores.html#parallel-element-wise-ops-with-openmp, which only describes enabling a setting but does not explain how element-wise operations are actually parallelized.