I'm trying to find the best way to compute the cumulative product along axis 0 (down the columns) of a Gorgonia tensor. For example:
import ts "gorgonia.org/tensor"

T := ts.New(ts.WithShape(3, 4), ts.WithBacking(ts.Range(ts.Float32, 1, 13)))
for row := 1; row < 3; row++ {
    for col := 0; col < 4; col++ {
        // At returns (interface{}, error), so assert to float32 after the call.
        v1, _ := T.At(row-1, col)
        v2, _ := T.At(row, col)
        newVal := v1.(float32) * v2.(float32)
        T.SetAt(newVal, row, col)
    }
}
I want the output to be:
[1 2 3 4 ]
[5 12 21 32 ]
[45 120 231 384]
This works, but for a modest-sized tensor (500x50) it takes about 2500 microseconds per iteration, and I need it to be much faster. NumPy's .cumprod() takes about 10-20 microseconds for a similar computation. Is there a more efficient way to code this?
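For context, the per-iteration number comes from timing the nested loop on a 500x50 tensor. A rough benchmark sketch (hypothetical, not my exact harness) placed in a _test.go file would look like this:

package main

import (
    "testing"

    ts "gorgonia.org/tensor"
)

// BenchmarkCumprodRows times the At/SetAt loop from the question on a
// 500x50 float32 tensor. The products overflow to +Inf quickly, which is
// fine here since only the loop cost is being measured.
func BenchmarkCumprodRows(b *testing.B) {
    T := ts.New(ts.WithShape(500, 50), ts.WithBacking(ts.Range(ts.Float32, 1, 500*50+1)))
    rows, cols := T.Shape()[0], T.Shape()[1]
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        for row := 1; row < rows; row++ {
            for col := 0; col < cols; col++ {
                v1, _ := T.At(row-1, col)
                v2, _ := T.At(row, col)
                T.SetAt(v1.(float32)*v2.(float32), row, col)
            }
        }
    }
}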
I'm using tensors because I need to do matrix multiplication, and I want float32 for memory reasons. If necessary I could switch to gonum matrices, but since gonum's mat package only works with float64, I'd rather not pay the extra memory if there is a way to do this with float32 tensors.
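One direction I've considered, though I'm not sure it's idiomatic, is to skip At/SetAt entirely and run the cumulative product over the flat backing slice that Data() exposes. A minimal sketch, assuming a default row-major, non-view *tensor.Dense (so one row's stride equals the number of columns):

package main

import (
    "fmt"

    ts "gorgonia.org/tensor"
)

func main() {
    T := ts.New(ts.WithShape(3, 4), ts.WithBacking(ts.Range(ts.Float32, 1, 13)))

    // For a plain row-major *Dense, Data() returns the flat backing slice,
    // laid out row after row.
    data := T.Data().([]float32)
    cols := T.Shape()[1]

    // Column-wise cumulative product in a single pass: each element is
    // multiplied by the already-accumulated element one row above it.
    for i := cols; i < len(data); i++ {
        data[i] *= data[i-cols]
    }

    fmt.Println(T)
}

If that is safe for a Dense that isn't a view, it avoids the per-element interface{} boxing and error returns of At/SetAt, which is where I suspect most of the time goes; but I'd like to know whether Gorgonia has a built-in or more idiomatic way to do this.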