
I have a simple but large data file. It's output from a neural network simulation. The first column is a time step, 1..200. The second is the target word (for the current simulation, 1..212). Then there are 212 columns, one for each word. That is, each row has the activation values of each word node at a particular time step given a particular target (input) word.

I need to do simple operations, such as converting each activation to a response strength (exp(constant x activation)) and then dividing each response strength by the row sum of response strengths. Doing this in R is very slow (20 minutes), and conventional looping in Perl is faster but still slow (7 minutes), which matters because later simulations will involve thousands of words.

It seems like PDL should be able to do this much more quickly. I've been reading the PDL documentation, but I'm really at a loss for how to do the second step. The first one seems as easy as selecting just the activation columns and putting them in $act and then:

$rp = exp($act * $k);

But I can't figure out how to then divide each value by its row sum. Any advice would be appreciated.

toolic
user20412

2 Answers


As is often the case in PDL, a good answer to this involves slicing and indices.

use PDL;
use PDL::NiceSlice; # enables the $data(...) slicing syntax used below

$k = 0.7; # made-up value
$data = zeroes 214,200;
$data((0)) .= sequence(200) + 1; # column 0=1..200
$data((1)) .= indx(zeroes(200)->random*212) + 1; # column 1 randomly 1..212
$data(2:-1)->inplace->random; # rest of columns random values for this demo
$indices = ($data(1)+1)->append($data((0))->sequence->transpose); # indices are [column 1 value + 1,row index]
$act = $data->indexND($indices); # vector of the target word's activation in each row
$rp = exp($act * $k);
$rp /= exp($data(2:-1) * $k)->sumover; # divide by the sum of each row's response strengths
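
If the goal is to normalise every word node in a row rather than only the target word, a minimal sketch might look like the following. It assumes the 212 activation columns have already been pulled into their own ndarray; here random stand-in data and the made-up constant $k take their place, and the names $sums and $prob are only illustrative.

```perl
use strict;
use warnings;
use PDL;

my $k   = 0.7;                       # made-up constant, as in the demo above
my $act = random(212, 200);          # stand-in activations: 212 words (dim 0) x 200 rows (dim 1)

my $rp   = exp($act * $k);           # response strength of every word node
my $sums = $rp->sumover;             # one sum per row, shape (200)
my $prob = $rp / $sums->dummy(0);    # dummy(0) reshapes the sums to (1,200) so they broadcast across each row

print $prob->sumover->slice('0:4'), "\n";   # every row of $prob should now sum to 1
```

The whole table is processed in three vectorised statements, so there is no per-element Perl loop to slow things down.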
Ed.

It looks like you need to make a copy of the matrix, then read from the original and write to the copy. Note: a C-style loop using $c++ instead of the foreach-style for loop might be more efficient.

$x = sequence(3,3)*2+1;
 [ 1  3  5]
 [ 7  9 11]
 [13 15 17]
$y = $x->copy; # a plain = would make $y an alias, so changes to $y would also change $x
for $c(0..2) { for $d(0..2) {  $y($c,$d) .= $y($c,$d) / sum($x(,$d))  }} 
p $y;
  [0.11111111 0.33333333 0.55555556]
  [0.25925926 0.33333333 0.40740741]
  [0.28888889 0.33333333 0.37777778]
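
For comparison, the same row normalisation can probably be written without any explicit loop. This is only a sketch of one alternative, not part of the answer as posted; it reuses the 3x3 example and relies on sumover and dummy from core PDL.

```perl
use strict;
use warnings;
use PDL;

my $x = sequence(3, 3) * 2 + 1;        # same 3x3 example: rows 1 3 5 / 7 9 11 / 13 15 17
my $y = $x / $x->sumover->dummy(0);    # divide every element by the sum of its own row
print $y;                              # should show the same values as the loop version
```

Because the division creates a new ndarray, $x is left untouched and no separate copy is needed.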
Mark Baker