I'm trying to vectorize/broadcast (I'm not sure what it is formally called) my code to make it faster, but I can't quite get it to work. I think I should be using numpy.cumsum (with axis=0), but I don't know how to get the data into the right array, quickly, in order to use it.

What I want is the sum of the absolute values of l1 after each element of l2 has been added, cumulatively, to every number in l1. So this gives not one answer but len(l2) answers. The (non-vectorized) code below gives the correct output.

    import numpy

    # l1 and l2 are numpy arrays; l1 is modified in place on every pass
    for i in l2:
        l1 += i
        answer = numpy.sum(numpy.absolute(l1))
        print(answer)

Can anyone provide an answer or hint?

Exclu

1 Answer

The trick is to first combine the two one-dimensional arrays into a single two-dimensional array, and then sum over that. If you broadcast an array of shape (a,1) against an array of shape (b,), the resulting array has shape (a,b). Adding extra axes of length one to an array is a handy way to get this sort of behavior.
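As a small illustration of that shape rule, here is a minimal sketch. The arrays `a` and `b` are made up for the example and do not come from the question.

```python
import numpy as np

a = np.arange(3).reshape(3, 1)  # shape (3, 1): one column
b = np.arange(4)                # shape (4,): treated as one row
c = a + b                       # every a[i] is paired with every b[j]

print(a.shape, b.shape, c.shape)
```

Each entry `c[i, j]` equals `a[i, 0] + b[j]`, which is the pairing the answer relies on.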

Here's a way to get the same answer without loops:

    import numpy as np

    # Assume l1 has length n1, l2 has length n2
    suml2 = np.cumsum(l2)  # running total of l2, length n2
    y = l1 + suml2[:, np.newaxis]  # shape (n2, n1)
    answer = np.sum(np.abs(y), axis=1)  # shape (n2,)
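To see that this matches the loop from the question, the following sketch runs both on a tiny input. The sample values are invented for the check, and the loop works on a copy (`work`, a name introduced here) so that `l1` is not modified in place.

```python
import numpy as np

# Made-up sample data, only for comparing the two versions
l1 = np.array([-3, 1, 4])
l2 = np.array([2, -5, 1])

# Loop version from the question, applied to a copy of l1
work = l1.copy()
loop_answers = []
for i in l2:
    work += i
    loop_answers.append(int(np.sum(np.absolute(work))))

# Broadcast version
suml2 = np.cumsum(l2)
y = l1 + suml2[:, np.newaxis]
answer = np.sum(np.abs(y), axis=1)

print(loop_answers)
print(answer)
```

Note that `y` holds n2 * n1 values at once, so this form trades memory for the absence of a Python loop.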
clwainwright
  • It works, but it is slightly slower, and it gives me a memory error with large inputs (l1 and l2 each have about 100,000 elements). The memory error comes from `y = l1 + suml2[:,np.newaxis]`, which I assume creates a 100,000 by 100,000 matrix; that probably explains it (I use 64-bit Python). – Exclu Oct 20 '15 at 19:28
  • You might want to look into Cython if this is a major bottleneck for you. Note that if you don't actually need the `abs` in there you could use algebra to rewrite it as `answer = np.sum(l1) + np.cumsum(l2)`. Numpy tends to be a memory hog when doing this sort of broadcasting and reduction. – clwainwright Oct 20 '15 at 19:55
  • That shorter expression (without `abs`) is actually (by my tests) `np.sum(l1) + len(l1)*np.cumsum(l2)`. In any case, without `abs` it's a simple linear calculation. – hpaulj Oct 20 '15 at 20:36
  • Thanks guys! I have just found out, however, that vectorizing this is not a good idea. Adding an integer to the list is not an expensive operation, so vectorizing it is not worth it. I have a different (and much faster and more memory-efficient) method now :) – Exclu Oct 21 '15 at 19:19
  • http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html The first example there shows clearly that the code above is in fact already broadcast. – Exclu Oct 21 '15 at 19:21
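
The thread does not say which method Exclu ended up using. As one possible memory-light alternative, the sketch below first checks hpaulj's expression for the case without `abs`, then handles the `abs` case by sorting `l1` and using prefix sums with `np.searchsorted`. The helper name `abs_sums_sorted` and the sample data are assumptions made for this example, not something taken from the thread.

```python
import numpy as np

def abs_sums_sorted(l1, l2):
    """Hypothetical helper: sum(|l1 + c|) for every running total c of l2."""
    s = np.sort(l1)
    n = s.size
    # prefix[k] is the sum of the k smallest values of l1
    prefix = np.concatenate(([0], np.cumsum(s)))
    c = np.cumsum(l2)
    # k[i] counts the elements for which s[j] + c[i] is negative
    k = np.searchsorted(s, -c, side="left")
    # The negative part contributes -(prefix[k] + k*c); the remaining part
    # contributes (prefix[n] - prefix[k]) + (n - k)*c.
    return prefix[n] - 2 * prefix[k] + (n - 2 * k) * c

# Made-up sample data
l1 = np.array([-3, 1, 4])
l2 = np.array([2, -5, 1])

no_abs = np.sum(l1) + len(l1) * np.cumsum(l2)  # hpaulj's expression
with_abs = abs_sums_sorted(l1, l2)

print(no_abs)
print(with_abs)
```

This needs only arrays of length n1 and n2 rather than an n2 by n1 matrix, at the cost of one sort of `l1`.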