The "vectorizing" of fancy indexing by Python's NumPy library sometimes gives unexpected results. For example:

import numpy
a = numpy.zeros((1000, 4), dtype='uint32')
b = numpy.zeros((1000, 4), dtype='uint32')
i = numpy.random.randint(0, 1000, 1000)  # random_integers(0, 999, 1000) is deprecated
j = numpy.random.randint(0, 4, 1000)

a[i, j] += 1
for k in range(1000):  # xrange in Python 2
    b[i[k], j[k]] += 1

This gives different results in the arrays 'a' and 'b': each index pair (i, j) that appears is counted as 1 in 'a' regardless of repeats, whereas repeats are accumulated in 'b'. This is easily verified as follows:

>>> numpy.sum(a)
883
>>> numpy.sum(b)
1000

It is also notable that the fancy indexing version is almost two orders of magnitude faster than the for loop. My question is: is there an efficient way for NumPy to compute the repeat counts as implemented using the for loop in the provided example?
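The difference can be reproduced deterministically with repeated index pairs (a minimal sketch; the small shapes and index values here are illustrative, not from the original post):

```python
import numpy as np

a = np.zeros((3, 2), dtype='uint32')
b = np.zeros((3, 2), dtype='uint32')
i = np.array([0, 0, 2])  # the pair (0, 1) occurs twice
j = np.array([1, 1, 0])

a[i, j] += 1          # buffered: repeated (0, 1) counted once, a[0, 1] == 1
for k in range(3):    # sequential: repeated (0, 1) counted twice, b[0, 1] == 2
    b[i[k], j[k]] += 1
```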

Dason
  • semantics are different: for-loop works in sequence (values for repeated indexes are incremented multiple times), but `a += 1` is equivalent to `a = a + 1` (increments once for repeated indexes). – jfs Jun 12 '12 at 17:26
  • I know there is a difference in semantics (I should not have said 'unexpected' in my original post), but is there an efficient way to compute the for-loop semantics (i.e. count occurrences of (i,j) tuples) using NumPy? – user1451766 Jun 12 '12 at 17:38
  • A lazy option is to use Cython. It might be both faster and more readable than the numpy function-based version. Though the latter forces one to think about the operation in more general terms (it might be a good thing) and the resulting solution could be more extendable, e.g. http://stackoverflow.com/questions/4962606/fast-tensor-rotation-with-numpy – jfs Jun 12 '12 at 18:35

1 Answer

This should do what you want:

np.bincount(np.ravel_multi_index((i, j), (1000, 4)), minlength=4000).reshape(1000, 4)

As a breakdown: ravel_multi_index converts the index pairs specified by i and j to integer indices into a C-flattened array; bincount counts the number of times each value in 0..3999 appears in that list of indices; and reshape converts the C-flattened counts back to a 2-D array.
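On a small array the three steps can be traced by hand (a sketch using hypothetical 3x2 shapes and indices rather than the 1000x4 arrays above):

```python
import numpy as np

i = np.array([0, 0, 2, 0])  # the pair (0, 1) occurs twice
j = np.array([1, 1, 0, 0])

# Step 1: flatten each (row, col) pair into a single index, row * ncols + col.
flat = np.ravel_multi_index((i, j), (3, 2))  # -> [1, 1, 4, 0]

# Step 2: count occurrences of every flat index 0..5.
# Step 3: reshape the counts back to the array's shape.
counts = np.bincount(flat, minlength=6).reshape(3, 2)
# counts == [[1, 2],
#            [0, 0],
#            [1, 0]]
```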

In terms of performance, I measure it at 200 times faster than the for loop ("b") and 5 times faster than fancy indexing ("a"); your mileage may vary.

Since you need to write the counts to an existing array a, try this:

u, inv = np.unique(np.ravel_multi_index((i, j), (1000, 4)), return_inverse=True)
a.flat[u] += np.bincount(inv)

I make this second method a little slower (2x) than "a", which isn't too surprising as the unique stage is going to be slow.

ecatmur
  • I considered using bincount(). However, my array is actually much larger than in the example I provided (>300*10**6), and the number of (i,j) update pairs is relatively smaller (around 20*10**6) with many repeats. I am afraid that the bincount approach will use too much memory. – user1451766 Jun 12 '12 at 17:18
  • @user1451766 - `bincount` won't use any more memory than you need to construct `a = numpy.zeros(...)` in your original post. – ecatmur Jun 12 '12 at 17:22
  • In my case, 'a' is a memmap() file. Neither of the fancy-indexing or for-loop approaches requires an intermediate memory buffer that is as large. – user1451766 Jun 12 '12 at 17:29
  • However, your `ravel_multi_index` in combination with `unique( ,return_inverse=True)` might do the trick! – user1451766 Jun 12 '12 at 17:45
  • I added another method, is that what you were after? – ecatmur Jun 13 '12 at 09:32
  • BTW, why use `ravel_multi_index()` instead of `i*4+j`? – user1451766 Jun 13 '12 at 13:28
  • It's more explicit and is easier to adapt if `a` changes shape, esp. with higher dimensions (should actually use `np.ravel_multi_index((i, j), a.shape)`). – ecatmur Jun 13 '12 at 13:36
  • you might mean: "second method a little slower (2x) than `a`", not `b` – jfs Jun 13 '12 at 14:12