I have two fairly large NumPy arrays: arr1 of shape (40, 40, 3580) and arr2 of shape (3580, 50). What I want to compute is

arr_final = np.sum(arr1[..., None]*arr2, axis = 2)

so that arr_final has shape just (40, 40, 50). However, evaluating the expression above seems to build a large intermediate array internally, so I keep getting a MemoryError. Is there any way to avoid that intermediate and compute only the final result? I have looked at numexpr, but I am not sure how to express arr1[..., None]*arr2 followed by a sum over axis=2 in numexpr. Any help or suggestion would be appreciated.

konstant
  • You are broadcasting a (40,40,3580,1) array against a (1,1,3580,50) array, producing a (40,40,3580,50) array. Then you sum, reducing that to (40,40,50). But you still need room for the big intermediate value. You could try iterating on the size-50 dimension. 50 loops on a large task should be OK time-wise. – hpaulj May 14 '18 at 20:22
  • This temporary array is big, but it should be like 2GB big, not `MemoryError` big. Unless your dtype is `object` (in which case: don't do that), or you're using 32-bit Python (ditto), or you're on an embedded platform (OK, that's a problem). It's still worth trying to optimize, but it's _also_ worth trying to figure out how you're running out of memory. – abarnert May 14 '18 at 20:24
  • That kind of looks like it's just `dot`. – user2357112 May 14 '18 at 20:25
  • `arr1[:, None]*arr2` doesn't actually work. Did you mean `arr1[..., None]*arr2`? – user2357112 May 14 '18 at 20:27
  • @user2357112 yeah it is `arr1[..., None]*arr2`. – konstant May 14 '18 at 20:45
  • @abarnert Yes I'm using 32-bit Python – konstant May 14 '18 at 20:59
  • OK. Any reason you _need_ to use 32-bit Python? What platform are you on? – abarnert May 14 '18 at 21:00
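
The loop over the size-50 dimension suggested in the comments could be sketched as follows. This is only a sketch: the helper name `chunked_sum_product` and the small demo shapes are made up for illustration, and the byte count assumes a float64 dtype, which the question does not state. Each iteration needs a temporary no larger than arr1 itself, instead of the full four-dimensional product.

```python
import numpy as np

def chunked_sum_product(arr1, arr2):
    """Compute np.sum(arr1[..., None] * arr2, axis=2) one output column at a time.

    Hypothetical helper, assuming arr1 is 3-D and arr1.shape[-1] == arr2.shape[0].
    """
    out_shape = arr1.shape[:-1] + (arr2.shape[1],)
    out = np.empty(out_shape, dtype=np.result_type(arr1, arr2))
    for j in range(arr2.shape[1]):
        # arr2[:, j] has shape (k,), so it broadcasts along arr1's last axis;
        # the temporary product has the same shape as arr1.
        out[..., j] = np.sum(arr1 * arr2[:, j], axis=-1)
    return out

# Bytes needed for the full (40, 40, 3580, 50) intermediate, assuming float64.
full_bytes = 40 * 40 * 3580 * 50 * 8

# Small stand-in shapes so the check runs quickly.
rng = np.random.default_rng(0)
a = rng.standard_normal((4, 5, 30))
b = rng.standard_normal((30, 6))
result = chunked_sum_product(a, b)
```

Under the float64 assumption the full intermediate comes to 2,291,200,000 bytes, which is consistent with the "like 2GB" estimate above and is more than a 32-bit process can usually allocate as one contiguous block.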

1 Answer

Assuming you meant np.sum(arr1[..., None]*arr2, axis = 2), with a ... instead of a :, then that's just dot:

arr3 = arr1.dot(arr2)

This should be more efficient than explicitly materializing arr1[..., None]*arr2, but I don't know exactly what intermediates it allocates.

You can also express the computation with einsum. Again, this should be more efficient than explicitly materializing arr1[..., None]*arr2, but I don't know exactly what it allocates.

arr3 = np.einsum('ijk,kl', arr1, arr2)
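
As a quick sanity check, the equivalence can be verified on small arrays where the broadcast product still fits comfortably in memory. The shapes below are arbitrary stand-ins, not the ones from the question, and `np.tensordot` is included only as a third way to write the same contraction.

```python
import numpy as np

# Small stand-in shapes (the real ones are (40, 40, 3580) and (3580, 50)).
rng = np.random.default_rng(0)
arr1 = rng.standard_normal((4, 5, 30))
arr2 = rng.standard_normal((30, 6))

# Reference: the original expression, which materializes a (4, 5, 30, 6) array.
reference = np.sum(arr1[..., None] * arr2, axis=2)

# Contractions over the shared axis, without writing out the 4-D product.
via_dot = arr1.dot(arr2)
via_einsum = np.einsum('ijk,kl', arr1, arr2)
via_tensordot = np.tensordot(arr1, arr2, axes=([2], [0]))

print(np.allclose(reference, via_dot))
print(np.allclose(reference, via_einsum))
print(np.allclose(reference, via_tensordot))
```

All three results have shape (4, 5, 6) here, matching the (40, 40, 50) shape expected for the full-size arrays.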
user2357112