
I have been testing the CuPy library and performed a simple matrix multiplication using einsum:

C = cp.einsum('pqrs,rs->pq', A, B)

The dimensions of A and B are (41, 41, 41, 41) and (41, 41), respectively. I also checked their sizes, which are 22606088 bytes and 13448 bytes.

While running the code, I am getting the following error message:
OutOfMemoryError: out of memory to allocate 38000834048 bytes (total 38023468032 bytes)

It indicates that I am running out of memory. Is there any option to send the data to the device in parts and perform the operation in batches?

talonmies
EveSz

1 Answer


I don't think there is an option to send a single array to the device in parts.

I faced the same issue before. It may be caused by the CuPy einsum implementation not being optimized yet: https://github.com/cupy/cupy/issues/19#issuecomment-322972682

If you can replace your einsum call with transpose, reshape, matmul, etc., please try that.

I guess

C = cp.einsum('pqrs,rs->pq', A, B)

is equivalent to

p, q, r, s = A.shape
A = cp.reshape(A, (p, q, r*s))
B = cp.reshape(B, (1, 1, r*s))
C = cp.sum(A * B, axis=2)
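
Here is a minimal sketch of the matmul route and of a batched variant, in case it helps. Note the assumptions: the function names `contract_matmul` and `contract_in_batches`, the `batch` size, and the `xp` parameter are my own illustration, not CuPy API. The example uses NumPy as a stand-in so it runs without a GPU; passing `xp=cupy` (with `A_host` kept as a NumPy array in host memory) should copy only one slice of A to the device per iteration, but I have not measured that on real hardware.

```python
import numpy as np  # stand-in; on a GPU machine pass xp=cupy to the batched version


def contract_matmul(A, B):
    """Compute einsum('pqrs,rs->pq', A, B) as a matrix-vector product."""
    p, q, r, s = A.shape
    # Flatten (p, q) into rows and (r, s) into the contracted axis,
    # so no (p, q, r*s) temporary like A * B is materialized.
    return (A.reshape(p * q, r * s) @ B.reshape(r * s)).reshape(p, q)


def contract_in_batches(A_host, B, batch=8, xp=np):
    """Same contraction, processing `batch` slices of A along axis 0 at a time."""
    p, q, r, s = A_host.shape
    B_flat = xp.asarray(B).reshape(r * s)
    out = xp.empty((p, q), dtype=np.result_type(A_host.dtype, B_flat.dtype))
    for start in range(0, p, batch):
        stop = min(start + batch, p)
        # With xp=cupy this is the host-to-device copy of one slice only.
        chunk = xp.asarray(A_host[start:stop])
        out[start:stop] = (chunk.reshape(-1, r * s) @ B_flat).reshape(stop - start, q)
    return out
```

Both functions should agree with `einsum('pqrs,rs->pq', A, B)` up to floating-point rounding. The batched version only pays off when A itself does not fit on the device; for the shapes in the question, A is about 22 MB, so the plain reshape-and-matmul form is probably enough.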
corochann
  • You are right, this way works. It looks like the CuPy einsum is not optimized. I also noticed that this library does not work if the available memory is exceeded, which is a huge drawback. Well, we know that GPU cards do not provide much memory. Have you tried to work around it? – EveSz Nov 17 '18 at 16:03
  • I think the `einsum` implementation has been updated. I don't know which CuPy version you are using, but the latest version may work more efficiently. – corochann Nov 18 '18 at 13:21