I am using SciPy's Simpson's rule (simps) in a vectorized way for performance. (I am calculating integrals over a rolling window on a time series.)
For some reason I am getting a MemoryError when using simps on a ~500 MB array of floats on a 64-bit machine which has 4 GB of memory available.
Here's some example code:
import numpy as np
from scipy.integrate import simps
import psutil
print(psutil.virtual_memory().percent) #gives 40%
# In the following, the number of rows can be increased up to about 190'000 without issues.
# The memory usage during execution of the whole script grows linearly with the size of the input array;
# I don't observe any exponential behaviour in memory usage.
a = np.random.rand(195000, 150, 2)
print(psutil.virtual_memory().percent) #gives 51%
print(a.nbytes/1024/1024, "MB") #gives 446MB
print(a[:,:,0].nbytes/1024/1024, "MB") #gives 223 MB
I = simps(a[:,:,0], a[:,:,1]) #MemoryError
print(psutil.virtual_memory().percent)
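For what it's worth, both slices I pass to simps are non-contiguous views into a, so every temporary of the same size (a contiguous copy, the spacings between the x values, intermediate products, ...) costs roughly another 220 MB. The snippet below only illustrates the sizes involved; the assumption about what SciPy allocates internally is mine:
print(a[:,:,0].flags['C_CONTIGUOUS'])    # False: the slices are strided views into a
y_copy = np.ascontiguousarray(a[:,:,0])  # an explicit contiguous copy alone costs another ~223 MB
print(y_copy.nbytes/1024/1024, "MB")
dx = np.diff(a[:,:,1], axis=-1)          # the x-spacings Simpson's rule works with: another ~222 MB
print(dx.nbytes/1024/1024, "MB")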
I can see the virtual memory of my machine going up until about 2 GB is used, and then I get a MemoryError even though not all of my free memory is used.
So I am confused: why am I getting a MemoryError even though there is still about 2 GB of unused memory available? Can this be worked around?
I am using Windows Server 2012 R2.
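One workaround I can think of is to integrate the rows in chunks and concatenate the per-chunk results (simps integrates along the last axis by default, so the rows are independent of each other). The sketch below uses an arbitrary chunk size of 10'000 rows, but I would still like to understand the underlying cause:
def simps_chunked(y, x, chunk=10000):
    # integrate block-wise to keep the temporaries small; each block yields a 1-D result
    return np.concatenate([simps(y[i:i+chunk], x[i:i+chunk])
                           for i in range(0, y.shape[0], chunk)])

I = simps_chunked(a[:,:,0], a[:,:,1])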
Edit: To illustrate the more or less linear scaling of memory usage vs. input size, I performed the following little experiment:
def test(rows):
    a = np.random.rand(rows, 150, 2)
    I = simps(a[:,:,0], a[:,:,1])
    print(I.shape)

for numRows in range(10000, 190000, 10000):
    test(numRows)
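To put a number on the per-call peak instead of just watching the graphs, memory_profiler (the package behind mprof) can sample a single call; this is only a sketch and the row count is arbitrary:
from memory_profiler import memory_usage
# memory_usage samples the memory of the process (in MiB) while test(...) runs
print(max(memory_usage((test, (100000,)))))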
The experiment shows rising memory consumption in the Windows Resource Monitor as well as in other tools like mprof: