I'm trying to optimize a method in numpy, evaluating a gaussian function at many positions in an image. Right now I'm calculating boolean masks around the center positions of every gaussian and using them for indexing the data array, in order to save computation time. For a small amount of data points, boolean indexing greatly accellerated the script in comparison to calculating the indices via np.argwhere
.
Unfortunately, it seems the boolean indexing seems to cause a memory leak. The following example causes my RAM to max out. How do I prevent this? Is this a numpy bug, or am I doing something wrong? I think this might be because the boolean indexing creates copies of the data, as someone wrote in this answer. Shouldn't the memory be freed after the copy is not needed anymore? Is there a way to manually free the memory?
import numpy as np
import matplotlib.pyplot as plt
class Image():
def __init__(self, nx, ny, px):
x = np.linspace(0, nx * px, nx)
y = np.linspace(0, ny * px, ny)
self.x, self.y = np.meshgrid(x, y)
self.data = np.zeros((nx, ny), dtype=np.uint8)
self.nx = nx # number of pixels
self.ny = ny
self.px = px # pixel size [mm]
def gaussian(self, x, y, x0, y0, amplitude=None, sigma=0.01):
""" 2d gaussian function
Parameters:
x, y: coordinates of evaluation position
x0, y0: center coordinates of Gaussian
amplitude: amplitude of the Gaussian
sigma: width of Gaussian
"""
if amplitude is None:
amplitude = np.iinfo(self.data.dtype).max
result = amplitude * np.exp(
-((x - x0)**2 + (y - y0)**2) / (2 * sigma ** 2))
return result.astype(self.data.dtype)
def mask_from_pos(self, x0, y0, radius_px):
""" returns a square mask around the position (x0, y0)
"""
masks = np.zeros((x0.size, self.ny, self.nx), dtype=bool)
i_center_x = (x0 // self.px).astype(int)
i_center_y = (y0 // self.px).astype(int)
for ix, iy, mask in zip(i_center_x, i_center_y, masks):
y_start = max(0, min(iy - radius_px, self.ny - 1))
y_end = max(0, min(iy + radius_px + 1, self.ny))
x_start = max(0, min(ix - radius_px, self.nx - 1))
x_end = max(0, min((ix + radius_px + 1), self.nx))
mask[y_start: y_end, x_start: x_end] = True
return masks
def add_gaussians(self, posx, posy, radius_px=9):
masks = self.mask_from_pos(posx, posy, radius_px)
for x0, y0, mask in zip(posx, posy, masks):
self.data[mask] += self.gaussian(
self.x[mask], self.y[mask], x0, y0).astype(self.data.dtype)
if __name__ == '__main__':
image = Image(2000, 2000, 0.005)
xy_max = 2000 * 0.005
x0 = np.random.rand(3000) * xy_max
y0 = np.random.rand(3000) * xy_max
image.add_gaussians(x0, y0)
plt.figure(dpi=300, figsize=(8, 8))
plt.imshow(image.data,
cmap=plt.get_cmap('gray'),
extent=[0, xy_max, 0, xy_max])
I already tried debugging this and none of the python objects seems to grow, so I assume this is a problem in numpy, but I'm not 100% certain. Any help with this is greatly appreciated! Thanks in advance.
Edit 1: Workaround
I found a solution that gives the same result but without flooding the memory. Simply replacing the calculation of a mask-array by a mask-generator did the trick:
def mask_from_pos_gen(self, x0, y0, radius_px):
""" yields a square mask around the position (x0, y0)
"""
i_center_x = (x0 // self.px).astype(int)
i_center_y = (y0 // self.px).astype(int)
for ix, iy in zip(i_center_x, i_center_y):
mask = np.zeros(self.data.shape, dtype=bool)
y_start = max(0, min(iy - radius_px, self.ny - 1))
y_end = max(0, min(iy + radius_px + 1, self.ny))
x_start = max(0, min(ix - radius_px, self.nx - 1))
x_end = max(0, min((ix + radius_px + 1), self.nx))
mask[y_start: y_end, x_start: x_end] = True
yield mask
This method can then be used exactly the same way as mask_from_pos before, but the memory is freed this way. Leaving this here in case someone else has the same problem and hoping some numpy Guru will explain this behavior.