I am trying to convert a nested loop over a numpy array into a numpy-optimized implementation.
The function being called inside the loop takes a 4D vector and a separate parameter, and outputs a 4D vector which is supposed to replace the old 4D vector based on operations with the new value. If relevant, the function a Welford online update which updates mean and standard deviation based on a new value, with the 4D vector being [old_mean, old_std, old_s, num_values]
. For each pixel channel, I am saving these values in the history_array
for updating the distribution based on future pixel values.
My present code looks like this:
def welford_next(arr:np.ndarray, new_point:np.float32) -> np.ndarray:
old_mean, _, old_s, num_points = arr
num_points += 1
new_mean = old_mean + (new_point - old_mean) / num_points
new_s = old_s + (new_point - old_mean) * (new_point - new_mean)
return [new_mean, np.sqrt(new_s / num_points) if num_points > 1 else new_s, new_s, num_points]
updates = [10., 20., 30., 40., 90., 80.]
history_array = np.zeros(shape = b.shape + (4,)) # shape: [6,3,3,4]
print(f'History Shape: {history_array.shape}')
history_array_2 = np.zeros_like(history_array)
for update in updates:
image = np.empty(shape = b.shape) # shape: [6,3,3] (h x w x c)
image.fill(update)
for i, row in enumerate(image): # Prohibitively expensive
for j, col in enumerate(row):
for k, channel in enumerate(col):
history_array[i][j][k] = welford_next(history_array[i][j][k], channel)
history_array_2 = np.apply_along_axis(welford_next, axis=2, arr=history_array_2)
print(history_array == history_array_2)
However, the np.apply_along_axis()
is not seem to be viable because it does not allow additional parameters to be passed alongside the array itself.I also came across np.ufunc
which the welford_next()
function can be converted to using np.frompyfunc()
but it is unclear how it could help me reach the desired target.
How do I achieve this looped operation using numpy?