1

The unvectorized code reads:

import numpy as np
import numpy.ma as ma

np.random.seed(42)
H = np.random.uniform(0.1, 1.0, size=(6,8))
r, c = H.shape

mask = H.max(axis=1) > 0.95


x = np.linspace(0, 10, c)
weighted_averages = ma.masked_all((r,), dtype=H.dtype)

for i in range(r):
    if mask[i]:
        weighted_averages[i] = np.average(x, weights=H[i, :])

Here's my attempt at vectorizing it:


_, xx = np.mgrid[0:10:r*1j, 0:10:c*1j]
not_mask = np.logical_not(mask)


weighted_averages = np.average(xx, weights=H, axis=1)
mwa = ma.masked_array(weighted_averages, mask=not_mask)

It works, in the sense that the outputs are the same, but I'm "cheating" because I first compute all the averages and then mask the "unwanted" values. How could I avoid the unnecesary computations? I'm guessing I have to somehow mask xx, H, or both.

redhotsnow
  • 85
  • 8

1 Answers1

1

How about this -

import numpy as np
import numpy.ma as ma

np.random.seed(42)
H = np.random.uniform(0.1, 1.0, size=(6,8))
r, c = H.shape

mask = H.max(axis=1) > 0.95

x = np.linspace(0, 10, c)

H_mask = H[mask]
wa = (np.sum(x * H_mask, axis=1))/np.sum(H_mask, axis=1)
weighted_averages = ma.masked_all((r,), dtype=H.dtype)

weighted_averages[mask] = wa

Simply mask the array first and then take the averages. I don't think that you can use np.average here for this becasue it doesn't seem to support broadcasting. Hence, simply do the mean manually.

Ananda
  • 2,925
  • 5
  • 22
  • 45
  • Instead of `H_mask = H[mask]`, how could I mask `H` directly? I mean create a masked array, where the rows that satisfy `H.max(axis=1) <= 0.95` are masked? I tried `np.ma.masked_less_equal(H, 0.95)` but that is not equivalent, as I want to mask the rows where the *maximum along that row* is `<= 0.95`. – redhotsnow Jan 26 '21 at 00:37
  • I am not sure why you would need that. This solves your original issue right? It creates the array that you needed without spending any extra wasted time on computing the stuff that you do not need. – Ananda Jan 26 '21 at 04:54
  • 1
    But if you want that for some reason, you can do `H_mask = ma.masked_all((r,c), dtype=H.dtype)` and `H_mask[mask] = H[mask]` – Ananda Jan 26 '21 at 04:57