I will start with a simpler example for better understanding:
import numpy as np
b = np.ma.masked_where(np.arange(20) > -1, np.arange(20))
#b: [-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --]
#b.data: [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19]
c = np.zeros(b.shape)
#c: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
d = np.zeros(b.shape)
#d: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
c += b
#c: [ 0. 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19.]
d = d + b
#d: [-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --]
#d.data: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
The first operation, c += b, is an in-place operation. In other words, it is equivalent to c = type(c).__iadd__(c, b), which performs the addition according to the type of c. Since c is a plain ndarray and not a masked array, the mask of b is ignored and the data of b is used as if it were unmasked.
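A minimal sketch of the in-place case, assuming a NumPy version that behaves as in the output above. The shorter length of 5 and the names c_before, same_object and still_plain are illustrative only. It checks that c is still the same plain ndarray object after c += b:

```python
import numpy as np

# All five elements of b are masked, as in the example above (shortened for brevity).
b = np.ma.masked_where(np.arange(5) > -1, np.arange(5))

c = np.zeros(b.shape)
c_before = c          # second name bound to the same object

c += b                # in-place: the addition is written into the existing array c

same_object = c is c_before          # no new array was created
still_plain = type(c) is np.ndarray  # c did not become a masked array
print(same_object, still_plain)
print(c)              # the data of b was added even though b is fully masked
```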
On the other hand, d = d + b is NOT an in-place assignment. The right-hand side is evaluated first, and the name d is then rebound to the result. Because masked arrays are a subclass of ndarray, Python tries the reflected method of the right operand first, so the addition is effectively d = np.ma.MaskedArray.__radd__(b, d). This creates a new object using the wider type on the right-hand side: since b is a masked array, the result is a masked array as well, and the addition uses valid values only (in this case there are none, since ALL elements of b are masked and therefore invalid). The result is a masked array d with the same mask as b, while the data of d remains unchanged.
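The same point as a small sketch, again with a shortened array and illustrative names (d_before, is_new_object, is_masked) that are not part of the original example. It checks that d = d + b produces a new masked array and leaves the original ndarray alone:

```python
import numpy as np

b = np.ma.masked_where(np.arange(5) > -1, np.arange(5))

d = np.zeros(b.shape)
d_before = d          # keep a reference to the original ndarray

d = d + b             # builds a new object, then rebinds the name d

is_new_object = d is not d_before
is_masked = isinstance(d, np.ma.MaskedArray)
print(is_new_object, is_masked)
print(d.mask.all())   # every element is masked, like b
print(d_before)       # the original ndarray is untouched
```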
This difference in behavior is not NumPy-specific; it applies to Python itself. The case in the OP's question behaves the same way and, as @alaniwi mentioned in the comments, the boolean indexing with the mask a is not fundamental to the behavior. Using a to index b, c, and d only limits the assignment to the elements selected by a (rather than all elements of the arrays) and nothing more.
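To illustrate that the in-place versus rebinding distinction exists in plain Python as well, here is a small sketch using built-in lists. The names x and alias are made up for this illustration:

```python
x = [1, 2]
alias = x            # second name for the same list object

x += [3]             # list.__iadd__ mutates the existing list in place
print(alias)         # the alias sees the change

x = x + [4]          # list.__add__ builds a new list, then x is rebound to it
print(alias)         # the alias still refers to the old list
print(x)
```

After the first statement both names still refer to one list; after the second, x refers to a new list and alias keeps the old one.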
To make things a bit more interesting, and in fact clearer, let's switch the places of b and d on the right-hand side (using a fresh array e in place of d):
e = np.zeros(b.shape)
#e: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
e = b + e
#e: [-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --]
#e.data: [ 0. 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19.]
Note that, similar to d = d + b, the right-hand side uses the masked array addition, so the output is a masked array. However, since you are adding e to b (i.e. e = np.ma.MaskedArray.__add__(b, e)), the masked data of b is returned, whereas in d = d + b you are adding b to d, and the data of d is returned.
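A short side-by-side sketch of the two operand orders, using a shortened array and hypothetical names (z, left_plain, left_masked). Note that the values stored underneath a mask are an implementation detail of numpy.ma; the check below reflects the outputs shown above and should not be relied on in real code:

```python
import numpy as np

b = np.ma.masked_where(np.arange(5) > -1, np.arange(5))
z = np.zeros(b.shape)

left_plain = z + b    # plain array on the left, as in d = d + b
left_masked = b + z   # masked array on the left, as in e = b + e

# Both results are fully masked arrays ...
print(left_plain.mask.all(), left_masked.mask.all())

# ... but the values kept underneath the mask come from the left operand.
print(left_plain.data)
print(left_masked.data)
```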