I have an input array, which is a masked array.
When I check the mean, I get a nonsensical number: less than the reported minimum value!
So, for the raw array: `numpy.mean(A) < numpy.min(A)`. Note that `A.dtype` returns `float32`.
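For reference, here is roughly the comparison I'm doing, with synthetic data standing in for my real array (whether `numpy.mean` itself reproduces the bad value may depend on your numpy version, since newer releases accumulate sums pairwise, but this is the shape of the check):

```python
import numpy as np

# Synthetic stand-in for my data: ~61 million float32 values around 300,
# with an arbitrary sprinkling of masked entries.
data = np.full(1872 * 128 * 256, 300.0, dtype=np.float32)
mask = np.arange(data.size) % 1000 == 0
A = np.ma.masked_array(data, mask=mask)

print(A.dtype)               # float32
print(np.min(A), np.max(A))  # both 300.0 here
print(np.mean(A))            # on my real data this comes out *below* the minimum
```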
FIX: `A3 = A.astype(float)`. A3 is still a masked array, but now the mean lies between the minimum and the maximum, so I have some faith it's correct! For some reason, `A3.dtype` is now `float64`. Why? Why did that change it, and why is the mean correct at 64 bits and wildly incorrect at 32 bits?
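In code, the fix looks like this (continuing from the sketch above; on my 64-bit system, plain Python `float` maps to `numpy.float64`). Passing `dtype=numpy.float64` to `mean` should have the same effect without copying the whole array:

```python
# Recasting to 64-bit before averaging gives a sensible result.
A3 = A.astype(float)             # Python float -> numpy.float64 on my system
print(A3.dtype)                  # float64
print(np.mean(A3))               # now falls between the min and the max

# Alternative: keep the float32 array but accumulate the mean in float64.
print(np.mean(A, dtype=np.float64))
```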
Can anyone shed any light on why I needed to recast the array to calculate the mean accurately? (It happens with or without numpy, it turns out.)
EDIT: I'm using a 64-bit system, so yes, that's why recasting changed it to 64-bit. It turns out I didn't have this problem if I subsetted the data (extracted from the netCDF input using a netCDF4 `Dataset`); smaller arrays did not produce the problem. So it's caused by the float32 running sum losing precision once it gets large (loosely, "overflowing" the precision of a 32-bit float), and switching to 64-bit prevents the problem.
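For completeness, the loading and subsetting I'm describing look roughly like this (the file and variable names are placeholders for my actual data):

```python
from netCDF4 import Dataset
import numpy as np

nc = Dataset("input.nc")          # placeholder filename
var = nc.variables["my_var"]      # placeholder variable name

A_full = var[:]                   # full 1872 x 128 x 256 masked array -> suspect mean
A_sub = var[:100]                 # smaller subset -> mean looked fine
print(A_full.dtype, np.mean(A_full), np.mean(A_sub))
```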
So I'm still not clear on why it initially loaded as float32, but I guess it aims to conserve space even on a 64-bit system. The array itself is 1872x128x256, with non-masked values around 300; summing roughly 61 million values of that size in float32 turns out to be enough to hit the problem :)
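To convince myself that the 32-bit accumulation really is the culprit, here's a sketch of the mechanism using a strictly sequential sum (`numpy.cumsum` accumulates in order, so it shows the saturation even if `mean` happens to sum more cleverly):

```python
import numpy as np

n = 1872 * 128 * 256                      # 61,341,696 elements
x32 = np.full(n, 300.0, dtype=np.float32)

# Sequential float32 accumulation: once the running total is large enough,
# adding another 300 no longer changes it (not enough mantissa bits left).
running = np.cumsum(x32)                  # stays float32
print(running[-1])                        # stalls far short of the true ~1.84e10
print(running[-1] / n)                    # a "mean" well below 300

# The same sum accumulated in float64 is fine.
print(x32.sum(dtype=np.float64) / n)      # ~300.0
```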