
I ran into this problem when computing per-band means from a raster:

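import numpy as np
import rasterio

with rasterio.open(file) as ds:
    arr3d = ds.read()
    # Mask the nodata value (-32768) so it is excluded from the means
    arr3d = np.ma.masked_where(arr3d == -32768, arr3d, copy=False)
    list = []
    for i in range(0, 24):
        tmean = arr3d[i, :, :].mean()
        list.append(tmean)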

I just wanted a list of 24 mean values, but this code returned a list that included each layer of arr3d, its mask layer, and the mean values.

 len(list)=72       

But when I tried arr3d[i,:,:].mean() on its own, it just returned a mean value without any array. What is the difference between arr.mean() and np.mean(arr)?

Cobin
  • I'm not sure how to read the question; what do you mean by "`arr3d[i,:,:].mean()` just returned a mean value without any array"? The mean value is exactly what this function should return. Regarding the difference between `arr.mean()` and `np.mean(arr)`, there should not be any. [The first thing the latter does is try to call the former if `arr` is a masked array](https://github.com/numpy/numpy/blob/v1.12.0/numpy/core/fromnumeric.py#L2882). – MB-F Mar 27 '17 at 14:50
  • Can you create a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve) that illustrates the discrepancy or unexpected behavior? – MB-F Mar 27 '17 at 14:51
  • I just tried again and it's OK. Both `arr.mean()` and `np.mean(arr)` return the same list with 24 elements in this loop. It is weird that `arr.mean()` returned 72 elements before. – Cobin Mar 28 '17 at 01:51

1 Answer


np.mean() returns either (1) a single value, if no axis is passed (the mean is then taken over the flattened array) or if the array is 1-dimensional, or (2) an array of means in which the axes you passed have been collapsed. Because this is easy to get wrong, I recommend always passing the axis parameter to np.mean() explicitly. The same is true of the .mean() method: for a masked array, np.mean(arr) simply calls arr.mean(), so they are in reality the same function.
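To make the axis behaviour concrete, here is a minimal sketch on a small made-up array (2 bands of 3 x 4 values, not the asker's raster). It assumes the bands sit on the first axis, so a per-band mean means collapsing axes 1 and 2.

```python
import numpy as np

# Hypothetical stand-in for a raster stack: 2 bands, 3 rows, 4 columns.
arr = np.arange(24, dtype=float).reshape(2, 3, 4)

flat_mean = np.mean(arr)              # no axis: one scalar over all 24 values
along_bands = np.mean(arr, axis=0)    # collapses the band axis -> shape (3, 4)
per_band = np.mean(arr, axis=(1, 2))  # collapses rows and columns -> shape (2,)

print(flat_mean)          # 11.5
print(along_bands.shape)  # (3, 4)
print(per_band)           # one mean per band: 5.5 and 17.5

# The method and the function give the same result.
assert np.array_equal(arr.mean(axis=(1, 2)), per_band)
```

Note that axis=0 averages across bands and keeps one value per pixel, which is usually not what you want when you are after one value per band.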

I suggest explicitly passing the axis along which you want to compute the mean:

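import numpy as np
import rasterio

with rasterio.open(file) as ds:
    arr3d = ds.read()
    arr3d = np.ma.masked_where(arr3d == -32768, arr3d, copy=False)
    # ds.read() returns (bands, rows, cols); average over rows and columns
    means = np.mean(arr3d, axis=(1, 2))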

Then means will always have one element per band, that is, the same number of elements as the first axis of arr3d (24 here). Your loop over 24 bands computes the same thing manually, so you can drop it.
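If you want to convince yourself that the loop and a single call agree, the following sketch compares them on synthetic data. The array shape, the random values, and the position of the nodata pixel are all made up for illustration; only the nodata value -32768 and the 24 bands come from the question.

```python
import numpy as np

NODATA = -32768  # nodata value used in the question

# Synthetic stand-in for ds.read(): 24 bands of 4 x 5 pixels (made-up data).
rng = np.random.default_rng(0)
data = rng.integers(0, 100, size=(24, 4, 5)).astype(np.int16)
data[:, 0, 0] = NODATA  # pretend one pixel is nodata in every band

arr3d = np.ma.masked_where(data == NODATA, data, copy=False)

# Loop version from the question: one mean per band.
loop_means = [arr3d[i, :, :].mean() for i in range(arr3d.shape[0])]

# Single call: collapse the row and column axes at once.
band_means = arr3d.mean(axis=(1, 2))

print(len(loop_means), band_means.shape)  # 24 (24,)

# Both approaches skip the masked pixel and give the same 24 values.
assert np.allclose(loop_means, np.ma.filled(band_means, np.nan))
```

Both versions exclude the masked nodata pixel, so the means stay within the range of the valid values.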

mprat