
I'm using ncap2 to generate the mean and standard deviation of a given variable (ex: nsmz) from a given file (ex: NB_all_var_surface_04750.nc).

I would like to store the output values printed by ncdump as objects; right now, I have to manually copy each value from the terminal into a numpy array in a separate Jupyter Notebook. There has to be a faster way!

Please see the shell commands below; any thoughts?

Edit: For clarity, my intention here was to append the value of nsmz_mean in a list, or otherwise name the value of nsmz_mean "object1" for later reference (because I have many files with many unique values). These are pythonic wishes, and I was able to find an answer by moving outside of bash.

ncap2 -h -O -s  'nsmz_mean=nsmz.avg();' NB_all_var_surface_0*.nc NB_nsmz_surface_mean.nc
ncdump -v nsmz_mean NB_nsmz_surface_mean.nc | sed -e '1,/data:/d' -e '$d'

$ nsmz_mean = 1.473794e-07 ;

ncap2 -s 'nsmz_sdn=(nsmz-nsmz.avg(ocean_time)).rmssdn(ocean_time)' NB_all_var_surface_0*.nc NB_nsmz_surface_sdn.nc
ncdump -v nsmz_sdn NB_nsmz_surface_sdn.nc | sed -e '1,/data:/d' -e '$d'

$ nsmz_sdn = 1.247375e-07 ;
  • Hi Lindsay, can I suggest you edit the question a little to make it clear what exactly you are asking? From the python answer you posted it seems you just want to get the output into a variable, which you could also do in bash using variable=$(command), but from what you write it seems that you want to have the output displayed in a jupyter notebook? Perhaps clarify? – ClimateUnboxed Oct 15 '19 at 07:32
  • Hi Adrian, I edited it for some clarity, and added a solution below. – Lindsay Veazey Oct 15 '19 at 21:16
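For the original goal of getting the printed value into a Python object, one possible approach is to parse the `name = value ;` line that ncdump prints. The sketch below is only an illustration under assumptions: the helper name `parse_ncdump_scalar` is hypothetical, and it assumes the output has already been trimmed by the sed command in the question so that each remaining line holds a single scalar.

```python
import re

def parse_ncdump_scalar(text):
    """Return {name: value} for lines shaped like 'nsmz_mean = 1.473794e-07 ;'."""
    # Hypothetical helper: matches one scalar assignment per line
    pattern = re.compile(r"^\s*(\w+)\s*=\s*([-+0-9.eE]+)\s*;\s*$")
    values = {}
    for line in text.splitlines():
        match = pattern.match(line)
        if match:
            values[match.group(1)] = float(match.group(2))
    return values

# Sample text copied from the terminal output shown in the question
sample_output = " nsmz_mean = 1.473794e-07 ;\n"
results = parse_ncdump_scalar(sample_output)
```

In a notebook, the text could come from `subprocess.run(..., capture_output=True, text=True).stdout` instead of a pasted string, and the resulting values can be appended to a list, one entry per file.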

1 Answer

Please note that the answer Lindsay posted to her own question will unfortunately not give the correct result, except in the special case of data on a Cartesian grid. In most cases the grid is regular lat-lon, reduced Gaussian, or one of many regional map projections, and then a simple arithmetic mean over the numpy array is incorrect because it does not weight each value by its grid-cell area (e.g. cells get "narrower" towards the poles on a regular lat-lon grid). The larger the area you average over, the worse this error gets. See e.g. my answer posted here: Calculate variables mean in a selective area, in gridded netCDF file
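To make the weighting issue concrete, here is a small sketch comparing a plain mean with an area-weighted mean. Everything in it is an assumption for illustration: the grid is a synthetic regular 10-degree lat-lon grid, the field is made up, and cell area is approximated as proportional to cos(latitude).

```python
import numpy as np

# Hypothetical regular lat-lon grid: 10-degree spacing, cell centres
lat = np.arange(-85.0, 90.0, 10.0)   # 18 latitudes
lon = np.arange(5.0, 360.0, 10.0)    # 36 longitudes

# Synthetic field that depends only on latitude (larger towards the poles)
field = np.abs(lat)[:, np.newaxis] * np.ones((lat.size, lon.size))

# Unweighted mean treats every cell as if it had the same area
simple_mean = field.mean()

# On a regular lat-lon grid, cell area is roughly proportional to cos(latitude)
weights = np.cos(np.deg2rad(lat))[:, np.newaxis] * np.ones_like(field)
weighted_mean = np.average(field, weights=weights)
```

Because the small polar cells carry the largest values in this made-up field, the unweighted mean comes out higher than the area-weighted one. On a small regional grid the weights are nearly constant, so the two means almost coincide.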

I presume ncap2 calculates the spatial mean accounting for the grid, and I know that CDO definitely does, so it would be better to loop over the files first in bash:

for file in NB_all_var_surface_*.nc ; do
    # strip the trailing ".nc" and append _mean.nc for the output name
    cdo fldmean "$file" "${file%???}_mean.nc"
done

and then in your python program just loop over the files:

from glob import glob
from netCDF4 import Dataset
import numpy as np

files = glob("NB_all_var_surface_*mean.nc")
for file in files:
    ds = Dataset(file)
    # [:] reads the values; without it you only get the variable object
    mymean = ds.variables['nsmz'][:]
    ds.close()

    # ...append to numpy array etc etc

The second option is to use the Python module xarray, which is written specifically with netCDF data structures in mind. Here is a link to the manual describing mathematical operations: http://xarray.pydata.org/en/stable/computation.html
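As a rough sketch of the xarray route, the example below computes an area-weighted mean with `DataArray.weighted`. It runs on a synthetic in-memory dataset so that it is self-contained. The dimension names `lat` and `lon`, the regular grid, and the cos(latitude) weights are all assumptions; the real files may use different coordinate names or a curvilinear grid, in which case the weights would need to come from the actual cell areas.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for one file; in practice something like
# xr.open_dataset("NB_all_var_surface_04750.nc") would be used instead.
lat = np.arange(-85.0, 90.0, 10.0)
lon = np.arange(5.0, 360.0, 10.0)
data = np.abs(lat)[:, np.newaxis] * np.ones((lat.size, lon.size))
ds = xr.Dataset(
    {"nsmz": (("lat", "lon"), data)},
    coords={"lat": lat, "lon": lon},
)

# Assumption: cos(latitude) approximates relative cell area on this grid
weights = np.cos(np.deg2rad(ds["lat"]))
weights.name = "weights"

# Weighted mean over both horizontal dimensions, converted to a plain float
nsmz_mean = float(ds["nsmz"].weighted(weights).mean(dim=("lat", "lon")))
```

Looping this over the files and appending `nsmz_mean` to a list gives one value per file without going through the terminal at all.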

ClimateUnboxed
  • Thank you for your correction, Adrian! I had seen xarray touted around the netCDF communities, but never bothered to look at it, to my disadvantage. I'm using it now to extract the values I need. I deleted my erroneous answer to avoid misleading anyone. I was unaware of the distortion issue because my gridded area is very small relative to the globe, so I was getting the same mean as the one extracted by ncap2, as noted in the now-deleted answer. I think this would change if my total grid expanse increased. – Lindsay Veazey Oct 16 '19 at 23:21
  • Hi Lindsay, yes that's right, if you are using a very small area the error is limited, and if you are on a lat-lon grid in the tropics it is also not overly incorrect. :-) I hope my answer was helpful – ClimateUnboxed Oct 17 '19 at 06:39