I have a NetCDF file that contains a float array of shape (21600, 43200). The array is too large to read into RAM in full, so I'm using the Dataset object from the netCDF4 library to access it.
I would like to calculate the mean of a subset of this array, selected by two 1D numpy arrays (x_coords, y_coords) of 300-400 coordinates each.
I don't think I can use basic indexing, because the coordinates I have aren't contiguous. What I'm currently doing is just feeding the arrays directly into the variable, like so:
import numpy as np
from netCDF4 import Dataset
ncdf_data = Dataset(file, 'r')
mean = np.mean(ncdf_data.variables['q'][x_coords, y_coords])
The above code takes far too long for my liking (~3-4 seconds, depending on the coordinates I'm using), and I'd like to speed it up. Is there a Pythonic way to compute the mean of such a subset directly, without triggering fancy indexing on the netCDF variable?
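
For reference, a minimal sketch of one possible direction: read the bounding box of the coordinates with a single basic slice, then do the fancy indexing in memory with numpy. The helper name mean_at_points is hypothetical. The sketch assumes x_coords[i] and y_coords[i] form paired points (row, column) and that the points are clustered closely enough for their bounding box to fit in RAM. It may be worth checking whether paired points are what the original call returns, since, as far as I understand, netCDF4 variables apply index arrays independently along each dimension rather than pairing them as numpy does. A plain numpy array stands in for the netCDF variable so the snippet runs on its own.

```python
import numpy as np

def mean_at_points(var, rows, cols):
    """Mean of var[rows[i], cols[i]] using one contiguous read.

    `var` can be any 2D object that supports basic slicing, such as a
    numpy array or (by assumption) a netCDF4 variable.
    """
    rows = np.asarray(rows)
    cols = np.asarray(cols)
    # Bounding box of the requested points (end indices are exclusive).
    r0, r1 = int(rows.min()), int(rows.max()) + 1
    c0, c1 = int(cols.min()), int(cols.max()) + 1
    # Basic slicing: one contiguous block is read into memory.
    block = var[r0:r1, c0:c1]
    # Fancy indexing now happens on the in-memory block, with the
    # coordinates shifted to the block's origin.
    return block[rows - r0, cols - c0].mean()

# Stand-in data instead of a real netCDF variable (assumption for the demo).
rng = np.random.default_rng(0)
data = rng.random((200, 400))
rows = rng.integers(50, 80, size=300)
cols = rng.integers(100, 150, size=300)
result = mean_at_points(data, rows, cols)
```

With the file from the question, the call would presumably be mean_at_points(ncdf_data.variables['q'], x_coords, y_coords). If the points are spread across the whole grid, the bounding box approaches the full array and this approach stops helping.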