I am running into an issue. I have multiple hourly GRIB files spanning multiple years, covering about 180 days each, and it takes roughly 3.5 seconds to load a single file into xarray. I tested 9 days' worth:
import dask
import glob
import cfgrib
import numpy as np
import xarray as xr
import pandas as pd
import netCDF4 as nc
from scipy import spatial
from datetime import datetime
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import matplotlib.ticker as ticker
from dask.distributed import Client
import matplotlib.patheffects as path_effects
client = Client()  # start a local dask cluster (scheduler + workers)
client  # display the cluster summary in the notebook
filenames = glob.glob("/2017100*/*")  # the 9-day test subset of hourly files
dsmerged = xr.open_mfdataset(
    filenames,
    # compat='override',
    combine="nested",
    concat_dim="valid_time",
    engine="cfgrib",
)
CPU times: user 12min 34s, sys: 1min 15s, total: 13min 50s
Wall time: 12min 37s

Load time is the issue. I am also looking into resources for converting GRIB to netCDF, such as https://confluence.ecmwf.int/display/OIFS/How+to+convert+GRIB+to+netCDF. However, is there a Python method to load the GRIB files directly, but faster? If not, is there a method that converts cleanly to netCDF in Python? I could end up using NCL, CDO, and NCO, but I would rather tease out a Python method first. I am currently looking around Stack Overflow and many other resources as well.
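One direction I am considering for the direct load (a sketch, not something I have benchmarked yet): have open_mfdataset open the files in parallel across the dask workers, persist cfgrib's .idx index files so repeat opens skip the initial scan, and filter to a single variable set so cfgrib decodes less. The indexpath template and filter_by_keys are cfgrib's documented options; the "typeOfLevel" value below is a placeholder, not my actual field selection.

dsmerged = xr.open_mfdataset(
    filenames,
    engine="cfgrib",
    combine="nested",
    concat_dim="valid_time",
    parallel=True,  # open each file as a dask task instead of serially
    backend_kwargs={
        "indexpath": "{path}.idx",  # reuse cfgrib index files across runs
        "filter_by_keys": {"typeOfLevel": "surface"},  # placeholder filter
    },
)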
Edit
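To make the conversion route concrete, this is roughly the one-time Python conversion I have in mind (a minimal sketch; the sidecar .nc naming is just illustrative):

# One-time conversion: read each GRIB with cfgrib, write a netCDF copy,
# then point all later analysis at the netCDF files, which open quickly.
for f in filenames:
    ds = xr.open_dataset(f, engine="cfgrib")
    ds.to_netcdf(f + ".nc")  # illustrative naming: .nc file next to each GRIB
    ds.close()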
I may do something in bash instead along the lines of:
for file in *.grib
do
    # convert GRIB to netCDF with NCL's converter
    ncl_convert2nc "$file"
    # then subset to a lat/lon box with NCO (bounds are placeholders)
    ncks -d lat,<min>,<max> -d lon,<min>,<max> "${file%.grib}.nc" "${file%.grib}_subset.nc"
done
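The pure-Python equivalent of that loop, if I can make it fast enough, would be something like this (a sketch; the lat/lon bounds are placeholders, and the slice order has to match the data's latitude ordering):

for f in glob.glob("*.grib"):
    ds = xr.open_dataset(f, engine="cfgrib")
    # subset to a lat/lon box before writing; bounds below are placeholders
    box = ds.sel(latitude=slice(50, 20), longitude=slice(230, 300))
    box.to_netcdf(f.replace(".grib", ".nc"))
    ds.close()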