I am running into an issue. I have multiple hourly GRIB files spanning multiple years, covering about 180 days each, and it takes roughly 3.5 seconds to load a single file into xarray. I tested 9 days' worth:
import dask
import glob
import cfgrib
import numpy as np
import xarray as xr
import pandas as pd
import netCDF4 as nc
from scipy import spatial
from datetime import datetime
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import matplotlib.ticker as ticker
from dask.distributed import Client
import matplotlib.patheffects as path_effects
client = Client()  # start a local dask cluster (scheduler + workers)
client  # display the cluster summary in the notebook
filenames = glob.glob("/2017100*/*")  # the 9-day test subset of hourly files
dsmerged = xr.open_mfdataset(
    filenames,
    # compat='override',
    combine="nested",
    concat_dim="valid_time",
    engine="cfgrib",
)
CPU times: user 12min 34s, sys: 1min 15s, total: 13min 50s
Wall time: 12min 37s

Load time is the issue. I am also looking into resources for converting GRIB to netCDF, such as https://confluence.ecmwf.int/display/OIFS/How+to+convert+GRIB+to+netCDF. However, is there a Python method to load the GRIB files directly, but faster? If not, is there a method that converts cleanly to netCDF in Python? I could end up using NCL, CDO, and NCO, but I would rather tease out a Python method first. I am currently looking around Stack Overflow and many other resources as well.
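One direction I am considering for the direct load (a sketch, not something I have benchmarked yet): have open_mfdataset open the files in parallel across the dask workers, persist cfgrib's .idx index files so repeat opens skip the initial scan, and filter to a single variable set so cfgrib decodes less. The indexpath template and filter_by_keys are cfgrib's documented options; the "typeOfLevel" value below is a placeholder, not my actual field selection.

dsmerged = xr.open_mfdataset(
    filenames,
    engine="cfgrib",
    combine="nested",
    concat_dim="valid_time",
    parallel=True,  # open each file as a dask task instead of serially
    backend_kwargs={
        "indexpath": "{path}.idx",  # reuse cfgrib index files across runs
        "filter_by_keys": {"typeOfLevel": "surface"},  # placeholder filter
    },
)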
Edit
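To make the conversion route concrete, this is roughly the one-time Python conversion I have in mind (a minimal sketch; the sidecar .nc naming is just illustrative):

# One-time conversion: read each GRIB with cfgrib, write a netCDF copy,
# then point all later analysis at the netCDF files, which open quickly.
for f in filenames:
    ds = xr.open_dataset(f, engine="cfgrib")
    ds.to_netcdf(f + ".nc")  # illustrative naming: .nc file next to each GRIB
    ds.close()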
I may do something in bash instead along the lines of:
for file in *.grib
do
    # convert GRIB to netCDF with NCL's converter
    ncl_convert2nc "$file"
    # then subset to a lat/lon box with NCO (bounds are placeholders)
    ncks -d lat,<min>,<max> -d lon,<min>,<max> "${file%.grib}.nc" "${file%.grib}_subset.nc"
done
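The pure-Python equivalent of that loop, if I can make it fast enough, would be something like this (a sketch; the lat/lon bounds are placeholders, and the slice order has to match the data's latitude ordering):

for f in glob.glob("*.grib"):
    ds = xr.open_dataset(f, engine="cfgrib")
    # subset to a lat/lon box before writing; bounds below are placeholders
    box = ds.sel(latitude=slice(50, 20), longitude=slice(230, 300))
    box.to_netcdf(f.replace(".grib", ".nc"))
    ds.close()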