I am trying to download global ERA5 reanalysis data from Python code, but it says the request is too large

Question

I want to download global hourly ERA5 reanalysis data (from 1950 to 2022). I am trying to download it from Python code using cdsapi.

For now I am only trying to download 5 years (3 months in each year), but the request fails because it is too large.

This is my code:

import cdsapi

c = cdsapi.Client()

c.retrieve(
    'reanalysis-era5-pressure-levels',
    {
        'product_type': 'reanalysis',
        'variable': [
            'specific_humidity', 'temperature', 'u_component_of_wind',
            'v_component_of_wind',
        ],
        'pressure_level': '500',
        'year': '1979',
        'month': [
            '01', '02', '12',
        ],
        'day': [
            '01', '02', '03',
            '04', '05', '06',
            '07', '08', '09',
            '10', '11', '12',
            '13', '14', '15',
            '16', '17', '18',
            '19', '20', '21',
            '22', '23', '24',
            '25', '26', '27',
            '28', '29', '30',
            '31',
        ],
        'time': [
            '00:00', '01:00', '02:00',
            '03:00', '04:00', '05:00',
            '06:00', '07:00', '08:00',
            '09:00', '10:00', '11:00',
            '12:00', '13:00', '14:00',
            '15:00', '16:00', '17:00',
            '18:00', '19:00', '20:00',
            '21:00', '22:00', '23:00',
        ],
        'format': 'netcdf',
    },
    'era1979.nc')

ClimateUnboxed · Answer 1 · 2023-01-16T14:19:03.670

You need to break up the retrieval into smaller chunks, for example by looping over the months and writing one file per month. I have a YouTube video on my channel on precisely this topic that shows how to do this.

So for your example (if this is still not enough, you will need to do the same over the days):

import cdsapi

c = cdsapi.Client()
for year in [str(y) for y in range(1979,1985)]:
    for mon in ['01', '02', '12']:
        # one request and one output file per (year, month) pair
        c.retrieve(
            'reanalysis-era5-pressure-levels',
            {
                'product_type': 'reanalysis',
                'variable': [
                    'specific_humidity', 'temperature', 'u_component_of_wind',
                    'v_component_of_wind',
                ],
                'pressure_level': '500',
                'year': year,
                'month': mon,
                'day': [
                    '01', '02', '03',
                    '04', '05', '06',
                    '07', '08', '09',
                    '10', '11', '12',
                    '13', '14', '15',
                    '16', '17', '18',
                    '19', '20', '21',
                    '22', '23', '24',
                    '25', '26', '27',
                    '28', '29', '30',
                    '31',
                ],
                'time': [
                    '00:00', '01:00', '02:00',
                    '03:00', '04:00', '05:00',
                    '06:00', '07:00', '08:00',
                    '09:00', '10:00', '11:00',
                    '12:00', '13:00', '14:00',
                    '15:00', '16:00', '17:00',
                    '18:00', '19:00', '20:00',
                    '21:00', '22:00', '23:00',
                ],
                'format': 'netcdf',
            },
            'era_'+year+'_'+mon+'.nc')
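
If a single month is still rejected as too large, the same idea extends to one request per day. The following is only a minimal sketch, assuming the same request fields as in the loop above. `daily_chunks`, `build_request` and `download_daily` are hypothetical helper names, not part of cdsapi, and the standard-library `calendar.monthrange` is used so that only days that actually exist in each month are requested.

```python
import calendar

VARIABLES = [
    'specific_humidity', 'temperature',
    'u_component_of_wind', 'v_component_of_wind',
]
HOURS = ['%02d:00' % h for h in range(24)]  # '00:00' ... '23:00'


def daily_chunks(years, months):
    """Yield (year, month, day) strings for every real calendar day."""
    for year in years:
        for month in months:
            # monthrange returns (weekday of first day, number of days)
            ndays = calendar.monthrange(year, month)[1]
            for day in range(1, ndays + 1):
                yield str(year), '%02d' % month, '%02d' % day


def build_request(year, month, day):
    """Request dictionary for one day, same fields as the monthly loop."""
    return {
        'product_type': 'reanalysis',
        'variable': VARIABLES,
        'pressure_level': '500',
        'year': year,
        'month': month,
        'day': day,
        'time': HOURS,
        'format': 'netcdf',
    }


def download_daily(years, months):
    """Send one retrieve call per day and write one file per day."""
    import cdsapi  # imported here so the helpers above work without it
    c = cdsapi.Client()
    for year, month, day in daily_chunks(years, months):
        target = 'era_' + year + '_' + month + '_' + day + '.nc'
        c.retrieve('reanalysis-era5-pressure-levels',
                   build_request(year, month, day), target)


# Example call (years and months are given as integers here):
# download_daily(range(1979, 1985), [1, 2, 12])
```

This produces many small files, so it is probably only worth doing if the monthly requests really do fail.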

Another way to write the loops is to use range directly, in which case you need to convert the integer to a string whenever you use it:

for year in range(1979,1985):
    file='era_'+str(year)+'.nc'

etc
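
Spelled out in full, the integer version might look like the sketch below. `monthly_jobs` is a hypothetical helper name; it only builds the strings that each request needs, so it can be checked without contacting the server. Months are given as integers and zero-padded with `'%02d'` so that they match the '01', '02', '12' strings used earlier.

```python
def monthly_jobs(first_year, last_year, months):
    """Yield (year, month, filename) strings, one per monthly request."""
    for year in range(first_year, last_year + 1):
        for mon in months:
            year_str = str(year)      # the request dictionary wants strings
            mon_str = '%02d' % mon    # 1 -> '01', 12 -> '12'
            yield year_str, mon_str, 'era_' + year_str + '_' + mon_str + '.nc'


# Preview the requests that would be made, without downloading anything.
for year, mon, target in monthly_jobs(1979, 1984, [1, 2, 12]):
    print(year, mon, target)
    # In the real loop, pass year and mon as 'year' and 'month' in the
    # request dictionary and target as the last argument of c.retrieve.
```

Printing the list first is a cheap way to confirm the years, months and file names before queuing any requests.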
