I am very new to Python, so please bear with me.

So far I have a loop that identifies my netCDF files within a date range. I now need to calculate the daily averages, then the monthly average for each month, and add them to my dataframe so I can plot a time series.

Here's my code so far:

#!/usr/bin/python
# MODIS LONDON TIMESERIES

print ('Initiating AQUA MODIS Time Series')

import pandas as pd              
import xarray as xr
from glob import glob
import netCDF4 
import numpy as np
import matplotlib.pyplot as plt
import os

print ('All packages imported')

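# dates to process, written in ISO format to avoid day/month ambiguity
# (pandas reads '4/7/2002' as April 7, not 4 July)
days = pd.date_range(start='2002-07-04', end='2002-12-31')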

#create dataframe
df = final_data = pd.DataFrame(index = days, columns = ['day', 'night'])

print ('Data frame created')

# loop over each date in 'days' and build the file path with string formatting

for day in days:
    path = "%i/%02d/%02d/ESACCI-LST-L3C-LST-MODISA-LONDON_0.01deg_1DAILY_DAY-%i%02d%02d000000-fv2.00.nc" % (day.year, day.month, day.day, day.year, day.month, day.day)
    print(path)
  • Look for `resample` method in pandas and/or xarray documentation, it will do the job for you. – kakk11 Jan 10 '22 at 11:08
  • Welcome to SO. Please consider revising your question so that there is a minimum reproducible example. None of your code appears relevant to your question. – Robert Wilson Jan 10 '22 at 12:43

1 Answer

Welcome to SO! As suggested, please try to make a minimal reproducible example.

If you are able to create an Xarray dataset, here is how to take monthly averages:

import xarray as xr

# tutorial dataset with air temperature every 6 hours
ds = xr.tutorial.open_dataset('air_temperature')

# resamples along the time dimension
ds_monthly = ds.resample(time='1MS').mean()

resample() is used for upsampling and downsampling the temporal resolution. If you are familiar with Pandas, it effectively works the same way.
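For comparison, here is a minimal pandas sketch of the daily-then-monthly averaging asked about in the question. The 6-hourly series and its values are made up for illustration and are not taken from the MODIS files.

```python
import numpy as np
import pandas as pd

# Hypothetical 6-hourly series: 4 readings per day for 60 days (values 0..239)
times = pd.date_range("2002-07-04", periods=240, freq=pd.Timedelta(hours=6))
series = pd.Series(np.arange(240, dtype=float), index=times, name="lst")

# Step 1: average the readings within each calendar day
daily = series.resample("D").mean()

# Step 2: average the daily means within each month, labelled at month start
monthly = daily.resample("MS").mean()

print(daily.head())
print(monthly)
```

Averaging the daily means, rather than the raw readings, gives every day equal weight even if some days have fewer readings.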

resample(time='1MS') groups the data along the time dimension at the frequency 1MS. The 1M part means one bin per month, and the S part means each bin is labelled with the start of the month. You can supply many other frequencies; see the offset aliases in the Pandas documentation.
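To see how the frequency string changes the result, here is a small sketch on a synthetic DataArray. The array shape, the dimension names lat and lon, and the values are assumptions made for illustration; only the time dimension matters here.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily data on a 2x2 grid, 1 July to 30 September 2002 (92 days).
# Every grid cell holds the day counter 0..91.
times = pd.date_range("2002-07-01", periods=92, freq="D")
values = np.arange(92, dtype=float)[:, None, None] * np.ones((92, 2, 2))
da = xr.DataArray(values, coords={"time": times}, dims=("time", "lat", "lon"))

# "MS": one bin per month, labelled with the first day of the month
monthly = da.resample(time="MS").mean()

# "QS": one bin per quarter, labelled with the first day of the quarter
quarterly = da.resample(time="QS").mean()

print(monthly.time.values)    # month-start labels
print(quarterly.time.values)  # quarter-start label
```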

.mean() takes the average of the data over our desired frequency. In this case, each month.

You could replace mean() with min(), max(), median(), std(), var(), sum(), and maybe a few others.
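As a rough illustration of swapping the reduction, the pandas sketch below computes several statistics per month in one call with agg(). The series is invented for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series for July and August 2002 (values 0..61)
days = pd.date_range("2002-07-01", "2002-08-31", freq="D")
lst = pd.Series(np.arange(len(days), dtype=float), index=days)

# One row per month, one column per statistic
monthly_stats = lst.resample("MS").agg(["mean", "min", "max"])

print(monthly_stats)
```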

Xarray has wonderful documentation; see the resample() entry in its API reference.

Luke