I have data recorded automatically every minute. The data looks like this

enter image description here

The plot that I am trying to have should be like this.

enter image description here

I am trying to average each variable over all days so that the result has 1440 (60 × 24) time steps, one per minute of the day. That way, I will be able to plot the diurnal cycle of each variable.

import pandas as pd
import numpy as np

xdata = pd.read_csv("station_2019_2.csv")
xtime = pd.date_range("2019-10-01", "2019-11-01", freq="min")
ydata = xdata.drop(columns=["Date", "Time"])

df = pd.DataFrame(ydata)
df["Date"] = xtime[1:]
df.index = pd.to_datetime(df.index)

mHI = df.resample('1Min')['Hi'].mean()
print(np.shape(mHI))

Unfortunately, this is not working. Any help will be appreciated.

regards

karel

1 Answer

When you read the csv file into a dataframe, pandas generates a default index consisting of consecutive integers. That's why your code

df["Date"] = xtime[1:]
df.index = pd.to_datetime(df.index)

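isn't working. The timestamps go into a regular column, while the index still holds the integers. pd.to_datetime interprets those integers as nanoseconds since 1970-01-01, so all rows fall within the same minute and the resampling collapses them into a single bin. Instead, assign the timestamps to the index directly: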

df.index = xtime[1:]
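
To see the difference, here is a minimal sketch on synthetic data. The two-day range and the generated "Hi" values are assumptions standing in for the real station file, which I don't have.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the station file: two days of one-minute readings.
xtime = pd.date_range("2019-10-01", "2019-10-03", freq="min")
df = pd.DataFrame({"Hi": np.arange(len(xtime) - 1, dtype=float)})

# Integers are read as nanoseconds since 1970-01-01, so all rows share one minute.
as_epoch = pd.to_datetime(df.index)
n_bins_wrong = len(df.set_index(as_epoch).resample("1min")["Hi"].mean())

# Assigning the timestamps directly gives one bin per recorded minute.
df.index = xtime[1:]
n_bins_right = len(df.resample("1min")["Hi"].mean())

print(n_bins_wrong, n_bins_right)
```

The first count is the single bin you get from the converted integer index; the second matches the number of rows.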

However, if I understand you correctly, no resampling is necessary, because your data already has a one-minute frequency. So you may not even need the datetime index.

To average the values for each minute of the day over all days, keep the Time column instead of dropping it, so that you can group by it:

df_average_day = df.groupby('Time').mean()
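
A small self-contained sketch of that grouping, again on made-up data. I am assuming here that "Time" holds strings like "HH:MM" and that "Date" is a text column; adjust to whatever your file actually contains.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the station file: three days of one-minute readings.
stamps = pd.date_range("2019-10-01", periods=3 * 1440, freq="min")
df = pd.DataFrame({
    "Date": stamps.strftime("%Y-%m-%d"),
    "Time": stamps.strftime("%H:%M"),
    "Hi": np.tile(np.arange(1440, dtype=float), 3),  # same profile each day
})

# Average every variable over all days, separately for each minute of the day.
# numeric_only=True skips the text "Date" column.
df_average_day = df.groupby("Time").mean(numeric_only=True)

print(df_average_day.shape)
```

The result has one row per minute of the day, so calling df_average_day.plot() on it should draw the diurnal cycle of each variable.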
Arne