I have a time series of rain intensity (in µm/s) that I resample to 1-minute intervals. The data already has a 1-minute time step, but there can be gaps caused by quality checks or basic equipment failure. Resampling gives me a consistent, equidistant time series to loop over, which is the fastest approach I have found so far.
The problem is that, in principle, I can choose a different time step for the calculation, say 5 minutes. I found that this gives larger dimensions for a rainwater basin, which seemed odd to me. I traced it to the resampled series systematically summing to higher values, i.e. more precipitation -> larger basin.
Why does resample give this odd result? Is it because the same 1-minute values can be counted in more than one resampled time step?
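The double-counting idea can be checked directly on a small synthetic series. This is only a sketch: the timestamps and values are made up, and `counts_5min`, `total_binned` and `total_valid` are names chosen for illustration. It compares the number of valid samples before resampling with the number of samples that end up in the 5-minute bins.

```python
import numpy as np
import pandas as pd

# Synthetic 1-minute intensities in µm/s with two missing minutes (made-up values).
idx = pd.date_range("2024-01-01 00:00", periods=10, freq="1min")
rain = pd.Series([1.0, 2.0, np.nan, 4.0, 5.0, 6.0, np.nan, 8.0, 9.0, 10.0], index=idx)

# count() gives the number of non-missing samples that fall into each bin.
counts_5min = rain.resample("5min").count()

# If a sample were assigned to more than one bin, the bin counts would add up
# to more than the number of valid samples in the original series.
total_binned = int(counts_5min.sum())
total_valid = int(rain.count())
print(total_binned, total_valid)
```

If the two numbers agree, as they should for non-overlapping bins, double counting is not the cause and the difference has to come from how each bin is aggregated.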
import pandas
import numpy
import datetime
import matplotlib
from matplotlib import pyplot as plt
data1 = pandas.read_csv("rain_1min.txt", sep=";", parse_dates=["time"], index_col="time")
test = list(range(1,121))
sums = []
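for timestep in test:
    # Mean intensity per bin; bins without any data are NaN and are set to 0.0.
    # fillna() is needed here: replace("nan", 0.0) looks for the string "nan" and leaves NaN untouched.
    data_rs = data1["rain"].resample(f"{timestep}min").mean().fillna(0.0)
    sums.append(numpy.nansum(data_rs))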
fig, ax = plt.subplots(figsize=[8,4], dpi=100)
ax.plot(test, sums)
ax.set_xlabel("Rule = x Min")
ax.set_ylabel("Sum of mean()")
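A possible explanation lies in the gaps rather than in resample itself. The following is a minimal sketch on synthetic data, assuming the quantity of interest is the rain depth (intensity times duration). The values and the helper names `depth_from_mean` and `depth_from_sum` are hypothetical. Since mean() skips missing minutes, a bin that contains a gap gets the mean of the remaining minutes, and multiplying that mean by the full bin length spreads the intensity over the gap as well.

```python
import numpy as np
import pandas as pd

# Synthetic 1-minute intensities in µm/s; three of ten minutes are missing (made-up values).
idx = pd.date_range("2024-01-01 00:00", periods=10, freq="1min")
rain = pd.Series([2.0, np.nan, np.nan, 2.0, 2.0, 4.0, 4.0, np.nan, 4.0, 4.0], index=idx)

def depth_from_mean(series, minutes):
    """Rain depth in µm: bin mean intensity times the full bin length."""
    means = series.resample(f"{minutes}min").mean()  # mean() skips NaN
    return float((means * minutes * 60).sum())       # sum() skips all-NaN bins

def depth_from_sum(series, minutes):
    """Rain depth in µm: only minutes with data contribute, 60 s each."""
    sums = series.resample(f"{minutes}min").sum()    # missing minutes add nothing
    return float((sums * 60).sum())

print(depth_from_mean(rain, 1), depth_from_mean(rain, 5))
print(depth_from_sum(rain, 1), depth_from_sum(rain, 5))
```

In this sketch the mean-based depth grows from the 1-minute to the 5-minute step, while the sum-based depth stays the same. Whether this accounts for the behaviour of the real data depends on how many gaps it contains and on whether missing minutes should count as dry.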