I have a DataFrame df with two columns, x and y, which I would like to plot as a line plot as follows:
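import matplotlib.pyplot as plt
import seaborn as sns

# x and y hold the names of the two columns of df
fig = plt.figure(figsize=(9, 7))
ax = plt.subplot(111)
# Average y over duplicate x-values, then order the rows by x
df = df.groupby(x, as_index=False).mean()
df = df.sort_values(x)
# Smooth y with a fixed window of 1000 rows; the first 999 rows become NaN and are dropped
df[y] = df[y].rolling(1000).mean()
df = df.dropna()
sns.lineplot(data=df, x=x, y=y)
plt.tight_layout()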
The resulting plot looks as follows:
As can be seen, there are many more data points at low x-values, and the number of data points keeps dropping as x increases. A rolling average with a fixed window size of 1000 therefore averages over too many data points at large x-values and too few at small x-values.
Is there a way to make the rolling-average window shrink as x increases, or adapt to the number of data points? Or is there a better approach than a rolling average for this kind of data?
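To make concrete what I mean by an adaptive window, here is a rough sketch of one possible direction, not a definitive solution: the window is defined as a fixed width in x-units instead of a fixed number of rows, so the number of averaged points follows the local data density. The function name rolling_mean_by_x_width, the width parameter, and the added columns (the y name with a "_smooth" suffix, and n_in_window) are hypothetical names made up for illustration. The sketch assumes x is numeric, the y column name is a string, and y contains no NaN values.

```python
import numpy as np
import pandas as pd


def rolling_mean_by_x_width(df, x, y, width):
    """Average y over all rows whose x lies within width / 2 of each row's x.

    Hypothetical helper: `width` is measured in x-units, not in rows.
    """
    out = df.sort_values(x).reset_index(drop=True)
    xs = out[x].to_numpy(dtype=float)
    ys = out[y].to_numpy(dtype=float)

    # For every row, find the index range [lo, hi) of rows inside
    # the interval [x_i - width / 2, x_i + width / 2].
    lo = np.searchsorted(xs, xs - width / 2, side="left")
    hi = np.searchsorted(xs, xs + width / 2, side="right")

    # Prefix sums turn each window sum into a difference of two entries.
    csum = np.concatenate(([0.0], np.cumsum(ys)))
    counts = hi - lo  # always >= 1, because each row lies in its own window

    out[f"{y}_smooth"] = (csum[hi] - csum[lo]) / counts
    out["n_in_window"] = counts
    return out
```

With this, the smoothed column could be plotted with sns.lineplot(data=smoothed, x=x, y=f"{y}_smooth"), where smoothed is the returned DataFrame. The n_in_window column shows how many points went into each average, which would make it easy to see where the curve is based on very little data. The width itself would still have to be chosen by hand for the data at hand.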