1

Imagine the ohlc data indexed in DateTime. I am going to resample this dataframe on every nth day of the month.

for example:

.
.
.
2020-09-24  1.0990  1.1000  1.0982  1.0991
2020-09-25  1.1018  1.1025  1.0964  1.0995
2020-09-26  1.1011  1.1020  1.1009  1.1018
.
.
.
2020-10-24  1.1045  1.1068  1.0995  1.1017
2020-10-25  1.1031  1.1074  1.1021  1.1045
2020-10-26  1.1071  1.1076  1.1012  1.1031
.
.
.
2020-11-23  1.1005  1.1075  1.0989  1.1071
2020-11-26  1.1079  1.1086  1.0992  1.1005
2020-11-27  1.1076  1.1087  1.1068  1.1079
.
.
.
2020-12-24  1.1058  1.1110  1.1054  1.1071
2020-12-25  1.1010  1.1087  1.0926  1.1058
2020-12-26  1.1049  1.1056  1.0983  1.1010
.
.
.
2021-01-24  1.1049  1.1059  1.1029  1.1048
2021-01-25  1.1025  1.1068  1.1014  1.1049
2021-01-26  1.1025  1.1028  1.1022  1.1025

what I need is :

2020-09-25  1.1018  1.1025  1.0964  1.0995
2020-10-25  1.1031  1.1074  1.1021  1.1045
2020-11-25  1.1005  1.1075  1.0989  1.1071
2020-12-25  1.1010  1.1087  1.0926  1.1058
2021-01-25  1.1025  1.1068  1.1014  1.1049

In fact, I need to resample on every 25th day of the month and if there were no data for it, it must be filled with the nearest previous data.

2 Answers2

0
df[df.groupby(df['Date']+df['Date'].apply(lambda x: pd.DateOffset(days=25-x.day) if x.day<=25 else pd.DateOffset(days=25-x.day,months=1)))['Date'].transform(max)==df['Date']]
Bing Wang
  • 1,548
  • 1
  • 8
  • 7
0

Simplest solution would be as follows,

#Sort the dataframe
df = df.sort_values('date')

#Use ffill to fill nearst previous value (by timestamp) for Null elements
df = df.ffill(axis=0)

#Simpelly query by day
df.loc[df['date'].dt.day == float('25')]

Output:

date    a   b   c   d
1   2020-09-25  1.1018  1.1025  1.0964  1.0995
4   2020-10-25  1.1031  1.1074  1.1021  1.1045
10  2020-12-25  1.1010  1.1087  1.0926  1.1058
13  2021-01-25  1.1025  1.1068  1.1014  1.1049
k33da_the_bug
  • 812
  • 8
  • 16
  • The challenge in this question is not provided data. I mentioned above that "I need to resample on every 25th day of the month and if there were no data for it, it must be filled with the nearest previous data". in some cases there is no data but it must show in resampling by back fill. – Keyvan Iraji Jan 31 '21 at 11:50
  • @KeyvanIraji Have you tried running my solution ? if you want to back fill it just replace `ffill` with `bfill` it will do the job. – k33da_the_bug Jan 31 '21 at 12:37
  • Also I wanted to know if there is no 25th day present for certain timestamp do you want 24th day right ? or just want to fill 24th day's data into 25th and just display that ? – k33da_the_bug Jan 31 '21 at 12:44
  • Thank you for your response. Yes, I need 24th or 23th data for 25th if not found. like resampling function of dataframe for a month (resample('M').last()) do. – Keyvan Iraji Jan 31 '21 at 19:43