1

Pandas df 'inndata' with weekly interval data:

Date Item X Y Z
12-21-2021 aa -100 50 -50
bb 100 100 200
cc 300 -50 250
12-28-2021 aa 75 -50 25
bb -75 -50 -125
cc 50 100 150
... ... ... ... ...
12-20-2022 aa 200 55 255
bb -25 -50 -75
cc -100 100 0

Using pandas, how do I select rows at chosen intervals, for example 1 week and/or 1 year, while leaving out others, and report the data from column Z, for a result resembling the following table with historical data going back in time?

Item 1W 1YR
aa -50 255
bb 200 -75
cc 250 0

Unsuccessful attempt:

inndata['Z'] = pd.DataFrame({
              '1W': df.iloc[0,:]
             ,'1YR': df.iloc[-52,:]
            }).T
SJ1
  • 15
  • 4

1 Answers1

1

It's easiest using the own pandas tools. Assuming that inndata have a Pandas MultiIndex with Date and Item.

First, create a filter array, having most_recent_date obtained from Date values:

>>> columns = ['1W', '1YR']
>>> weeks = [1, 52]
>>> filter = [most_recent_date - pd.Timedelta(weeks=w) for w in weeks]

Then we can apply the filter to the level of Date and get the values.

>>> idx_level = inndata.index.get_level_values('Date')
>>> df = inndata.loc[idx_level.isin(filter)]

Now, sort the MultiIndex of df with Date descending,

>>> df.sort_index(level=['Date', 'Item'], ascending=[False, True], inplace=True)

replace it with columns names that we want, using a dict

>>> d = dict(zip(filter, columns))
>>> idx = df.index.levels[0].to_series().replace(d)
>>> df.index = df.index.set_levels(idx, level='Date')

and unstack the level Date. This convert it to columns

>>> df = df.unstack(level='Date')
>>> df.columns.rename(None, inplace=True)

Finally, we get the DataFrame with the resumed information.

Item 1W 1YR
aa -50 255
bb 200 -75
cc 250 0
acardozos
  • 51
  • 3