
I noticed a strange behaviour in pandas that leads to an unexpected failure to add time offsets in some cases.

Suppose I have the following dataframe:

import pandas as pd

df = pd.DataFrame({'time': ['2022-01-24', '2022-02-24', '2022-03-24'],
                   'value': [10, 20, 30]})

I can successfully add a time offset to it using this syntax:

df.set_index(['time'], inplace=True)
df.index = pd.to_datetime(df.index, format='%Y-%m-%d')
df.index = df.index + pd.offsets.DateOffset(years=100)

But it fails when I want to add the offset to only a subset of the dataframe, e.g. only to dates after 2022-02-25:

df.set_index(['time'], inplace=True)
df.index = pd.to_datetime(df.index, format='%Y-%m-%d')
df[df.index>pd.to_datetime('2022-02-25')].index = df[df.index>pd.to_datetime('2022-02-25')].index + pd.offsets.DateOffset(years=100)

The second code snippet leaves the time index of df unchanged. Why does nothing change when I add the offset only to a slice? And how do I do it successfully? Thanks

NeStack

1 Answer


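You can try setting the whole index to the new values instead of only part of it. Note that this keeps the dates aligned with their rows only if the index is sorted, so that the shifted dates belong to the last rows: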

mask = df.index > pd.to_datetime("2022-02-25")

df.index = (
    *df[~mask].index,
    *(df[mask].index + pd.offsets.DateOffset(years=100)),
)

print(df)

Prints:

            value
2022-01-24     10
2022-02-24     20
2122-03-24     30
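
If the index is not sorted, a variant that keeps every row in place may be easier to reason about. The following is a minimal sketch, assuming `Index.where` is acceptable here (it is not what the answer above uses), run on a deliberately unsorted copy of the example data:

```python
import pandas as pd

# Same data as in the question, but deliberately out of order.
df = pd.DataFrame({'time': ['2022-03-24', '2022-01-24', '2022-02-24'],
                   'value': [30, 10, 20]})
df = df.set_index('time')
df.index = pd.to_datetime(df.index, format='%Y-%m-%d')

mask = df.index > pd.to_datetime('2022-02-25')
shifted = df.index + pd.offsets.DateOffset(years=100)

# Index.where keeps the original value where the condition is True
# and takes the value from `shifted` elsewhere, so row order is preserved.
df.index = df.index.where(~mask, shifted)

print(df)
```

Because the replacement is done position by position, each value stays attached to its own date regardless of how the rows are ordered.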
Andrej Kesely
  • Thank you, this did the trick! But can you explain in your answer as well why my code snippet didn't work? And what are you doing differently to make it work? – NeStack Aug 07 '23 at 19:25
  • @NeStack I guess you're setting the index only on a copy of the original data, so the new values aren't propagated to the original dataframe. – Andrej Kesely Aug 07 '23 at 19:40
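
The guess in the comment above can be checked with a small experiment. This is only a sketch, and the names `subset` and `mask` are introduced here for illustration: boolean indexing with `df[mask]` returns a new DataFrame object, so assigning to its `.index` changes that new object and leaves `df` alone.

```python
import pandas as pd

df = pd.DataFrame({'time': ['2022-01-24', '2022-02-24', '2022-03-24'],
                   'value': [10, 20, 30]})
df = df.set_index('time')
df.index = pd.to_datetime(df.index, format='%Y-%m-%d')

mask = df.index > pd.to_datetime('2022-02-25')

# df[mask] builds a separate DataFrame; keep a reference to it so it can be inspected.
subset = df[mask]
print(subset is df)

# This is what the question's one-liner does, except that there the
# new object is discarded immediately after the assignment.
subset.index = subset.index + pd.offsets.DateOffset(years=100)

print(subset.index)  # the shifted date lives only on the temporary object
print(df.index)      # the original index is untouched
```

In the question's one-liner the temporary object is never stored anywhere, which is why the change appears to vanish.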