0

I have a dataframe with date, previous_day, and price.

DATE       | prev_day |  price
01-01-2018 |    1     |  44
02-01-2018 |    1     |  45
02-01-2018 |    2     |  41
03-01-2018 |    1     |  54
03-01-2018 |    2     |  49
03-01-2018 |    3     |  46

I would like to create the next day and prior day using the next step:

DATE       | prev_day | price | next_day | prior_day
01-01-2018 |    1     |  44   |          |    
02-01-2018 |    1     |  45   |          |    
02-01-2018 |    2     |  41   |          |    
03-01-2018 |    1     |  54   |          |    
03-01-2018 |    2     |  49   |          |
03-01-2018 |    3     |  46   |          |

Where the logic consists of each next_day equals the price with DATE + 1 day and prev_day + 1 day, and prior_day equals the price with DATE - 1 day and prev_day - 1 day.

1 Answers1

2

IIUC, you can groupby "prev_day" and shift:

df['prior_day'] = df.groupby('prev_day')['price'].shift()
df['next_day'] = df.groupby('prev_day')['price'].shift(-1)

output:

         DATE  prev_day  price  prior_day  next_day
0  01-01-2018         1     44        NaN      45.0
1  02-01-2018         1     45       44.0      54.0
2  02-01-2018         2     41        NaN      49.0
3  03-01-2018         1     54       45.0       NaN
4  03-01-2018         2     49       41.0       NaN
5  03-01-2018         3     46        NaN       NaN
mozway
  • 194,879
  • 13
  • 39
  • 75
  • 2
    @MuriloBarbosa But does the solution really do what you requested? According to your instructions, for the row with `DATE==02-01-2018` and `prev_day==2`, `prior_day` should match the price of `DATE-1` (01-01-2018) AND `prev_day-1`=1 which should match the price=44. However, this solution returns NaN! – normanius Sep 08 '21 at 12:58
  • @Normanius as DATE == 02-01-2018 and prev_day==2 does not exist in the dataframe, the output will actually be NaN... just like in the case of DATE == 03-01-2018 and prev_day == 3 does not exist in the dataframe, the output will really be NaN – Murilo Barbosa Sep 08 '21 at 13:26
  • 1
    @MuriloBarbosa I'm confused. The row DATE = 02-01-2018 and prev_day=2 DOES exist in above data frame! And also the row with DATE-1 AND prev_day-1 (01-01-2018, 1) as well as DATE+1 AND prev_day+1 (03-01-2018, 3) both exist in the above data frame. The instructions just don't make sense to me. Anyhow, +1 for this answer, despite all lack of clarity. I'm muted. – normanius Sep 08 '21 at 13:36
  • to get the price value of DATE = 02-01-2018 and prev_day=2 is calculated as follows: DATE - prev_day where the result is 31-01-2018 and this value does not exist in the dataframe, so it receives the value NaN. Was the explanation clearer? – Murilo Barbosa Sep 08 '21 at 14:01
  • @MuriloBarbosa Okay, now it's clearer. But compare your explanation with your original question above. It's not the same. Plus: The accepted answer here does not compute what you think it does. Try with different data... – normanius Sep 08 '21 at 22:30