I have a pandas dataframe, df:
df = pd.DataFrame({
'sku': ['A', 'A', 'B', 'B', 'C', 'C'],
'date': ['2022-08-29', '2022-08-30', '2022-08-29',
'2022-08-30', '2022-08-29', '2022-08-30'],
'End': [100, 110, 0, 210, 300, 310]}) ```
sku date End
0 A 2022-08-29 100
1 A 2022-08-30 110
2 B 2022-08-29 0
3 B 2022-08-30 210
4 C 2022-08-29 300
5 C 2022-08-30 310
I want today's starting value
to be yesterday's end value
.
So, I want to create a column 'Start' such as:
Start = End
, for the same sku, but for the previous date's End value
(different row).
This should apply only to the rows where there is a previous date for that sku.
The result should look like this:
sku date End Start
0 A 2022-08-29 100
1 A 2022-08-30 110 100
2 B 2022-08-29 0
3 B 2022-08-30 210 0
4 C 2022-08-29 300
5 C 2022-08-30 310 300
In my real dataset, I have values for 'End' for every date (no missing values, just 0's), so this should apply to every date except the first one. The first date is the same for every sku.
I've tried iterating over the DataFrame using for
loops, but it's inefficient and ugly.
Is there a way to do this using a pandas function like assign
or apply
?