I have a DataFrame
date price num_floors house
1 2023-01-01 94.30076 3 A
2 2023-01-01 95.58771 2 B
3 2023-01-02 102.78559 1 C
4 2023-01-03 93.29053 3 D
and I want to change it, so that each column contains the prices and num_floors for all houses for a given date. For one column, the first two rows of a column refer to the first house, the next two to the second house. The remaining entries without data are filled with the missing value NaN, like this:
2023-01-01 2023-01-02 2023-01-03
1 94.30076 102.78559 93.29053
2 3 1 3
3 95.58771 NA NA
4 2 NA NA
I succeed using R:
df_trans <- df %>%
pivot_longer(-date) %>%
mutate(index=row_number(), .by = date) %>%
pivot_wider(id_cols = index, names_from = date, values_from = value) %>%
select(-index)
but when I try with python, it does not work as I want:
df_trans = df.melt(id_vars='date')
df_trans['n'] = df_trans.groupby('date').cumcount() + 1
df_trans = df_trans.pivot(index='n', columns='date', values='value')