My Dataset consists of 3535560 rows × 16 columns
. However it contains three 'index-variables, I would like to use to reshape the dataset: 1830 days
, 46 latitude values
and 42 longitude values
. The transformed dataset should therefore become (1830,46,42) but I have no idea how to do this. I saw that I could maybe use pd.pivot
or pd.multiindex
, but I am not able to find a solution.
In short data looks like:
time lat lon var 1 var 2 var 3 var 4 var 5
0 2021-01-01 60.125 -120.125 0.381828 0.917779 0.718022 0.064032 0.886050
1 2021-01-01 60.125 -119.875 0.221697 0.232657 0.298497 0.680900 0.124440
...
41 2021-01-01 60.125 -109.125 0.922149 0.708139 0.778329 0.267685 0.552542
42 2021-01-01 59.875 -120.125 0.569874 0.053829 0.740229 0.747286 0.194214
43 2021-01-01 59.875 -119.875 0.500091 0.185990 0.845510 0.877692 0.556584
....
1931 2021-01-01 48.875 -109.125 0.221697 0.232657 0.298497 0.680900 0.124440
1932 2021-01-02 60.125 -120.125 0.666589 0.849857 0.338648 0.552114 0.730678
1933 2021-01-02 60.125 -119.875 0.351144 0.467692 0.161488 0.530906 0.277561
As you can see in the table, after 42 longitude values, it takes a new latitude value and loops over all longitude values again. it does this for all latitude values before going to the next day (after 1932 rows (46x42)).
Would someone be able to help me fix this?