I have a dataframe with the following simplified structure:
import pandas as pd
d = {'id': [1, 2, 3, 4, 5],
'name': ["a", "b", "c", "d", "e"],
'country': ["uk", "spain", "france", "germany", "italy"],
'cat_01_2020': [10, 20, 30, 40, 50],
'cat_01_2019': [11, 21, 31, 41, 51],
'cat_01_2018': [12, 22, 32, 42, 52],
'cat_02_2020': [100, 200, 300, 400, 500],
'cat_02_2019': [111, 211, 311, 411, 511],
'cat_02_2018': [122, 222, 322, 422, 522],
'cat_03_2020': [1000, 2000, 3000, 4000, 5000],
'cat_03_2019': [1111, 2111, 3111, 4111, 5111],
'cat_03_2018': [1222, 2222, 3222, 4222, 5222]}
df = pd.DataFrame(data = d)
and I want to get this new df_target.
d_target = {'id': [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5],
'name': ["a", "a", "a", "b", "b", "b", "c", "c", "c", "d", "d", "d", "e", "e", "e"],
'country': ["uk", "uk", "uk", "spain", "spain", "spain", "france", "france", "france", "germany", "germany", "germany", "italy", "italy", "italy"],
'year': [2020, 2019, 2018, 2020, 2019, 2018, 2020, 2019, 2018, 2020, 2019, 2018, 2020, 2019, 2018],
'cat_01': [10, 11, 12, 20, 21, 22, 30, 31, 32, 40, 41, 42, 50, 51, 52],
'cat_02': [100, 111, 122, 200, 211, 222, 300, 311, 322, 400, 411, 422, 500, 511, 522],
'cat_03': [1000, 1111, 1222, 2000, 2111, 2222, 3000, 3111, 3222, 4000, 4111, 4222, 5000, 5111, 5222]}
df_target = pd.DataFrame(data = d_target)
df_target
To do this, I think I need to use the pandas crosstab function, first getting the years 2018, 2019 and 2020. Then I should be able to get cat_01, cat_02 and cat_03.
Does anyone know how I can do this?
Thank you very much in advance.
Best regards.