My input is a pandas DataFrame:
   item foo_x foo_y bar_x bar_y
0     1     A     B     C     D
1     2     D     E     F     G
2     3     H     I     J     K
3     4     L     M     N     O
import pandas as pd

df = pd.DataFrame({'item': [1, 2, 3, 4],
                   'foo_x': ['A', 'D', 'H', 'L'],
                   'foo_y': ['B', 'E', 'I', 'M'],
                   'bar_x': ['C', 'F', 'J', 'N'],
                   'bar_y': ['D', 'G', 'K', 'O']})
I'm not asking much of the groupby method; I only expect this standard aggregation:
   item       x       y
0     1  [A, C]  [B, D]
1     2  [D, F]  [E, G]
2     3  [H, J]  [I, K]
3     4  [L, N]  [M, O]
But my code below raises an error that makes no sense to me:
df_output = (
    df.rename(lambda x: x.split("_")[-1], axis=1)
    .groupby(level=0, axis=1).agg(list)
)
ValueError: Length of values (2) does not match length of index (4)
To be honest, this is completely counterintuitive given how we're used to applying groupby(..., axis=0).
Can you please explain the logic behind this?
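
For reference, here is a minimal sketch of one way to build the expected output without groupby(..., axis=1). It is not an explanation of the error, only a fallback. It assumes every column other than item is named <prefix>_<suffix>; the names suffixes, renamed_labels and out are illustrative and not part of my original code.

```python
import pandas as pd

df = pd.DataFrame({'item': [1, 2, 3, 4],
                   'foo_x': ['A', 'D', 'H', 'L'],
                   'foo_y': ['B', 'E', 'I', 'M'],
                   'bar_x': ['C', 'F', 'J', 'N'],
                   'bar_y': ['D', 'G', 'K', 'O']})

# After the rename in my attempt, the column labels are no longer unique.
renamed_labels = df.rename(lambda x: x.split("_")[-1], axis=1).columns.tolist()

# Assumption: the suffixes to collect are known in advance.
suffixes = ['x', 'y']

out = pd.DataFrame({'item': df['item']})
for suffix in suffixes:
    # Columns sharing this suffix, in their original left-to-right order.
    cols = [c for c in df.columns if c.endswith('_' + suffix)]
    # One list per row, holding that row's values from the matching columns.
    out[suffix] = df[cols].values.tolist()

print(renamed_labels)
print(out)
```

This sidesteps the column-wise groupby entirely, which may be preferable anyway since the axis=1 argument of DataFrame.groupby is deprecated as of pandas 2.1.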