pivot wider but keep repeating the index

Question

I have a dataframe in pandas of the sort

df = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2, 2, 2],
    "column": ["a", "b","a", "b", "a", "b", "a", "b"],
    "value": [1, 7, 6, 5, 4, 3, 1, 7]
})

I want to generate a new dataframe where we have

id	value_a	value_b
1	1	7
1	6	5
2	4	3
2	1	7

I tried many things, pivot, pivot table and so on, but all solutions seem to need index column which gets unique and the values being aggregated in some way. I want keep repeating id and have the values in the original order they appeared.

Thanks in advance!

could you have `column` values in arbitrary order like `["a", "b","c", "b", "b", "b", "a", "c"]` ? — RomanPerekhrest, Mar 12 '23 at 20:59

score 2 · Accepted Answer · answered Mar 12 '23 at 20:26

You can try:

x = (
    df.groupby(["id", "column"])["value"]
    .agg(list)
    .unstack(level=1)
    .explode(["a", "b"])
    .add_prefix('value_')
    .reset_index()
)
x.columns.name = None
print(x)

Prints:

   id value_a value_b
0   1       1       7
1   1       6       5
2   2       4       3
3   2       1       7

score 2 · Answer 2 · answered Mar 12 '23 at 22:29

Here is another way:

l = ['id','column']
(df.set_index([*l,df.groupby(l).cumcount()])['value']
.unstack(level=1)
.droplevel(1)
.rename('value_{}'.format,axis=1)
.reset_index()
.rename_axis(None,axis=1))

Output:

   id  value_a  value_b
0   1        1        7
1   1        6        5
2   2        4        3
3   2        1        7

score 1 · Answer 3 · answered Mar 12 '23 at 21:14

Concatenate dataframes filtered by column unique values:

res = pd.concat([df.loc[df.column.eq(c), ['id', 'value']]
                .set_index('id').set_axis([f'value_{c}'], axis=1)
                 for c in df.column.unique()], axis=1).reset_index()

   id  value_a  value_b
0   1        1        7
1   1        6        5
2   2        4        3
3   2        1        7

score 1 · Answer 4 · answered Mar 12 '23 at 22:13

Add a column that ensures uniqueness, then pivot:

out = (df
       .assign(counter = df.groupby(['id', 'column']).cumcount())
       .pivot(index = ['id', 'counter'], columns = 'column')
       .droplevel('counter')
      ) 
out.columns = out.columns.map("_".join)
out.reset_index()
   id  value_a  value_b
0   1        1        7
1   1        6        5
2   2        4        3
3   2        1        7

pivot wider but keep repeating the index

4 Answers4