2

I have a problem in python. My table in xlsx. looks like this:

COMPANY ID NUMBER FUNCTION NAME
A 123 director Smith
A 123 partner Jones
A 123 secretary Evans
B 456 partner Brown
B 456 manager Wilson

And I need convert rows which have same ID NUMBER to columns. I need all rows with same id to appear on only one row. Like this:

COMPANY ID NUMBER FUNCTION FUNCTION 2 FUNCTION 3 NAME NAME 2 NAME 3
A 123 director partner secretary Smith Jones Evans
B 456 partner manager Brown Wilson
Michael S.
  • 3,050
  • 4
  • 19
  • 34
HEV
  • 55
  • 4
  • I think [this](https://stackoverflow.com/a/73400200/19534628) question on SO from yesterday is very similar. Maybe it helps :) – Jacob Aug 19 '22 at 12:11

1 Answers1

2

First greate helper column g by GroupBy.cumcount, pivoting by DataFrame.pivot and then in list comprehension flatten MultiIndex, last convert Index to columns:

df['g'] = df.groupby(['COMPANY','ID NUMBER']).cumcount()

df = df.pivot(['COMPANY','ID NUMBER'], 'g').fillna('')
df.columns = [f'{a} {b+1}' if b > 0 else a for a, b in df.columns]
df = df.reset_index().rename_axis(None, axis=1)
print (df)
  COMPANY  ID NUMBER  FUNCTION FUNCTION 2 FUNCTION 3   NAME  NAME 2 NAME 3
0       A        123  director    partner  secretary  Smith   Jones  Evans
1       B        456   partner    manager             Brown  Wilson       
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252