I'm not sure this is feasible, but I'm trying to achieve the following with df.groupby(). Let's say we have the following DataFrame:

name   target
A         1
A         2
A        0.5
B         3
B         1
B         2
C        0.6
C        1.2

I want to group it by name without losing the original values in target. My expected output would be something like:

name   target  count
A         1      3
          2
         0.5
B         3      3
          1
          2
C        0.6     2
         1.2
James Arten

2 Answers

You can use a MultiIndex:

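import pandas as pd

# `data` holds the question's rows (columns 'name' and 'target')
df = pd.DataFrame(data)
# Select 'target' so that transform returns a Series, not a DataFrame
df['count'] = df.groupby('name')['target'].transform('count')
df = df.set_index(['name', 'count', 'target'])
df.head()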

Update: The code above moves every column into the index, so the resulting DataFrame has no columns left and displays as empty. You can do the following instead:

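# Per-row group size; select 'target' so transform returns a Series
df['count'] = df.groupby('name')['target'].transform('count')
df = df.set_index(['name', 'count'])
# Append a within-group counter as a third index level
df.set_index(df.groupby(level=[0, 1]).cumcount(), append=True).head()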

Code is taken from "Why does my multi-index dataframe have duplicate values for indices?"
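
For reference, here is a self-contained sketch of the updated approach, assuming the sample data from the question. The variable name `out` is only illustrative and is not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "name": ["A", "A", "A", "B", "B", "B", "C", "C"],
    "target": [1, 2, 0.5, 3, 1, 2, 0.6, 1.2],
})

# Per-row group size; selecting 'target' makes transform return a Series.
df["count"] = df.groupby("name")["target"].transform("count")

# Move name and count into the index, then append a running position
# within each (name, count) group as a third index level.
out = df.set_index(["name", "count"])
out = out.set_index(out.groupby(level=[0, 1]).cumcount(), append=True)

print(out)
```

When printed, pandas hides repeated labels in the outer index levels, which is what gives the grouped look asked for in the question. The underlying index still stores name and count on every row.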

TanjiroLL

You can blank out 'name' on every row of a group except the first, and create a 'count' column of empty strings that is filled in only on the first row of each group.

Keep in mind that this does not produce a MultiIndex. Cells without a value simply hold empty strings.

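import pandas as pd

# Assumes df is the question's DataFrame with columns 'name' and 'target'.
df['count'] = ''

# Compute the row labels of each group up front, so that df is not
# modified while the groupby is still being evaluated.
groups = df.groupby('name').groups

for name, idx in groups.items():
    # Write the group size on the first row of the group only.
    df.loc[idx[0], 'count'] = len(idx)
    # Blank out the name on the remaining rows of the group.
    df.loc[idx[1:], 'name'] = ''

print(df)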

Output

  name  target count
0    A     1.0     3
1          2.0      
2          0.5      
3    B     3.0     3
4          1.0      
5          2.0      
6    C     0.6     2
7          1.2      
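
A possible loop-free variant of the same idea, not part of the answer above, is sketched below. It assumes the sample data from the question; the names `repeat` and `display` are illustrative. It marks repeated names with `Series.duplicated()` and blanks those cells in a copy, so the original DataFrame stays intact.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "name": ["A", "A", "A", "B", "B", "B", "C", "C"],
    "target": [1, 2, 0.5, 3, 1, 2, 0.6, 1.2],
})

# True for every row whose name has already appeared earlier.
repeat = df["name"].duplicated()

# Work on a copy that is meant for display only.
display = df.copy()
# Cast to object so the column can hold both integers and empty strings.
display["count"] = (
    df.groupby("name")["target"].transform("count").astype(object)
)
# Blank out name and count everywhere except the first row of each name.
display.loc[repeat, ["name", "count"]] = ""

print(display)
```

As with the loop version, the blanked cells are empty strings, so `display` is suited to printing or exporting rather than further numeric work.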
inquirer