
I'm trying to get the total number of books that each author wrote and put it in a column called Book Number in my dataframe, which has 15 other columns.

I checked online and people use groupby with count(). However, that doesn't create the column I want: it only returns an unnamed column of numbers, and I can't combine it with the original dataframe.

author_count_df = (df_author["Name"]).groupby(df_author["Name"]).count()

print(author_count_df)

Result:

Name
A  D                3
A  Gill             4
A  GOO              3
ALL  SHOT          10
AMIT  PATEL         5
                   ..
vishal  raina       7
walt  walter        6
waqas  alhafidh     3
yogesh  koshal      8
zainab  m.jawad     9
Name: Name, Length: 696, dtype: int64

Expected: A dataframe with

Name          other 14 columns from df_author   Book Number
A  D                    ...                         3
A  Gill                 ...                         4
A  GOO                  ...                         3
ALL  SHOT               ...                         10
AMIT  PATEL             ...                         5
                        ...                         ..
vishal  raina           ...                         7
walt  walter            ...                         6
waqas  alhafidh         ...                         3
yogesh  koshal          ...                         8
zainab  m.jawad         ...                         9
M. Albert

3 Answers


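Use transform with the groupby and assign the result back. Unlike count(), transform returns one value per original row, so it lines up with the dataframe: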

df_author['Book Number']=df_author.groupby("Name")['Name'].transform('count')

For a new df, use:

author_count_df = df_author.assign(BookNum=df_author.groupby("Name")['Name']
                                                        .transform('count'))
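
As a minimal sketch of why this works, the example below runs the same transform call on a small made-up frame. The Title column and the sample rows are assumptions for illustration only; the real df_author has other columns.

```python
import pandas as pd

# Hypothetical toy data standing in for df_author; the Title column is invented.
df_author = pd.DataFrame({
    "Name": ["A  D", "A  Gill", "A  D", "A  D", "A  Gill"],
    "Title": ["t1", "t2", "t3", "t4", "t5"],
})

# transform("count") returns one value per original row, aligned on the index,
# so it can be assigned straight back as a new column.
df_author["Book Number"] = df_author.groupby("Name")["Name"].transform("count")

print(df_author)
```

Every row keeps its other columns and gains the count for its author. If one row per author is wanted instead, calling df_author.drop_duplicates("Name") afterwards keeps the first row for each name.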
anky

Use reset_index()

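author_count_df = df_author["Name"].groupby(df_author["Name"]).count().reset_index(name="Book Number")

This moves the group labels out of the index into a regular Name column and gives the counts column a name, producing a DataFrame with one row per author. Passing name= matters here because the counts Series is itself called Name, which would otherwise clash with the index name when it is turned into a column. The result has one row per author rather than one per original row, so it still needs to be merged back into df_author to get the other columns.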
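
A rough sketch of how reset_index could be combined with a merge to get the counts next to the other columns. The toy data and the Title column are made up, and name="Book Number" is used so the counts column does not collide with the Name column.

```python
import pandas as pd

# Hypothetical toy data standing in for df_author; the Title column is invented.
df_author = pd.DataFrame({
    "Name": ["A  D", "A  Gill", "A  D", "A  D", "A  Gill"],
    "Title": ["t1", "t2", "t3", "t4", "t5"],
})

# One row per author: Name becomes a regular column, counts go in "Book Number".
counts = (
    df_author.groupby("Name")["Name"]
    .count()
    .reset_index(name="Book Number")
)

# Left-merge the per-author counts back onto the original rows by Name.
merged = df_author.merge(counts, on="Name", how="left")

print(merged)
```

This takes two steps where transform takes one, but the intermediate counts frame can be handy if a per-author summary is also needed.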

Pirate X

You have done most of the work already. What remains is assigning the counts you computed back into a new column, which the DataFrame.assign method does quite elegantly.

Straight from the Docs:

  1. Assign new columns to a DataFrame.

  2. Returns a new object with all original columns in addition to new ones. Existing columns that are re-assigned will be overwritten.
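
One possible way to use assign here, shown as a sketch on made-up data. The value_counts and map steps are an assumption about how to line the counts up with the rows, not something this answer spells out, and the Title column is invented.

```python
import pandas as pd

# Hypothetical toy data standing in for df_author; the Title column is invented.
df_author = pd.DataFrame({
    "Name": ["A  D", "A  Gill", "A  D", "A  D", "A  Gill"],
    "Title": ["t1", "t2", "t3", "t4", "t5"],
})

# value_counts() gives the number of rows per name, indexed by name.
book_counts = df_author["Name"].value_counts()

# map() looks up each row's Name in book_counts, so the result lines up with
# the original rows. The dict unpacking allows a column name containing a space.
author_count_df = df_author.assign(
    **{"Book Number": df_author["Name"].map(book_counts)}
)

print(author_count_df)
```

Because assign returns a new object, df_author itself is left without the Book Number column.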

Karn Kumar