Questions tagged [pandas-groupby]

Use this tag for questions about grouping data based on a given condition. Use it only for questions that relate to the `pandas` library.

`pandas.DataFrame.groupby` lets you split the rows of a DataFrame into groups (categories) according to the values in one or more columns.

After grouping, you can compute the mean of each group or perform other per-group operations.
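As a minimal sketch of the grouping described above: the DataFrame, its column names (`region`, `sales`), and its values are invented for illustration and do not come from any question on this page.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame({
    "region": ["Asia", "Asia", "Europe", "Europe", "Europe"],
    "sales": [10, 20, 30, 40, 50],
})

# Group rows by the values in "region", then average "sales" within each group.
mean_sales = df.groupby("region")["sales"].mean()

# Other operations work the same way; agg() applies several at once.
summary = df.groupby("region")["sales"].agg(["sum", "max"])

print(mean_sales)
print(summary)
```

The result of `mean()` is a Series indexed by the group keys, while `agg` with a list of function names returns a DataFrame with one column per function.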

8780 questions
90
votes
2 answers

Aggregation in Pandas

How can I perform aggregation with Pandas? No DataFrame after aggregation! What happened? How can I aggregate mainly strings columns (to lists, tuples, strings with separator)? How can I aggregate counts? How can I create a new column filled by…
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
90
votes
2 answers

How to get number of groups in a groupby object in pandas?

This would be useful so I know how many unique groups I have to perform calculations on. Thank you. Suppose groupby object is called dfgroup.
wolfsatthedoor
  • 7,163
  • 18
  • 46
  • 90
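One possible approach to the question above, sketched under assumptions: the sample data and the `key` column are hypothetical, and `dfgroup` is simply the name the asker uses for the groupby object.

```python
import pandas as pd

# Hypothetical data; "key" has three distinct values.
df = pd.DataFrame({"key": ["a", "b", "a", "c"], "value": [1, 2, 3, 4]})
dfgroup = df.groupby("key")

# The ngroups attribute reports how many groups were formed.
n_groups = dfgroup.ngroups

# len() on the groupby object gives the same count.
n_groups_alt = len(dfgroup)

print(n_groups, n_groups_alt)
```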
86
votes
5 answers

Python - rolling functions for GroupBy object

I have a time series object grouped of the type . grouped.sum() gives the desired result but I cannot get rolling_sum to work with the groupby object. Is there any way to apply rolling…
user1642513
82
votes
2 answers

Renaming Column Names in Pandas Groupby function

Q1) I want to do a groupby, SQL-style aggregation and rename the output column: Example dataset:

    >>> df
        ID  Region  count
    0  100    Asia      2
    1  101  Europe      3
    2  102      US      1
    3  103  Africa      5
    4  100  Russia      …
Baktaawar
  • 7,086
  • 24
  • 81
  • 149
77
votes
3 answers

Python Pandas Group by date using datetime data

I have a column Date_Time that I wish to group by date without creating a new column. Is this possible? The current code I have does not work:

    df = pd.groupby(df,by=[df['Date_Time'].date()])
GoBlue_MathMan
  • 1,048
  • 2
  • 13
  • 20
77
votes
2 answers

What is the equivalent of SQL "GROUP BY HAVING" on Pandas?

What would be the most efficient way to use groupby and, in parallel, apply a filter in pandas? Basically I am asking for the equivalent in SQL of

    select * ... group by col_name having condition

I think there are many use cases ranging from…
Mannaggia
  • 4,559
  • 12
  • 34
  • 47
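A hedged sketch of one way to express a HAVING-style condition in pandas, using `GroupBy.filter` and boolean indexing on an aggregate. The data, the `amount` column, and the thresholds are assumptions made up for illustration; only `col_name` echoes the question.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame({
    "col_name": ["x", "x", "y", "z", "z", "z"],
    "amount": [1, 2, 3, 4, 5, 6],
})

# Keep all rows that belong to groups with at least two members
# (similar in spirit to HAVING COUNT(*) >= 2).
kept = df.groupby("col_name").filter(lambda g: len(g) >= 2)

# Alternatively, aggregate first and then filter the aggregated result
# (similar in spirit to HAVING SUM(amount) > 3).
totals = df.groupby("col_name")["amount"].sum()
having = totals[totals > 3]

print(kept)
print(having)
```

`filter` returns the original rows of the surviving groups, whereas filtering the aggregate returns one row per surviving group; which of the two matches a given SQL query depends on what that query selects.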
76
votes
2 answers

Use pandas.shift() within a group

I have a dataframe with panel data, let's say it's time series for 100 different objects:

    object  period  value
         1       1     24
         1       2     67
       ...
         1    1000     56
         2       1     59
         2       2     46
       ...
         2    1000     64
         3       1      …
Alexandr Kapshuk
  • 1,380
  • 2
  • 13
  • 29
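A minimal sketch of shifting within each group, assuming panel data shaped like the excerpt above. The frame below is a shortened, made-up stand-in with three periods per object, and `prev_value` is a hypothetical column name.

```python
import pandas as pd

# Shortened, made-up stand-in for the panel data in the question.
df = pd.DataFrame({
    "object": [1, 1, 1, 2, 2, 2],
    "period": [1, 2, 3, 1, 2, 3],
    "value": [24, 67, 56, 59, 46, 64],
})

# Sort so that each object's periods are in order before shifting.
df = df.sort_values(["object", "period"])

# shift(1) on the grouped column moves values down by one row inside each
# object, so the first period of every object gets NaN instead of a value
# leaked from the previous object.
df["prev_value"] = df.groupby("object")["value"].shift(1)

print(df)
```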
75
votes
7 answers

Pandas - dataframe groupby - how to get sum of multiple columns

This should be an easy one, but somehow I couldn't find a solution that works. I have a pandas dataframe which looks like this:

    index  col1  col2  col3  col4  col5
    0      a     c     1     2     f
    1      a     c     1     2     f
    2      a     …
Axel
  • 2,545
  • 2
  • 18
  • 30
73
votes
4 answers

How to move pandas data from index to column after multiple groupby

I have the following pandas dataframe:

           token  year  uses  books
    386  xanthos  1830     3      3
    387  xanthos  1840     1      1
    388  xanthos  1840     2      2
    389  xanthos  1868     2      2
    390  xanthos  1875     1      1

I aggregate the…
prooffreader
  • 2,333
  • 4
  • 21
  • 32
71
votes
2 answers

Sample each group after pandas groupby

I know this must have been answered somewhere but I just could not find it. Problem: Sample each group after groupby operation.

    import pandas as pd
    df = pd.DataFrame({'a': [1,2,3,4,5,6,7], 'b': [1,1,1,0,0,0,0]})
    grouped =…
gongzhitaao
  • 6,566
  • 3
  • 36
  • 44
65
votes
6 answers

What is the difference between pandas agg and apply function?

I can't figure out the difference between Pandas .aggregate and .apply functions. Take the following as an example: I load a dataset, do a groupby, define a simple function, and either use .agg or .apply. As you may see, the printing statement…
David D
  • 1,485
  • 4
  • 15
  • 19
62
votes
6 answers

How to do a conditional count after groupby on a Pandas Dataframe?

I have the following dataframe:

      key1 key2
    0    a  one
    1    a  two
    2    b  one
    3    b  two
    4    a  one
    5    c  two

Now, I want to group the dataframe by the key1 and count the column key2 with the value "one" to get this result: key1 …
Sethias
  • 740
  • 1
  • 6
  • 7
61
votes
4 answers

Is there an "ungroup by" operation opposite to .groupby in pandas?

Suppose we take a pandas dataframe...

        name  age  family
    0   john    1       1
    1  jason   36       1
    2   jane   32       1
    3   jack   26       2
    4  james   30       2

Then do a groupby() ...

    group_df = df.groupby('family')
    group_df =…
mkln
  • 14,213
  • 4
  • 18
  • 22
58
votes
3 answers

What is the pandas equivalent of dplyr summarize/aggregate by multiple functions?

I'm having issues transitioning to pandas from R where dplyr package can easily group-by and perform multiple summarizations. Please help improve my existing Python pandas code for multiple aggregations:

    import pandas as pd
    data = pd.DataFrame( …
B.Mr.W.
  • 18,910
  • 35
  • 114
  • 178
56
votes
3 answers

Use Pandas groupby() + apply() with arguments

I would like to use df.groupby() in combination with apply() to apply a function to each row per group. I normally use the following code, which usually works (note that this is without groupby()):

    df.apply(myFunction, args=(arg1,))

With the…
beta
  • 5,324
  • 15
  • 57
  • 99