Questions tagged [pandas-groupby]

Use this tag for questions about grouping data based on a given condition. Use it only for questions related to the `pandas` library.

`pandas.DataFrame.groupby` splits the rows of a DataFrame into groups based on the values of one or more columns (or other keys).

After grouping, you can compute the mean or apply other operations to each group.
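
As a minimal sketch of the grouping and aggregation described above: the DataFrame, its column names (`team`, `score`) and the result names (`total`, `average`) are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical example data; the column names are illustrative only.
df = pd.DataFrame({
    "team": ["a", "a", "b", "b", "b"],
    "score": [10, 20, 30, 40, 50],
})

# Group rows by the values in "team", then take the mean of "score" per group.
mean_by_team = df.groupby("team")["score"].mean()

# Compute several aggregations at once with named aggregation.
summary = df.groupby("team").agg(
    total=("score", "sum"),
    average=("score", "mean"),
)

print(mean_by_team)
print(summary)
```

Both results are indexed by the group key (`team`); calling `.reset_index()` on them turns the key back into an ordinary column if that is more convenient.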

8780 questions
2
votes
2 answers

pandas groupby add and average at the same time

I have a data frame with a list of processes and the time they took as follows I would like to get the following result I know how to use groupby in order to get ONE but only one of those columns. And this is the way I solve the problem # the…
JFerro
  • 3,203
  • 7
  • 35
  • 88
2
votes
2 answers

Pandas Transforming the Applied Results back to the original dataframe

Consider the Following DataFrame candy = pd.DataFrame({'Name':['Bob','Bob','Bob','Annie','Annie','Annie','Daniel','Daniel','Daniel'], 'Candy': ['Chocolate', 'Chocolate', 'Lollies','Chocolate', 'Chocolate', 'Lollies','Chocolate', 'Chocolate',…
Rigel
  • 47
  • 3
2
votes
3 answers

Get sum of positive values and sum of negative values

I have a DataFrame as follows. d = {} d['Model'] = ['M1', 'M2'] * 4 * 3 * 5 d['Support'] = ['S1', 'S1', 'S2', 'S2'] * 2 * 3 * 5 d['Group'] = ['G1', 'G2', 'G2'] * 2 * 4 * 5 d['Case'] = ['C11', 'C21', 'C22', 'C31', 'C32'] * 2 * 4 * 3 val1 = [] val2 =…
PedroBiel
  • 489
  • 1
  • 8
  • 21
2
votes
1 answer

How to keep that line space after grouping in dataframe?

The input dataframe has Name Subject IA1 IA2 IA3 Abc DS 45 43 34 Abc DMS 43 23 45 Abc ADA 32 46 36 Bcd BA 45 35 37 Bcd EAD 23 45 12 Bcd DS 23 …
CodingNinja
  • 109
  • 6
2
votes
2 answers

Aggregate previous rows into lists, depending on separate column

I asked a similar, yet simpler, question previously but realized later that this does not solve my problem. I feel that the required edits to the question are too severe for a simple edit of the question, especially with two valid answers already,…
Marcus
  • 943
  • 5
  • 21
2
votes
2 answers

Python - Find Days Diff Grouping by Member

I am looking to find an efficient way to generate the number of days diff from an initial point in time. The initial point in time is the first time (min date) an ID is triggered by the value Y in the Indicator column. Input ID Indicator Date 111 Y…
Logan
  • 293
  • 3
  • 11
2
votes
1 answer

Summing up collections.Counter objects using `groupby` in pandas

I am trying to group the words_count column by both essay_Set and domain1_score and add up the counters in words_count, as mentioned here: >>> c = Counter(a=3, b=1) >>> d = Counter(a=1, b=2) >>> c + d …
Hazem Alabiad
  • 1,032
  • 1
  • 11
  • 24
2
votes
2 answers

Mean Aggregations using pandas GroupBy and Time Series resampling

I'm having trouble with Pandas groupby functionality and Time Series. I've read the documentation, but I can't seem to figure out how to apply aggregate functions to multiple columns and calculate the mean of the volume (average) of the "aggregate"…
2
votes
2 answers

Simple calculation on table. Please help me to make my code more effective

Please help me to make my code more effective. This is my df: df = pd.DataFrame([['A', 80], ['A', 64], ['A', 55], ['B', 56], ['B', 89], ['B', 73], ['C', 78], ['C', 100], ['C', 150], ['C', 76], ['C', 87]], columns=['Well', 'GR']) Well GR A …
elli
  • 35
  • 5
2
votes
2 answers

Replacing missing values with mean

I am exploring the pandas library, and I found this dataset. My task is to fill ? with the group mean, grouped by column 'num-of-doors'. When I used dataframe.groupby('num-of-doors').mean() pandas was unable to find the mean of these columns: 'peak-rpm',…
Harshil Modi
  • 397
  • 3
  • 8
  • 15
2
votes
1 answer

Python Pandas groupby multiple columns and append

I'm close to achieving what I want thanks to Python Pandas Groupby/Append columns but still not quite there. DF: City Plan Problem 1 Problem 2 Problem 3 Genoa Service 1 aaa bbb ccc Genoa Service 2 ddd zzz yyy Genoa Service…
2
votes
1 answer

How to group and aggregate data using pandas/Python only if a specific condition/calculation is met?

There is a pandas.DataFrame df that looks like this: City Country Latitude Longitude Population ... Berlin Germany 52.516602 13.304105 118704 Berlin Germany 52.430884 13.192662 292000 ... Berlin USA …
jengeb
  • 393
  • 1
  • 7
  • 21
2
votes
1 answer

Number of consecutive TRUE's with shifts of 4 quarters

I have the following dataframe: Quarter Tic var 0 AAL False 1 AAL True 2 AAL True 3 AAL False 4 MSFT True 5 MSFT True 6 MSFT False 7 MSFT False 8 …
t.pellegrom
  • 313
  • 3
  • 10
2
votes
2 answers

how to use pandas to subtract rows of a column based upon data by group?

I'm trying to compute the difference between rows of a column based upon the first and last date, by group. for example: df = pd.DataFrame({'ID':["a","a","a","b","b","b"], 'start_yr':[2010,2013,2020,2009,2005,2019], …
nerdlyfe
  • 487
  • 7
  • 21
2
votes
1 answer

Calculate average revenue per user per month

I have the following dataframe: Timestamp userid Prices_USD 0 2016-12-01 6.270941895 1.08 1 2016-12-01 6.609813209 1.12 2 2016-12-01 6.632094115 9.70 3 2016-12-01 6.655789772 1.08 4 2016-12-01 6.764640751 9.33 ... ... ...…
Oam
  • 305
  • 5
  • 13