Questions tagged [aggregate]

Aggregate refers to the process of summarizing grouped data, commonly used in Statistics.

Aggregate refers to the process of summarizing grouped data, commonly used in Statistics. Typically this involves replacing groups of data with single values (e.g. sum, mean, standard deviation, etc.). In SQL databases and data manipulation libraries such as pandas in python, this is accomplished with the use of GROUP BY and aggregate functions.

Documentation:

8256 questions

votes

10 answers

Select the top N values by group

This is in response to a question asked on the r-help mailing list. Here are lots of examples of how to find top values by group using sql, so I imagine it's easy to convert that knowledge over using the R sqldf package. An example: when mtcars is…

r aggregate

asked Feb 10 '13 at 16:59

Anthony Damico

5,779
7
46
77

votes

4 answers

Aggregating by unique identifier and concatenating related values into a string

I have a need that I imagine could be satisfied by aggregate or reshape, but I can't quite figure out. I have a list of names (brand), and accompanying ID number (id). This data is in long form, so names can have multiple ID's. I'd like to…

r concatenation aggregate

asked May 16 '13 at 20:08

roody

2,633
5
38
50

votes

6 answers

Rename result columns from Pandas aggregation ("FutureWarning: using a dict with renaming is deprecated")

I'm trying to do some aggregations on a pandas data frame. Here is a sample code: import pandas as pd df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"], "Amount": [10.0, 5.0, 8.0, 10.5, 7.5,…

python pandas aggregate rename

asked Jun 19 '17 at 16:28

Victor Mayrink

1,064
1
13
24

votes

3 answers

How to include BIT type column in SELECT part with out including it on the GROUP BY in T-SQL?

Here is my T-SQL query SELECT ProductID, VendorID, ProductName= MAX(ProductName), VendorName = MAX(VendorName), IsActive = MAX(IsActive) # This brings error FROM ProductVendorAssoc GROUP BY ProductID, VendorID I…

sql tsql group-by aggregate

asked May 19 '11 at 04:40

Mithun Sreedharan

49,883
70
181
236

votes

6 answers

Select the first and last row by group in a data frame

How can I select the first and last row for each unique id in the following dataframe? tmp <- structure(list(id = c(15L, 15L, 15L, 15L, 21L, 21L, 22L, 22L, 22L, 23L, 23L, 23L, 24L, 24L, 24L, 24L), d = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,…

r dataframe aggregate

asked Nov 20 '11 at 18:52

Francesco

votes

11 answers

Solution for SpecificationError: nested renamer is not supported while agg() along with groupby()

def stack_plot(data, xtick, col2='project_is_approved', col3='total'): ind = np.arange(data.shape[0]) plt.figure(figsize=(20,5)) p1 = plt.bar(ind, data[col3].values) p2 = plt.bar(ind, data[col2].values) plt.ylabel('Projects') …

python pandas aggregate

asked Feb 14 '20 at 15:40

Akshay Jindal

votes

10 answers

Why is there no PRODUCT aggregate function in SQL?

Im looking for something like SELECT PRODUCT(table.price) FROM table GROUP BY table.sale similar to how SUM works. Have I missed something on the documentation, or is there really no PRODUCT function? If so, why not? Note: I looked for the function…

sql aggregate

asked Oct 12 '10 at 06:42

lock

6,404
18
58
76

votes

8 answers

Aggregate a dataframe on a given column and display another column

I have a dataframe in R of the following form: > head(data) Group Score Info 1 1 1 a 2 1 2 b 3 1 3 c 4 2 4 d 5 2 3 e 6 2 1 f I would like to aggregate it following the Score column…

r aggregate plyr greatest-n-per-group

asked Jun 09 '11 at 07:37

jul635

votes

9 answers

Explain the aggregate functionality in Spark (with Python and Scala)

I am looking for some better explanation of the aggregate functionality that is available via spark in python. The example I have is as follows (using pyspark from Spark 1.2.0 version) sc.parallelize([1,2,3,4]).aggregate( (0, 0), (lambda acc,…

python scala apache-spark aggregate rdd

asked Jan 30 '15 at 16:49

ab_tech_sp

votes

2 answers

Aggregate multiple columns at once

I have a data-frame likeso: x <- id1 id2 val1 val2 val3 val4 1 a x 1 9 2 a x 2 4 3 a y 3 5 4 a y 4 9 5 b x 1 7 6 b y 4 4 7 b x 3 9 8 b y 2 8 I wish to aggregate the…

r aggregate

asked Dec 30 '15 at 05:50

Rookie

5,179
13
41
65

votes

4 answers

Why can't indexed views have a MAX() aggregate?

I have been trying out a few index views and am impressed but I nearly always need a max or a min as well and can not understand why it doesn't work with these, can anyone explain why? I KNOW they are not allowed, I just can't understand why!!!…

sql-server view indexing aggregate

asked Jan 25 '10 at 18:09

alex

votes

2 answers

aggregate methods treat missing values (NA) differently

Here's a simple data frame with a missing value: M = data.frame( Name = c('name', 'name'), Col1 = c(NA, 1) , Col2 = c(1, 1)) # Name Col1 Col2 # 1 name NA 1 # 2 name 1 1 When I use aggregate to sum variables by group ('Name') using the…

r aggregate na

asked May 30 '13 at 19:43

Ryan Walker

3,176
1
23
29

votes

4 answers

LINQ: How do I concatenate a list of integers into comma delimited string?

It's probably something silly I missed, but I try to concatenate a list of integers instead of summing them with: integerArray.Aggregate((accumulator, piece) => accumulator+"," + piece) The compiler complained about argument error. Is there a slick…

c# linq aggregate

asked May 26 '10 at 23:32

Haoest

13,610
29
89
105

votes

5 answers

Aggregate Function on Uniqueidentifier (GUID)

Let's say I have the following table: category | guid ---------+----------------------- A | 5BC2... A | 6A1C... B | 92A2... Basically, I want to do the following SQL: SELECT category, MIN(guid) FROM myTable GROUP BY…

sql sql-server guid aggregate uniqueidentifier

asked May 20 '11 at 08:33

Heinzi

167,459
57
363
519

votes

4 answers

Pandas groupby(),agg() - how to return results without the multi index?

I have a dataframe: pe_odds[ [ 'EVENT_ID', 'SELECTION_ID', 'ODDS' ] ] Out[67]: EVENT_ID SELECTION_ID ODDS 0 100429300 5297529 18.00 1 100429300 5297529 20.00 2 100429300 5297529 21.00 3 100429300 5297529 …

python pandas group-by aggregate multi-index

asked Oct 12 '14 at 10:01

Ginger

8,320
12
56
99

Prev 1 2

…

99 100 Next