Questions tagged [summarize]

A dplyr function, summarise() (summarize() is an accepted synonym), that creates a new data frame by summarising data according to the given grouping variables. Use this tag together with the dplyr version being used. Mind the spelling of the function name.

summarise() creates a new data frame. It will have one (or more) rows for each combination of grouping variables; if there are no grouping variables, the output will have a single row (or more, as of dplyr 1.0.0) summarising all observations in the input. It will contain one column for each grouping variable and one column for each of the summary statistics that you have specified.
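The row and column behaviour described above can be illustrated with a minimal sketch. The data frame `df` and its columns `grp` and `value` are hypothetical and exist only for this example; it assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical toy data: two groups with three observations each.
df <- data.frame(
  grp   = c("a", "a", "a", "b", "b", "b"),
  value = c(1, 2, 3, 10, 20, 30)
)

# Grouped: one row per group, with one column for the grouping
# variable and one column for each summary statistic.
by_group <- df %>%
  group_by(grp) %>%
  summarise(n = n(), mean_value = mean(value))

# Ungrouped: a single row summarising all observations.
overall <- df %>%
  summarise(n = n(), mean_value = mean(value))
```

Here `by_group` has two rows and three columns (`grp`, `n`, `mean_value`), while `overall` has a single row with `n` equal to 6.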

836 questions

3 answers

r min max dates by id and multiple status changes within ID

I have an animal tracking dataset which is as shown below Id Start Stop Status 78122 10/12/1919 10/12/1919 Birth 78122 1/18/1966 2/2/1972 In 78122 2/3/1972 9/8/1972 In 78122 9/9/1972…

r date dplyr summarize

asked Jan 06 '21 at 18:36

Riley Stephen

1 answer

Accessing other group_by groups with summarize()

I have a data frame with columns genes, the region of the chromosome they belong to, the cell line the gene expression was measured from, and the gene's expression level in that cell line -- it looks basically something like this: gene region …

r dplyr group-by summarize

asked Aug 27 '20 at 14:52

cheal

2 answers

Summarize using different grouping variables in dplyr

I would like to summarize a dataframe using different grouping variables for each summary I wish to carry out. As an example, I have three variables (x1, x2, x3). I want to group the dataframe by x1 and get the number of observations in that group,…

r dplyr grouping summarize

asked May 08 '20 at 17:37

H. Kraus

3 answers

How to keep other columns when using dplyr?

I have a problem similar to the one described in "How to aggregate some columns while keeping other columns in R?", but none of the solutions from there that I have tried work. I have a data frame like…

r group-by dplyr summarize

asked Mar 30 '20 at 11:51

Zizou

1 answer

R sum observations by unique column PAIRS (B-A and A-B) and NOT unique combinations (B-A or A-B)

I have a seemingly simple calculation, where I have a data frame composed of 4 columns as shown below (Date, Origin, Destination, count). I would like to sum the count by Date, and unique pair of ID1 and ID2, meaning that A-B and B-A are ONE…

r dplyr summarize

asked Mar 29 '20 at 18:55

Roberto

3 answers

dplyr summarise based on order condition with if statement

By group (group_by(id)), I am trying to sum a variable based on a selection of types. However, there is an order of preference of these types. Example: library(tidyverse) df <- data.frame(id = c(rep(1, 6), 2, 2, 2, rep(3, 4), 4, 5), …

r dplyr summarize

asked Mar 13 '20 at 14:10

user63230

1 answer

How to summarize large dataframes in python pandas (50 columns x 2m rows)

For a project I manipulate a few columns of the dataset, join these newly created columns back to the entire dataset, and then summarize on the manipulated fields. The manipulation and merging is no problem, but the groupby feature…

python python-3.x pandas-groupby summarize

asked Oct 20 '19 at 09:59

Dubblej

2 answers

Summarize with conditions based on ranges in dplyr

Here is an illustration of my example. Sample data: df <- data.frame(ID = c(1, 1, 2, 2, 3, 5), A = c("foo", "bar", "foo", "foo", "bar", "bar"), B = c(1, 5, 7, 23, 54, 202)) df ID A B 1 1 foo 1 2 1 bar 5 3 2 foo 7 4 2 foo …

r dplyr summarize

asked Oct 10 '19 at 09:56

Vojtěch Kania

3 answers

Power BI/DAX: Filter SUMMARIZE or GROUPBY by added column value

Because of the confidential nature of the data, I'll try to describe what I'm struggling with using some random examples. Let's say I have a fact table with invoice data in Power BI. I need to count the number of distinct product IDs with sales over, let's say…

filter powerbi dax summarize

asked Oct 04 '19 at 12:10

Uzzy

3 answers

count distinct levels of a data frame for groups based on a condition

I have the following DF x = data.frame('grp' = c(1,1,1,2,2,2),'a' = c(1,2,1,1,2,1), 'b'= c(6,5,6,6,2,6), 'c' = c(0.1,0.2,0.4,-1, 0.9,0.7)) grp a b c 1 1 1 6 0.1 2 1 2 5 0.2 3 1 1 6 0.4 4 2 1 6 -1.0 5 2 2 2 0.9 6 2 1 6 0.7 I…

r group-by dplyr conditional-statements summarize

asked Sep 30 '19 at 12:48

Param

1 answer

Writing a function to filter and summarize data into proportion table

I want to create a large proportion table that involves filtering out certain values based on one column and outputting the proportion of values equal to 0 and those greater than 0 in a table. Here's an example of the data frame (df): ID a b …

r filter summarize

asked Jul 02 '19 at 17:18

Kfin

2 answers

How to add secondary summary of previously grouped/summarized data for purposes of sorting in R with dplyr

I am plotting two groups: before and after. Each group has 2 levels: up and down. For each level I have calculated the summary stat, count. I am trying to create a new summary stat, which is the total count of each level in the database, new_count …

r dplyr summarize

asked Jun 04 '19 at 11:16

E50M

1 answer

sum count across multiple variables

I feel like this should be very easy, but I can't get it to work. Data are the three columns, fourth column is what I am looking for that I can't get to render out: eg_data <- data.frame( id = c(1,1,1,2,2,3,3,3,3,3,3,4,4,5,5,5,5), date = c("11/1",…

r group-by average summarize

asked Nov 08 '18 at 22:17

Adam_S

2 answers

Summarize data table individually for multiple columns

I am trying to summarize data across multiple columns automatically if at all possible rather than writing code for each column independently. I would like to summarize this: Patch Size Achmil Aciarv Aegpod Agrcap A 10 …

r dataframe dplyr summarize

asked Mar 26 '18 at 09:56

Kevin

2 answers

counting the occurrence of substrings in a column in R with group by

I would like to count the occurrences of a string in a column, per group. In this case the string is often a substring in a character column. I have some data e.g. ID String village 1 fd_sec, ht_rm, A 2 NA, ht_rm …

r summarize

asked Feb 19 '18 at 14:01

Rebecca Stedham
