Questions tagged [summarize]

A dplyr function (actually named summarise()) that creates a new data frame by summarising data according to the given grouping variables. Use this tag along with the dplyr version being used. Mind the spelling of the function name.

summarise() creates a new data frame. It will have one (or more) rows for each combination of grouping variables; if there are no grouping variables, the output will have a single row (or more, as of dplyr 1.0.0) summarising all observations in the input. It will contain one column for each grouping variable and one column for each of the summary statistics that you have specified.
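As a minimal sketch of the behaviour described above (assuming dplyr is installed; the data frame `df` and the column names `grp`, `count`, `mean_count`, and `n` are hypothetical and not taken from any question on this page), the difference between the ungrouped and the grouped case might look like this:

```r
library(dplyr)

# Hypothetical toy data, for illustration only
df <- data.frame(
  grp   = c("A", "A", "B", "B", "C", "C"),
  count = c(20, 10, 30, 35, 50, 60)
)

# No grouping variables: a single row summarising all observations
overall <- df %>%
  summarise(mean_count = mean(count), n = n())

# One grouping variable: one row per group, with one column for the
# grouping variable and one column per summary statistic
by_grp <- df %>%
  group_by(grp) %>%
  summarise(mean_count = mean(count), n = n())

nrow(overall)  # 1
nrow(by_grp)   # 3
```

Here `by_grp` has the columns `grp`, `mean_count`, and `n`, which matches the "one column for each grouping variable and one column for each summary statistic" layout.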

836 questions

3 answers

Divide group sum by total sum

I am using the dplyr package. Let's suppose I have the below table. Group count A 20 A 10 B 30 B 35 C 50 C 60 My goal is to create a summary table that contains the mean for each group and also the percentage of the mean of…

r dataframe dplyr summarize

asked Jul 14 '22 at 15:10

GitZine

1 answer

R: Combine rows with same ID

Edit: I changed Var4 to a string value, as my question was not precise enough about my data and answers were therefore failing because of invalid types. Sorry about that; this is my first question here and I hope someone can help me. I have the…

r merge summarize

asked Jul 04 '22 at 07:39

Aisberg

2 answers

Power BI DAX How to add column to a calculated table that summarizes another

I have a TestTable that summarizes a table Receipts on the Month column and adds a column that counts the number of times (occurrences) that each month appears in the Receipts table. TestTable = SUMMARIZE(Receipts, Receipts[Month],…

powerbi dax summarize

asked Jun 27 '22 at 05:33

Sweepster

1 answer

Kusto - Join two tables and count keys from first table and second table on every record from first table

Need to join two tables and count keys from the first table and the second table on every record from the first table. let T = datatable(TId:int, TName:string, Tkey:string) [ 1, "A", "xyz", 2, "B", "xyz", 3, "C", "yza", ]; let u = datatable(UId:int,…

azure-data-explorer kql summarize kusto-explorer

asked Jun 13 '22 at 13:09

Sahil Raj

1 answer

how to apply a function(x,y) with two variables across set of variables ending with .x and .y using dplyr

Sample data: sampdat <- data.frame(grp=rep(c("a","b","c"),c(2,3,5)), x1=seq(0,.9,0.1),x2=seq(.3,.75,0.05), y1=c(1:10), y2=c(11:20)) I would like to have the following data, but I have 100+ variables for which I'd like to apply a function with two…

r dplyr multiple-columns summarize across

asked May 26 '22 at 15:55

Sam

3 answers

How to combine multiple summarize calls dplyr?

Given the df ww <- data.frame( GM = c("A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B", "C", "C", "C", "C", "C", "C"), stanza = rep(c("Past", "Mid", "End"), 6), change = c(1, 1.1, 1.4, 1, 1.3, 1.5, 1,…

r dplyr group-by summarize

asked May 20 '22 at 15:08

Jacob

2 answers

R group columns of return trips data

I have data on train trips and the number of delayed or cancelled trains, which I would like to sum. Start End Delayed Cancelled Paris Rome 1 0 Brussels Berlin 4 6 Berlin Brussels 6 2 Rome …

r summarize columnsorting group

asked Apr 26 '22 at 13:47

hug

2 answers

How to replace na in a column with the first non-missing value without dropping cases that only have missing values using R?

I have a long data frame that has many NAs, but I want to condense it so all NAs are filled with the first non-missing value when grouped by a variable; if an observation only has NAs, it should be kept. Until I updated R, I had code that worked…

r dplyr grouping missing-data summarize

asked Mar 21 '22 at 17:39

J.Sabree

1 answer

summarise_all with additional parameter that is a vector

Say I have a data frame: df <- data.frame(a = 1:10, b = 1:10, c = 1:10) I'd like to apply several summary functions to each column, so I use dplyr::summarise_all library(dplyr) df %>% summarise_all(.funs =…

r dplyr summarize

asked Mar 11 '22 at 17:06

Dan

4 answers

Sum rows with value larger than n into one in R

I have a data frame: df <- data.frame(count=c(0,1,2,3,4,5,6), value=c(100,50,60,70,2,6,8)) count value 1 0 100 2 1 50 3 2 60 4 3 70 5 4 2 6 5 6 7 6 8 How do I sum value larger than "n" into one…

r dataframe dplyr summarize

asked Aug 26 '21 at 17:20

Algorithman

2 answers

"group_by->summarise->mean()" taking way longer than expected

I have a dataset of around 4.2 million observations. My code is below: new_dataframe = original_dataframe %>% group_by(user_id, date) %>% summarise(delay = mean(delay, na.rm=TRUE) ) This pipeline should be taking a 4.2 million x 3…

r dplyr group-by tidyverse summarize

asked Jul 07 '21 at 22:23

tvbc

2 answers

Perform group by on a column to calculate count of occurrences of another column in R

I have a dataset similar to sample dataset provided below: | Name | Response_days | state | |------|---------------|-------| | John | 0 | NY | | John | 6 | NY | | John | 9 | NY | | Mike | 3 |…

r group-by count summarize

asked May 23 '21 at 07:41

hk2

4 answers

Summary statistics for multiple variables with statistics as rows and variables as columns?

I'm trying to use dplyr::summarize() and dplyr::across() to obtain a tibble with several summary statistics in the rows and the variables in the columns. I was only able to achieve this result by using dplyr::bind_rows(), but I'm wondering if…

r dplyr tidyverse summarize across

asked May 18 '21 at 15:54

Lucas De Abreu Maia

3 answers

Summarise multiple columns that have to be grouped tidyverse

I have a data frame containing data that looks something like this: df <- data.frame( group1 = c("High","High","High","Low","Low","Low"), group2 = c("male","female","male","female","male","female"), one =…

r tidyverse summarize

asked May 12 '21 at 06:12

Jeff238

3 answers

How to aggregate R dataframe of two columns based on values of another

My dataframe is as follows, in which gender=="1" refers to men and gender=="2" refers to women. Occupations go from A to U and years go from 2010 to 2018 (here is a small example). Gender Occupation Year 1 A 2010 1 …

r dataframe aggregate summarize

asked Apr 30 '21 at 17:44

Ana
