Questions tagged [summarize]

A dplyr function (actually named summarise(); summarize() is a synonym) that creates a new data frame by summarising data according to the given grouping variables. Use this tag along with a tag for the dplyr version being used. Mind the spelling of the function name.

summarise() creates a new data frame. It will have one (or more) rows for each combination of grouping variables; if there are no grouping variables, the output will have a single row (or more, as of 1.0.0) summarising all observations in the input. It will contain one column for each grouping variable and one column for each of the summary statistics that you have specified.
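A minimal sketch of the behaviour described above, assuming dplyr is installed. The data frame `df` and its columns `sex` and `age` are hypothetical toy data made up for illustration, not taken from any question on this page.

```r
library(dplyr)

# Hypothetical toy data (an assumption for illustration only)
df <- data.frame(
  sex = c("M", "F", "M", "F"),
  age = c(20, 24, 22, 26)
)

# No grouping variables: a single row summarising all observations
overall <- df %>%
  summarise(mean_age = mean(age), n = n())

# One grouping variable: one row per group, with one column for the
# grouping variable and one column per summary statistic
by_sex <- df %>%
  group_by(sex) %>%
  summarise(mean_age = mean(age), n = n())
```

Here `overall` should have one row, while `by_sex` should have one row per value of `sex` and the columns `sex`, `mean_age`, and `n`.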

836 questions
4 votes, 3 answers

Getting summary by group and overall using tidyverse

I am trying to find a way to get summary stats such as means by group and overall in one step using dplyr #Data set-up sex <- sample(c("M", "F"), size=100, replace=TRUE) age <- rnorm(n=100, mean=20 + 4*(sex=="F"), sd=0.1) dsn <- data.frame(sex,…
SimRock
4 votes, 6 answers

Summarize a Variable by All But Group

I have a data.frame and I need to calculate the mean per "anti-group" (i.e. per Name, below). Name Month Rate1 Rate2 Aira 1 12 23 Aira 2 18 73 Aira 3 19 45 Ben 1 53 …
tubaguy
4 votes, 4 answers

Replace multiple `summarize` statements by function

I'm currently repeating a lot of code, since I always need to summarize the same columns for different groups. How can I do this effectively by writing the summarize function (which is always the same) only once, but define the output name and group_by…
huan
4 votes, 3 answers

dplyr summarise_all with quantile and other functions

I have a dataframe PatientA Height Weight Age BMI 1 161 72.2 27 27.9 2 164 61.0 21 22.8 3 171 72.0 30 24.6 4 169. 63.9 25 22.9 5 174. 64.4 27 21.1 6 160 50.9 …
Artem Zefirov
4 votes, 1 answer

add column total to new row in data frame R

Suppose I have the following data. A <- c(4,4,4,4) B <- c(1,2,3,4) C <- c(1,2,4,4) D <- c(3,2,4,1) data <- as.data.frame(rbind(A,B,C,D)) data <- t(data) data <- as.data.frame(data) > data A B C D V1 4 1 1 3 V2 4 2 2 2 V3 4 3 4 4 …
Ellie
4 votes, 2 answers

dplyr: summarise each column and return list columns

I am looking to summarize each column in a tibble with a custom summary function that will return different sized tibbles depending on the data. Let’s say my summary function is this: mysummary <- function(x) {quantile(x)[1:sample(1:5, 1)] %>%…
crlwbm
4 votes, 1 answer

Summarise_each for first non-NA value

Is there a way to instruct dplyr to use summarise_each with specification first and na.rm=TRUE? I have a dataframe with many NAs and numeric values. Column A is patient ID. I would like to summarise the dataframe according to patient ID by taking…
obruzzi
4 votes, 1 answer

dplyr::summarize_at – sort columns by order of variables passed, then by order of functions applied

Problem By using dplyr::summarize_at() (or equivalent), I would like to get a table of summaries in which columns are sorted first by (G) order of grouping variables used, then by (V) order of variables passed and lastly by (F) order of functions…
GegznaV
4 votes, 4 answers

R dplyr summarise multiple functions to selected variables

I have a dataset for which I want to summarise by mean, but also calculate the max to just 1 of the variables. Let me start with an example of what I would like to achieve: iris %>% group_by(Species) %>% filter(Sepal.Length > 5) %>% …
Jordi Vidal
3 votes, 3 answers

Summarise proportions of character values across columns in table

In this kind of data frame: df <- data.frame( w1 = c("A","A","B","C","A"), w2 = c("C","A","A","C","C"), w3 = c("C","A","B","C","B") ) I need to calculate across all columns the within-column proportions of the character values.…
Chris Ruehlemann
3 votes, 1 answer

Why does using c() on a list column not work with dplyr summarize?

I have a list-column and I would like to use c() for each group to combine these lists in summarize. This should result in one row per group, but it does not (note the code was written using dplyr >= 1.1.0): library(dplyr) df <-…
LMc
3 votes, 4 answers

Group and add variable of type stock and another type in a single step?

I want to group by district, summing the 'incoming' values over the quarters and taking the value of 'stock' in the last quarter (3), in just one step. 'stock' cannot be summed across quarters. My example dataframe: library(dplyr) df <- data.frame ("district"=…
Modus
3 votes, 1 answer

How do I count the percentage makeup of TRUES in a table?

This is the code used to derive the first table in my question. JH %>% group_by(ATT_ID, CAR=="B") %>% summarize(count = n(), .groups = "drop") ATT_ID CAR ==…
Antonio
3 votes, 2 answers

How to combine scattered values into one row using R

I have a dataset like this and I want the desired dataset as below. dat <- read.table(text="Id Bug Drug1 Drug2 A Staph NA S A Staph S NA A E.coli NA S A E.coli S NA", header=TRUE) dat.desired <- read.table(text="Id …
YASIR AA
3 votes, 1 answer

How do I average values using group_by and summarise where some entries are NA?

I have the following dummy data. Country <- c("Afghanistan", "Afghanistan", "Afghanistan", "Albania", "Albania", "Albania") Year <- c(2001, 2002, 2003, 2001, 2002, 2003) Count <- c(15, 18, NA, 12, 17, 19) I want to find the mean of count by…
Jay Bee