Questions tagged [summarize]

A dplyr function (actually named summarise()) that creates a new data frame by summarising data according to the given grouping variables. Use this tag along with the tag for the dplyr version being used. Mind the spelling of the function name.

summarise() creates a new data frame. It will have one (or more) rows for each combination of grouping variables; if there are no grouping variables, the output will have a single row (or more, as of 1.0.0) summarising all observations in the input. It will contain one column for each grouping variable and one column for each of the summary statistics that you have specified.
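As a minimal sketch of the behaviour described above (assuming dplyr is installed, and using the built-in mtcars data purely for illustration; the column names n and mean_mpg are arbitrary choices, not part of the tag description):

```r
library(dplyr)

# Grouped case: one row per value of the grouping variable `cyl`,
# with one column for `cyl` and one per summary statistic.
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(n = n(), mean_mpg = mean(mpg))

# Ungrouped case: the whole input collapses to a single row.
overall <- mtcars %>%
  summarise(n = n(), mean_mpg = mean(mpg))
```

Here by_cyl has the columns cyl, n and mean_mpg, while overall has only n and mean_mpg.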

836 questions
1 vote, 2 answers

Count the occurrence of an element in the group without summarizing

I have a dataset that looks like this: x <- data.table(id=c(1,1,1,2,2,3,4,4,4,4), cl=c("a","b","c","b","b","a","a","b","c","a")) I am trying to find the probability of a row getting picked for each group (id) based on the elements in cl. I tried the…
K_D
  • 147
  • 8
1 vote, 1 answer

Is it possible with dplyr to filter a dataframe with output created by summarize within one pipe?

I have a dataframe with one numerical value and one 5-level factor variable. # set seed for reproducibility set.seed(123) df <- tibble(group = rep(c("a", "b", "c", "d", "e"), each = 20), values = c(rnorm(20, 0, 1), rnorm(20, 1, 1),…
Fabi_Nutri
  • 85
  • 5
1 vote, 1 answer

dplyr, how to group observations based on codes, count and create summary variable then add a new variable based on names within the groups

I have multiple addresses that I want to group together and create a tally for. However, they vary in format. I've geocoded the addresses and plan to group them using the geocodes; however, when grouping them I want to create a new variable…
Benj Young
  • 59
  • 5
1 vote, 6 answers

Count characters and summarize values per group

I can't seem to find proper code for my problem. I want to create groups and summarize (sum, count or length) other columns based on different conditions. I've tried group_by and summarize with different conditions but haven't found anything that…
Eli
  • 43
  • 6
1 vote, 1 answer

Applying same operation to several columns using tidyverse summarize

I'm trying to create a summary table that gives me the proportion of yes responses for 17 questions sorted by year. I just don't know how to apply the summarize operation to multiple columns easily without hard-coding it. Unfortunately, I can't use…
Tdag
  • 23
  • 3
1 vote, 1 answer

Inconsistent ddply multiple quantiles by group

I am trying to use ddply to summarize the median and 25th/75th percentiles of multiple groups in a relatively small data set. I am grouping the measured data points AUC_INFobs and Cmax by DoseWt. (Using R 4.0.4 in RStudio 1.3.1093 on Windows…
PHutson
  • 23
  • 4
1 vote, 1 answer

R Summary table in percentage with summarise_at or _all using 3 different functions and reduce inner join

I have seen the post here: https://github.com/tidyverse/dplyr/issues/3101 and tried to work with summarise_at; however, it doesn't work with 3 functions, and I found it simpler to use summarise_all. Is there any way to reduce the inner join from…
mjberlin15
  • 147
  • 1
  • 10
1 vote, 2 answers

Use summarize and a for loop taking column names from a character vector

I have a dataset which I cannot share here, but I need to create columns using a for loop and the column names should come from a character vector. Below I try to replicate what I am trying to achieve using the flights dataset from the nycflights13…
Anup
  • 239
  • 2
  • 11
1 vote, 1 answer

Summarize with a function that returns multiple values in a list

I have a function that receives two vectors and returns a list of parameters, similar to this one: f <- function(x, n) { mu <- sum(x)/sum(n) max <- max(x/n) min <- min(x/n) return(list(mu = mu, max = max, min = min)) } Now, I want…
salva
  • 9,943
  • 4
  • 29
  • 57
1 vote, 2 answers

Multiple summary counts and a flag across variable number of rows

I have the following starting point: id.s <- c(1,1,2,2,2,3,3,3,3,4,4,4) test.s <- c("Negative", "Positive", "Positive", "Negative", "Positive", "Negative", "Negative", "Negative", "Positive", "Negative", "Negative", "Negative") Start…
dr_canak
  • 55
  • 3
1 vote, 1 answer

Using R to count character occurrences in multiple columns of data.frame

I'm new to R and have a data.frame with 100 columns. Each column is character data and I am trying to make a summary of how many times a character shows up in each column. I would like to be able to make a summary of all the columns at once without…
clions226
  • 81
  • 1
  • 9
1 vote, 4 answers

Summary data by column in R

I have the following data pt_id <- c(1,1,1,1,1,2,2,2,3,3,3,3,3,4,4,4,4) Tob_pk <- c(2, 5, 7, 1, 8, 12, 14, 3, 6, 8, 10, 20, 13, 5, 4, 12, 10) Tobacco <- c("Once","Twice","Never", NA, NA, NA, NA, NA,"Once","Twice","Quit","Once",NA,NA,"Never", NA,…
PriyamK
  • 141
  • 10
1 vote, 2 answers

Using mtcars data to make a summarised table of cylinders versus centered(mpg)

Bear with me... I am using R/RStudio with the mtcars data and the dplyr mutate and summarise commands. I also tried group_by. I want to center the values of mtcars$mpg, then take that info and display the summary of the number of cylinders vs centered…
mccurcio
  • 1,294
  • 5
  • 25
  • 44
1 vote, 0 answers

How can I summarize a factor with 4 levels

I've got 34 variables. One is a factor (continente) which has 4 levels, and "PAT_ONCO" is one type of hospital patient. I want to group by continent but I can't use summarize because continent doesn't have exactly 2…
1 vote, 1 answer

Pass a concatenated string as column name in dplyr::summarise

I am trying to perform dplyr summarize iteratively using a concatenated string as column…
Vignesh
  • 912
  • 6
  • 13
  • 24