Questions tagged [summarize]

A dplyr function (actually named summarise()) that creates a new data frame by summarising data according to the given grouping variables. State the dplyr version you are using alongside this tag, and mind the spelling of the function name.

summarise() creates a new data frame. It will have one (or more) rows for each combination of grouping variables; if there are no grouping variables, the output will have a single row (or more, as of 1.0.0) summarising all observations in the input. It will contain one column for each grouping variable and one column for each of the summary statistics that you have specified.
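The behaviour described above can be sketched with a small example. The `scores` data frame and its columns are hypothetical, invented only for illustration, and the sketch assumes dplyr 1.0.0 or later (for the `.groups` argument).

```r
library(dplyr)

# Hypothetical toy data, not taken from any question on this page
scores <- tibble(
  team  = c("a", "a", "b", "b", "b"),
  score = c(1, 3, 2, 4, 6)
)

# No grouping variables: a single row summarising all observations
overall <- scores %>%
  summarise(n = n(), mean_score = mean(score))

# One grouping variable: one row per team, with one column for the
# grouping variable and one column per summary statistic
by_team <- scores %>%
  group_by(team) %>%
  summarise(n = n(), mean_score = mean(score), .groups = "drop")
```

Here `overall` has one row and `by_team` has one row per distinct `team`, with columns `team`, `n`, and `mean_score`.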

836 questions
2
votes
1 answer

dplyr: workflow to subset, summarize, and mutate new function

I am trying to figure out the most efficient way to achieve a series of goals to group my data, summarize columns, and mutate a new column based on the summary. With the example data below, I want to: mutate a new column "sum", which would be the…
slane
  • 61
  • 5
2
votes
1 answer

aggregate function in R, sum of NAs are 0

I saw a list of questions on Stack Overflow regarding the following, but never found a satisfactory answer. I will follow up on the following question: Blend of na.omit and na.pass using aggregate? > test <- data.frame(name = rep(c("A", "B",…
okboi
  • 23
  • 2
2
votes
1 answer

Summarize and count the number of unique values in grouped df with dplyr

I have this df: structure(list(CN = c("BR", "BR", "BR", "PL", "PL", "PL", "BR", "BR", "BR", "BR", "PL", "PL", "PL"), Year = c(2019, 2019, 2019, 2019, 2019, 2019, 2020, 2020, 2020, 2020, 2020, 2020, 2020), Squad = c("A", "B", "C", "A", "B", "C",…
Cristiano
  • 233
  • 1
  • 9
2
votes
1 answer

Formatting dates when converting to list

I have a table like this: ID Date Status 101 2020-09-14 1 102 2020-09-14 1 103 2020-09-14 1 104 2020-09-14 2 105 2020-09-14 2 106 2020-09-14 2 But want a table like this: Status ID Date 1 101,102,103 2020-09-14,…
user17582908
2
votes
4 answers

How to use R summarise with multiple numeric and text-based conditional subsets

I have a table containing two rows for each ID. table <- tibble( id = c(1,1,2,2,3,3,4,4,5,5), row1 = c(2,5,2,5,1,3,2,5,3,2), row2 = c("foo", "other foo", "bar", "bar", "bar", "bar other", "other", "foo", "other", "other") ) > table # A tibble:…
2
votes
1 answer

How do I summarize unique values of group in one column using DPLYR?

At the moment I have the following code: categories <- df %>% #this is a very large df but that should not matter to my question group_by(category, subcategory, IV_type) %>% summarise(n = n()) Which produces the…
DeMelkbroer
  • 629
  • 1
  • 6
  • 21
2
votes
2 answers

Trying to find count in R "pivot table"

library(dplyr) data %>% select(trade, before.pay,after.pay,four.after.pay) %>% group_by(trade) %>% summarise(across(everything(), .f = list(median = median, max = max, min = min, count = n), na.rm = TRUE)) I can get this to work for mean,…
Ben
  • 153
  • 1
  • 2
  • 8
2
votes
2 answers

How to summarize points to a linestring and keep data frame columns in R?

I'm working with this code to turn a group of points into lines. But, in addition to the "sub_id" the rows have another "id" (column in input dataframe) that I would like to be kept in the final object. How can I do…
2
votes
3 answers

count nonzero values in each column tidyverse

I have a df with a bunch of sites and a bunch of variables. I need to count the number of non-zero values for each site. I feel like I should be able to do this with summarize() and count() or tally(), but can't quite figure it out. reprex: df <- …
Jake L
  • 987
  • 9
  • 21
2
votes
1 answer

R summarize across with multiple functions

I have a data frame where I am grouping by county, and then trying to summarize the rest of the data using summarise across. Some of the variables I would like to sum across, while other variables I would like to average across. Here is my sample…
KLenny
  • 85
  • 6
2
votes
1 answer

Dplyr: using summarise across to take mean of columns only if row value > 0

I have a dataframe of gene expression scores (cells x genes). I also have the cluster that each cell belongs to stored as a column. I want to calculate the mean expression values per cluster for a group of genes (columns), however, I only want to…
Darren
  • 277
  • 4
  • 17
2
votes
1 answer

grouped summarize still gives result for each individual row

I have the following data: library(tidyverse) df <- data.frame(id = c(1,1,1,2,2,2), x = rep(letters[1:2], each = 3), y = c(3,4,3,5,6,5), z = c(7,8,9,10,11,12)) I now want to summarize the data by…
deschen
  • 10,012
  • 3
  • 27
  • 50
2
votes
1 answer

Combine columns of a matrix and replace with means

I have a matrix with many columns that need to be combined separately. I want to combine them and take the means across rows and then put them into a new matrix. TYIA I have a matrix called data like this: p7.1 p7.2 p7.3 p8.1 p8.2 …
jimuq233
  • 53
  • 4
2
votes
2 answers

Subtracting cell in one row from cell in another row when summarizing grouped data with dplyr?

Background: I have data from a simulation where I have a few variables and thus many resulting combinations of parameters. Due to the internal design of the simulation there can be a little variation among the outcomes of identical sets of…
Stan Rhodes
  • 360
  • 2
  • 14
2
votes
1 answer

Is there a way to get a COUNTIF like summary in R that also shows proportions?

I am trying to summarise between my variables in R, and my data looks like this: id Function V t 1 adverb 0 1 2 discourse 1 1 3 filler 1 0 4 discourse 1 1 5 filler 0 0 6 adverb 1 1 7 adverb 1 1 What I…