Questions tagged [summarize]

A dplyr function, summarise() (summarize() is an alias), that creates a new data frame of summary statistics, computed per group when grouping variables are given. Use this tag along with the dplyr version being used. Mind the spelling of the function name.

summarise() creates a new data frame. It will have one (or more) rows for each combination of grouping variables; if there are no grouping variables, the output will have a single row (or more, as of 1.0.0) summarising all observations in the input. It will contain one column for each grouping variable and one column for each of the summary statistics that you have specified.
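
As a minimal sketch of the behaviour described above, assuming dplyr is installed and using the built-in mtcars data set (the column names mean_mpg and n are chosen here for illustration only):

```r
library(dplyr)

# Grouped: one row per value of cyl, with one column for the grouping
# variable and one column per summary statistic.
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg), n = n())

# Ungrouped: a single row summarising all observations in the input.
overall <- mtcars %>%
  summarise(mean_mpg = mean(mpg), n = n())
```

Here by_cyl has the columns cyl, mean_mpg and n, while overall has only the two summary columns.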

836 questions
1 vote, 0 answers

Is there a way to not show rows below a certain value in a table using Kable/KableExtra

I am trying to get a table for my dataset that only shows rows which are above a certain value, but which still uses the numbers in those rows to get the means for the supersets. Using the df diamonds, I have the following code. What I want is for…
sscoresby • 67 • 5

1 vote, 1 answer

POWER BI adding a new column to existing calculated table based on relation

I would like to ask for your help with solving the following problem: I have a table called IncomeTable with three columns: Team, Income (Value), Date (month + year).

Team   Income  Date
Sales  5000$   january 2020
Marke  2000$   march 2021

I have…
Calle • 151 • 1 • 7

1 vote, 1 answer

dplyr summarize with a dynamic number of stats/conditions

I want to summarize my data in different ways. Specifically, I want to count how many values are greater than or equal to a certain threshold. I could easily do that with e.g. library(tidyverse) mtcars |> summarize(test1 = sum(mpg > 15, na.rm =…
deschen • 10,012 • 3 • 27 • 50

1 vote, 1 answer

How to calculate mean value in R summarize statement based on a condition?

Coffee import data of several countries. Problem background: the Element col has two categorical values: Import Quantity and Import Value. The Item col has five categorical values: Coffee Green, Coffee Extracts, Coffee husks and skins, Coffee substitutes,…

1 vote, 1 answer

plot the mean on barplot without overlapping geom_text

Very simple question. I'm trying to add the means to each variable on the barplot below. The problem is that whenever I try, I get the single value for mean(varUnlist) or a bunch of duplicated values by row. By the way, are the…
Larissa Cury • 806 • 2 • 11

1 vote, 3 answers

Assign most common value of factor variable with summarize in R

R noob here, working in tidyverse / RStudio. I have a categorical / factor variable that I'd like to retain in a group_by/summarize workflow. I'd like to summarize it using a summary function that returns the most common value of that factor within…
TY Lim • 509 • 1 • 3 • 11

1 vote, 1 answer

How to combine dplyr group_by, summarise, across and multiple function outputs?

I have the following tibble: tTest = tibble(Cells = rep(c("C1", "C2", "C3"), times = 3), Gene = rep(c("G1", "G2", "G3"), each = 3), Experiment_score = 1:9, Pattern1 = 1:9, Pattern2 =…
Martingales • 169 • 1 • 8

1 vote, 2 answers

R dplyr: Group and summarize while retaining other non-numeric columns

I want to calculate grouped means of multiple columns in a dataframe. In the process, I want to retain non-numeric columns that don't vary within the grouping variable. Here's a simple example. library(dplyr) #create data frame df <-…
jeffgoblue • 319 • 1 • 3 • 11

1 vote, 1 answer

How to get a list of variables with group_by clause in R?

I am trying to get a list of string values using the group_by() clause in R. Please find sample data below. Here is what I tried. result <- data %>% group_by(station) %>% summarise(values = list(variable)) measurement_vars <- c("PRCP", "SNOW",…
Mehmet Yildirim • 471 • 1 • 4 • 17

1 vote, 2 answers

R Dplyr summarize rows except when only NAs

This is what my dataframe looks like: I have a dataframe with several columns and several rows per Participant_ID. I want to sum the data across all rows of each Participant_ID to obtain one value per Participant_ID. The problem is that some columns are empty…

1 vote, 3 answers

Groupby, filter, summarise and then apply the result to the whole column

I am working with R and I have a series x at quarterly frequency, for which I want to extract the mean over the four quarters of 2012 and store that value in all rows of a newly created column. I have this kind of dataset date durabl services…
Fef894 • 59 • 3

1 vote, 1 answer

combine redundant row items in r

I have a dataset with the names of many different plant species (column MTmatch), some of which appear repeatedly. Each of these has a column (ReadSum) with a sum associated with it (as well as many other pieces of information). How do I…
salix7 • 61 • 5

1 vote, 2 answers

How to combine across, summarize, and n() in R to get number of non-NA values by column?

I have a list of questions, and I want to know how many rows have non-NA values using summarize. I want to use summarize because I'm already using that to calculate the average, which works in the below code. Why does the below code not work and how…
J.Sabree • 2,280 • 19 • 48

1 vote, 1 answer

Count string by group (R)

Grouping by year, I would like to get the number of times a string appears in multiple variables (columns). year <- c("1993", "1994", "1995") var1 <- c("tardigrades are usually about 0.5 mm long when fully grown.", "slow steppers", "easy") var2…
onlyjust17 • 125 • 5

1 vote, 1 answer

In grouped dataframe, summarize rows which contain certain value (e.g. zeros only) in set of columns with common substring in header [R]

Given the following dataframe: dframe <- structure(list(id = c("294361-7349174-75411122", "294365-7645230-95464222", "291915-7345264-75464222", "291365-7345074-75164202", "594165-7345274-78444212", "234385-7335274-75464229",…
ramen • 691 • 4 • 20