Questions tagged [summarize]

A dplyr verb (actually named summarise(), with summarize() as an alias) that creates a new data frame by summarising data according to the given grouping variables. Use this tag along with the dplyr version being used. Mind the spelling of the function name.

summarise() creates a new data frame. It will have one (or more) rows for each combination of grouping variables; if there are no grouping variables, the output will have a single row (or more, as of 1.0.0) summarising all observations in the input. It will contain one column for each grouping variable and one column for each of the summary statistics that you have specified.
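The behaviour described above can be illustrated with a minimal sketch. It assumes dplyr is installed and uses the built-in mtcars data set; the object names by_cyl and overall are illustrative only and are not part of the tag description.

```r
library(dplyr)

# Grouped input: one row per value of cyl, with one column for the
# grouping variable and one column per summary statistic.
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(n = n(), mean_mpg = mean(mpg))

# Ungrouped input: a single row summarising all observations.
overall <- mtcars %>%
  summarise(n = n(), mean_mpg = mean(mpg))

print(by_cyl)
print(overall)
```

Here by_cyl has the columns cyl, n and mean_mpg, while overall has only the two summary columns because there is no grouping variable.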

836 questions
1 vote, 1 answer

custom function does not work on column named "x" unless specified by .$x in summarise() dplyr R

I wanted to create a custom function to calculate confidence intervals of a column by creating two columns called lower.bound and upper.bound. I also wanted this function to work within the dplyr::summarize() function. The function works as…
1 vote, 1 answer

Simulink: Vector summation and saving the output to workspace

I cannot solve a very simple problem in Simulink: summing 2 equal-size vectors and writing the result to the MATLAB workspace. This trivial operation, which takes 1 line in MATLAB, seems to be a real problem in Simulink. I have 2 vectors with…
Abracadabra • 201 • 1 • 10
1 vote, 2 answers

Summarizing grouped and ungrouped data in a single tibble with tidyverse in R

To summarize each variable in my data, I typically create two tables: one that groups by experimental condition and a second that displays aggregated statistics across all experimental groups. However, I'd like to display both grouped and aggregated…
1 vote, 2 answers

Performing operations on dplyr summaries

Assume we have some random data: data <- data.frame(ID = rep(seq(1:3),3), Var = sample(1:9, 9)) we can compute summarizing operations using dplyr, like this: library(dplyr) data%>% group_by(ID)%>% summarize(count =…
Ryan • 1,048 • 7 • 14
1 vote, 2 answers

R: Using doBy with Dates

I am doing some coding in R. I am trying to use the doBy package to get a sum total score for a variable (x) by both date (date) and by id (id). The doBy command works fine and I get this output. data id date x 1 01/01/2021 1 1 01/02/2021…
1 vote, 3 answers

tidyverse: append rows of totals in summary output

I want to append rows of totals to the output of summarise used with group_by. Data <- structure(list(CT = c("1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2"), SCT = c("1", "1", "1", "1",…
MYaseen208 • 22,666 • 37 • 165 • 309
1 vote, 3 answers

Convert Daily Data into Weekly Data and summarize multiple columns in R

I want to change the following data set: date A B 01/01/2018 391 585 02/01/2018 420 595 03/01/2018 455 642 04/01/2018 469 654 05/01/2018 611 900 06/01/2018 449 640 07/01/2018 335 522 08/01/2018 726 955 09/01/2018 676…
1 vote, 2 answers

Using summarize across with multiple functions when there are missing values

If I want to get the mean and sum of all the numeric columns using the mtcars data set, I would use the following code: group_by(gear) %>% summarise(across(where(is.numeric), list(mean = mean, sum = sum))) But if I have missing values in some of…
Anup • 239 • 2 • 11
1 vote, 2 answers

R: What is the expected output of passing a character vector to dplyr::all_of()?

I am trying to understand the expected output of dplyr::group_by() in conjunction with the use of dplyr::all_of(). My understanding is that using dplyr::all_of() should convert character vectors containing variable names to the bare names so that…
socialscientist • 3,759 • 5 • 23 • 58
1 vote, 1 answer

Create new variable that summarizes observation given a certain condition

Hello, I'm new to R and I don't understand why my following approach does not work. I have this df1 that looks something like this: view duration_hours date 1 a 5 2021-03-29 2 a 7 2021-03-29 …
Oliver • 39 • 3
1 vote, 2 answers

Issues with combining a case when into a cumsum calculation in R

Below is the sample data and my attempt at this. My primary question is how I would get the smbsummary3 data frame to show values of small = 2, 3, or 4 when they do not exist in the source data. My summarise section calculates correctly. Do I need…
Tim Wilcox • 1,275 • 2 • 19 • 43
1 vote, 1 answer

How to create two columns that count the total number of two conditions

I have a diabetes dataset that has a column called Outcome and only has two values, 1 = Diabetes, 0 = Non-Diabetes. I want to count the total number of 1's and 0's based on age and then have a % of 1's based on age. I have this code below: by_age1…
1 vote, 2 answers

dplyr summarize by preferred string value

I have a data frame with IDs and string values, of which some I prefer over others: library(dplyr) d1<-data.frame(id=c("a", "a", "b", "b"), value=c("good", "better", "good", "good")) I want to handle that equivalent to the following…
aldorado • 4,394 • 10 • 35 • 46
1 vote, 1 answer

Rearrange dplyr groupby output with exactly two factors?

I'm finding this problem hard to search about because the terms summarize, groupby, rearrange, table are just so generic. What I'd like to do is summarize a value after grouping by exactly two factors, and put the result in a table with rows/columns…
dim fish • 469 • 1 • 4 • 12
1 vote, 1 answer

Issues with creating gt table after adding new fields

Below is the sample data, packages, and the manipulations. Part 3 and Part 4 are where the core question lies. The goal here is to produce a table that has the employment by smb category and time period. If I leave part 3 out and the col_order item,…
Tim Wilcox • 1,275 • 2 • 19 • 43