Questions tagged [summarize]

A dplyr function (spelled summarise(), with summarize() available as an alias) that creates a new data frame by collapsing the data to summary rows for the given grouping variables. Use this tag along with the dplyr version being used. Mind the spelling in the function name.

summarise() creates a new data frame. It will have one (or more) rows for each combination of grouping variables; if there are no grouping variables, the output will have a single row (or more, as of dplyr 1.0.0) summarising all observations in the input. It will contain one column for each grouping variable and one column for each of the summary statistics that you have specified.
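The behaviour described above can be illustrated with a minimal sketch. The data frame `df` and its columns `g` and `x` are hypothetical names invented for this example, and the code assumes dplyr 1.0.0 or later (for the `.groups` argument).

```r
library(dplyr)

# Hypothetical toy data: one grouping column and one numeric column
df <- data.frame(
  g = c("a", "a", "b"),
  x = c(1, 3, 10)
)

# No grouping variables: a single row summarising all observations
overall <- df %>%
  summarise(mean_x = mean(x), n = n())

# One row per value of g, with one column for the grouping variable
# and one per summary statistic; .groups = "drop" returns an ungrouped result
by_group <- df %>%
  group_by(g) %>%
  summarise(mean_x = mean(x), n = n(), .groups = "drop")
```

Here `overall` has one row and `by_group` has one row for each distinct value of `g`, with the columns `g`, `mean_x` and `n`.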

836 questions

2 answers

Efficient way to create a dataframe with multiple summary columns based on a grouped dataframe using dplyr in R

I have a dataframe similar to this dummy: dframe <- structure(list(id = c("294361-7349174-75411122", "294365-7645230-95464222", "291915-7345264-75464222", "291365-7345074-75164202", "594165-7345274-78444212", "234385-7335274-75464229",…

r dplyr summarize

asked Sep 20 '22 at 20:48

ramen

1 answer

Smooth multiple columns of dataset with summarize

I am trying to smooth a dataset by rounding the variable "depth" and then applying the function summarize to it. mean_safely <- possibly(.f = mean, otherwise = NA) SdesGG <- SdesGG %>% filter(., depth > 2) %>% mutate(depth = round(depth,…

r dplyr summarize

asked Sep 18 '22 at 15:00

C. Guff

1 answer

Group_by ID, then keep row with attribute R

Table: ID <- c("01", "01", "02", "02") Accept_Medicare <- c("Opt-out", "Accept", "Opt-Out", "Accept") Data <- c("yes", "no", "no", "no") I have a dataset with multiple rows sharing the same ID, and a column "Accept_Medicare." I want to deduplicate the data…

r summarize across

asked Sep 14 '22 at 00:03

benzinga

0 answers

How to apply function with multiple outputs on each group in R and store results in different columns?

Suppose I am using panel data: for each individual and time, there is an observation of a numerical variable. I want to apply a function to this numerical variable but this function outputs a vector of numbers. I'd like to apply this function over…

r dplyr time-series panel-data summarize

asked Aug 30 '22 at 16:49

Raul Guarini Riva

2 answers

Kusto: Self join table and get values from different rows

Working with a similar dataset as below, I am able to get the desired output by using scan operator, to fill forward strings/bools in test dataset, however it's timing out for larger datasets, as every property has many events and there are millions…

azure-data-explorer kql summarize kusto-explorer

asked Aug 13 '22 at 17:45

Sahil Raj

1 answer

Difference between .groups argument and ungroup() in dplyr?

I'm looking at some code: df1 <- inner_join(metadata, otu_counts, by="sample_id") %>% inner_join(., taxonomy, by="otu") %>% group_by(sample_id) %>% mutate(rel_abund = count / sum(count)) %>% ungroup() %>% select(-count) This first…

r dplyr group-by summarize

asked Aug 03 '22 at 00:17

Antonio

1 answer

Looking for an R function that counts number of times two columns appear together

I have a data.frame with many rows. I am trying to produce a new data.frame summarizing the total row count for all combinations of V_ID and N_ID. In the below, df1 is an example of my data and df2 is an example of the desired output. df1 <-…

r dplyr summarize

asked Jul 22 '22 at 19:31

E Norton

2 answers

Collapse and summarize while maintaining most frequent character variable by group

I have a data frame: df <- data.frame(resource = c("gold", "gold", "gold", "silver", "silver", "gold", "silver", "bronze"), amount = c(500, 2000, 4, 8, 100, 2000, 3, 5), unit = c("g", "g", "kg", "ton", "kg", "g", "ton", "kg"), price = c(10, 10,…

r summarize

asked Jul 18 '22 at 12:48

Anton

1 answer

calculating count for a column of dates

I want to calculate the mean and standard deviation for the number of dates (or visits) that people have. Sample data are: id date 1 2015-02-23 1 2015-04-24 2 2018-05-23 2 2022-12-05 2 2022-12-06 3 2021-05-21 ID1 has 2 visits…

r count summarize

asked Jun 28 '22 at 19:44

D. Fowler

0 answers

Conditionally concatenate strings in R / tidyverse

I have a dataset that is structured like this: book chapter verse text 1 1 1 string1 1 1 2 string2 1 2 1 string3 1 2 2 string4 2 1 1 string5 2 1 2 string6 2 2 1 string7 2 2 2 string8 And my intended output…

r group-by concatenation tidyr summarize

asked Jun 24 '22 at 23:07

Laurin Kub

1 answer

group_by() and summarise() keeping the values without grouping

I want to summarise values of two created groups and keep the values of the total sample. What I have so far: data <- structure(list(big_four = c(0L, 0L, 0L, 1L, 1L, 0L), idade_em_2022 = c(46L,38L, 40L, 23L, 27L, 27L), total_de_cooperados = c(8665L,…

r dplyr group-by summarize

asked Jun 22 '22 at 05:34

RxT

1 answer

How best to calculate relative shares of different columns in R?

Below is the sample data and code. I have two issues. First, I need the indtotal column to be the sum by the twodigit code and have it stay constant as shown below. The reason is so that I can do a simple calculation of one column divided by the…

r dplyr summarize

asked Jun 14 '22 at 18:46

Tim Wilcox

2 answers

R dplyr summarise mean and stdev using group_by

I have a dataframe that looks like this: df <- data.frame("Experiment" = c(rep("Exp1", 6), rep("Exp2", 5), rep("Exp3", 4)), "Replicate" = c("A","A","A","B","C","C","A","A","B","B","C","A","B","B","C"), "Type" =…

r dplyr summarize

asked May 19 '22 at 05:39

Jen

1 answer

How to put the results of the summarise() function into the dataframe, using r?

This question follows up on my previous question (how to put the results of summarise() function into the dataframe in r), where I think I did not convey my question well, so I have added more details. I made a minimal reproducible example, but my real data is…

r summarize

asked May 12 '22 at 03:20

yoo

1 answer

How to dissolve the dataset on multiple conditions - R

Consider the following dataset: ID Start time End…

r filter summarize

asked Apr 29 '22 at 11:34

Foeke Boersma
