Questions tagged [summarize]

A dplyr function (actually named summarise()) that creates a new data frame by summarising data according to the given grouping variables. Use this tag along with the dplyr version being used. Mind the spelling of the function name.

summarise() creates a new data frame. It will have one (or more) rows for each combination of grouping variables; if there are no grouping variables, the output will have a single row (or more, as of 1.0.0) summarising all observations in the input. It will contain one column for each grouping variable and one column for each of the summary statistics that you have specified.
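
As a rough illustration of the behaviour described above, the sketch below runs summarise() once on grouped data and once on ungrouped data. The data frame and its column names (group, value) are made up for this example, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical toy data, not taken from any question on this page
df <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 2, 3, 4, 5)
)

# Grouped input: one row per group, with one column for the grouping
# variable and one column per summary statistic
by_group <- df %>%
  group_by(group) %>%
  summarise(n = n(), mean_value = mean(value), .groups = "drop")

# Ungrouped input: a single row summarising all observations
overall <- df %>%
  summarise(n = n(), mean_value = mean(value))

print(by_group)
print(overall)
```

Here by_group has the columns group, n and mean_value, while overall has only n and mean_value because there is no grouping variable.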

836 questions
1
vote
1 answer

R function that summarizes rows (grouped_by) but disregards duplicate strings and NA

Very grateful for your help. I am trying to group/collapse rows of all columns (all but the two columns I use to group_by) and would like to exclude duplicated strings in the merge (only keep distinct strings and numbers). The df has many many more…
Sanna
1
vote
2 answers

How to summarize several independent variables at once in R?

For example, if the data is like below, Cultivar=rep(c("CV1","CV2"),each=12) Nitrogen=rep(rep(c("N0","N1","N2","N3"), each=3),2) Block=rep(c("I","II","III"),8) Yield=c(99,109,89,115,142,133,121,157,142,125,150,139,82,104,99,117, …
Jin.w.Kim
1
vote
3 answers

Count occurrence of value in repeated measure

Hi I have the dataset below: ID <- c(1,1,1,2,2,3,3,3,4,4,4) diagnosis <- c("A","A","B","C","C","B","A","A","C","C","B") df <- data.frame(ID,diagnosis) ID diagnosis 1 A 1 A 1 B 2 C 2 C 3 B 3 A 3 A 4 C 4 C 4 B I would like to count how…
Bruh
1
vote
1 answer

How to sum rows of selected columns and rbind to a similar formatted dataset

I have a dataset that looks like this. Day|Population|Red|Yellow|Orange|Green 1 30 15 3 4 8 2 50 10 30 5 5 3 10 3 6 1 0 4 25 2 10 10 3 I…
user35131
1
vote
2 answers

Aggregate, dcast and create new columns in R

I have a data frame for every second. From the data frame with 1 second interval, I managed to aggregate the data into 1-minute interval using the following code: agg_cont <- df %>% group_by(Date, Hour, Minute, Status, Mean) %>% count(name =…
Karthik
1
vote
1 answer

R: simultaneously summarize several variables by changing the aggregation function according to the type entered in a metadata table

I've got a df with several variables, and I want to summarize them simultaneously, with the summary function chosen according to the type of each variable. The difficulty is that I want to use the variable type information from another metadata df…
pgourdon
1
vote
1 answer

Group by and summarize percentage based on dichotomous variable

Hi I have this dataset here: diagnoses_2_or_more and diagnoses_3_or_more are categorical where 1 indicates yes and 0 indicates no. id <- c(1,2,3,4,5,6,7) grp <- c("1","1","1","2","2","3","3") diagnosis_2_or_more <-…
Bruh
1
vote
2 answers

Count multiple columns with categorical data after being grouped by another column

I have a large data set (49000 X 118) and I would like to group by one column and then get a summary of multiple columns. The issue with my data is that the summary of each column has a different length. Here is a simple example…
Chi
1
vote
3 answers

Summarizing and grouping rows in R

I'm an R novice and I'm currently working with a data frame in R that looks like the following. City p54_1 p54_2 p54_3 p54_4 p54_5 p54_6 p54_7 p54_8 p54_9 p54_10 p54_11 p54_12 19 Apodaca 0 0 1 1 1 1 1 0 0 …
1
vote
2 answers

Aggregating character and numeric data from two dataframes into a new dataframe

I'm trying to create a new dataframe that joins the names of some columns and their values. My inputs look like this: input1 = structure(list(Date = structure(c(1677502800, 1677502800, 1677502800, 1677502800, 1677502800, 1677502800), class =…
1
vote
0 answers

function summarise with dplyr

I have a large job-exposure database, and I wanted to calculate the duration of exposure of each subject to each agent. But a subject can be exposed to an agent through different jobs. For each job, I have the start year and end year. There are…
R_help
1
vote
2 answers

Collapse / Merge multiple rows with non empty cells/ values

I am trying to merge two rows by a similar group, which I did by looking at different questions on Stack Overflow (Question1, Question2, Question3). All these questions stated what I want but I also have some empty fields in my data frame and I don't…
Usman YousafZai
1
vote
1 answer

dplyr summarise across output as rows?

I would like to generate overview tables for the same statistics (e.g., n, mean, sd) across multiple variables. I started by combining the dplyr summarise and across functions. See the following example: df <- data.frame( var1 = 1:10, var2 =…
Kilsen
1
vote
1 answer

Remove specific data within the string in R

I'm new to R. I have this data frame and I'm trying to delete all the information from this column except the gene symbols, which always come second within the string. Best regards! I tried this function (gsub)…
Yamen Wm
1
vote
0 answers

Calculate new Power BI table from multiple tables using UNION, SUMMARIZE and FILTER

Hi, I am new to Power BI and I am trying to generate a set of summary tables which combine data from multiple tables. New Table= ADDCOLUMNS( UNION( SUMMARIZE( FILTER( 'Purchased', …
MicrobicTiger