Questions tagged [summarize]

A dplyr function (spelled summarise() in the package, with summarize() available as an alias) that creates a new data frame by summarising data within the groups defined by the given grouping variables. Use this tag together with the dplyr version you are using, and mind the spelling of the function name.

summarise() creates a new data frame. It will have one (or more) rows for each combination of grouping variables; if there are no grouping variables, the output will have a single row (or more, as of 1.0.0) summarising all observations in the input. It will contain one column for each grouping variable and one column for each of the summary statistics that you have specified.
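The behaviour described above can be illustrated with a minimal sketch. The data frame `df` and its columns `group` and `value` are hypothetical and are not taken from any question on this page; the example assumes dplyr 1.0.0 or later is installed.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration only
df <- tibble(
  group = c("A", "A", "B", "B", "B"),
  value = c(1, 3, 2, 4, 6)
)

# No grouping variables: the output is a single row
# summarising all observations in the input
overall <- df %>%
  summarise(mean_value = mean(value), n = n())

# One grouping variable: the output has one row per group,
# one column for the grouping variable, and one column
# for each summary statistic
by_group <- df %>%
  group_by(group) %>%
  summarise(mean_value = mean(value), n = n())

print(overall)
print(by_group)
```

Here `overall` has one row and `by_group` has one row for each of the two groups, with columns `group`, `mean_value`, and `n`.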

836 questions
1
vote
1 answer

How to compress output of arrays of primitive type?

My JSON file contains arrays of ints, strings, and objects. Is there a way of compressing the output of arrays that contain only ints or strings? Either do not display the elements of these arrays, or show the element type and count. This is what it…
Leevi L
  • 1,538
  • 2
  • 13
  • 28
1
vote
2 answers

Is `summarize()` contractive with `ggplot()`

Hello everyone Here is one example code library(palmerpenguins) mypenguins = penguins %>% drop_na() penguins_df <- mypenguins %>% group_by(species) %>% summarize(mean_g = mean(bill_length_mm)) penguins_df %>% ggplot(aes(x = mean_g, y =…
KaiLi
  • 46
  • 4
1
vote
0 answers

DAX: Weighted Average on Different Level of Aggregation using Netted Summarized Table

I have the following test data set, with 3 different levels of grouping. I want to make a measure "Value WA" for Value, and then make a matrix like the one shown in the image below. Basically, it is the weighted average of "Value" based on "Market…
Alex
  • 11
  • 3
1
vote
4 answers

Aggregate and summarise character object with R

I have a breeding productivity dataset: df1 # Nest.box Obs.type individual.number Clutch Chick.status # 1 Nest1 Egg 1 First NA # 2 Nest1 Egg 2 First NA # 3 Nest1 Egg 3 First NA # 4 Nest2 Egg 1 First NA # 5 Nest2 Egg 2 First…
Andre230
  • 145
  • 1
  • 9
1
vote
1 answer

R - create summary table of means and counts by group for multiple columns

I have data which I want to group by one column and then summarise with means and counts by group for multiple columns. Some example data (my data has more columns and groups to be summarised): df <- data.frame( group = c("A", "A", "B", "B",…
Dee G
  • 133
  • 9
1
vote
2 answers

How to summarise and subset multi-level grouped dataframe in dplyr and R

I have the following data in long format: testdf <- tibble( name = c(rep("john", 4), rep("joe", 2)), rep = c(1, 1, 2, 2, 1, 1), field = rep(c("pet", "age"), 3), value = c("dog", "young", "cat", "old",…
mdb_ftl
  • 423
  • 2
  • 14
1
vote
1 answer

extracting matching variable in summarize()

I have an example data set gene_name motif_id matched_sequence A y1 CCC A y2 CCAAA A y3 AAG A y3 AT B y1 AAAA B y4 AAT C y5 AAGG and I am trying to get a dataset like in R…
berliiiin
  • 59
  • 1
  • 6
1
vote
1 answer

R dplyr summarize_at: numeric vector of column positions results in "Can't convert a character NA to a symbol" - Summary stats output with t-test

I wish to summarize a set of data in a dataframe using dplyr. Concerning the "vars" argument, the documentation reads: A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL. I have…
Pisuke
  • 103
  • 1
  • 2
  • 11
1
vote
1 answer

How to sum values of one column, based on two conditions, grouped by another column value, in R?

I have a dataset that includes many "transects", and multiple "transects" make up a "plane" (e.g. Plane P1 = Transect T1 + Transect T2). The current data structure (see example below) has the length of each transect repeated in the column…
Tiff-D
  • 25
  • 5
1
vote
2 answers

Error in dplyr group_by %>% summarize_if()

I am working with a fairly small dataset, attempting to summarize the columns by mean while grouping by the first column. Currently I have a df (LitterMean) that looks like this: date3 TotalBorn LiveBorn StillBorn Mummies 1 7/6 12 12 …
Jonathan
  • 13
  • 2
1
vote
1 answer

R Group_by/Summarise not returning expected results

I have a dataset in the following format stored in a large tibble in…
Alex
  • 13
  • 3
1
vote
2 answers

Is there an R function (or sequence of steps) to group and summarise (count) a dataframe like this (with some repeated values in the rows)

I have a df like this: df = data.frame (user = c('u1', 'u1', 'u1', 'u2', 'u2'), entity = c('e1','e2','e3','e3','e4'), area = c('a1','a1','a2','a2','a1'), sex=c('M','M','M','F','F')) and I need to…
danny
  • 45
  • 4
1
vote
1 answer

Summarize(mean) across records keeping the variables that don't change

I have a dataframe that contains records from various devices that measure parameters like temperature and humidity, I'm trying to group the records in intervals of 10 mins. Example in question: id datetime hum temp room …
sandrodand
  • 35
  • 4
1
vote
0 answers

Using dot syntax in dplyr::across

This question is related to this one; it still hasn't been answered. I am trying to calculate summary statistics across multiple columns. The function dplyr::summarise_each has been deprecated so I get a warning, but it allowed me to pass in the…
gm007
  • 547
  • 4
  • 11
1
vote
2 answers

Summarize grouped character data with true NA in dplyr

I have speech data with utterances by same-speaker that I want to collapse: df <- structure(list(Line = 1:7, Speaker = c("ID01.A", NA, "ID01.C", "ID01.C", "ID01.A", "ID01.A", "ID01.A"), Utterance = c("how…
Chris Ruehlemann
  • 20,321
  • 4
  • 12
  • 34