Questions tagged [summarize]

A dplyr function, summarise() (summarize() is a synonym), that creates a new data frame of summary statistics for each combination of the given grouping variables. Use this tag along with the dplyr version being used. Mind the spelling of the function name.

summarise() creates a new data frame. It will have one (or more) rows for each combination of grouping variables; if there are no grouping variables, the output will have a single row (or more, as of 1.0.0) summarising all observations in the input. It will contain one column for each grouping variable and one column for each of the summary statistics that you have specified.
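
As a rough illustration of the description above, the following sketch runs summarise() with and without a grouping variable. The data frame `df` and its columns `g` and `x` are made up for this example.

```r
library(dplyr)

# Toy data; the names df, g and x are invented for illustration.
df <- tibble(
  g = c("a", "a", "b"),
  x = c(1, 3, 10)
)

# Grouped: one row per group, one column for the grouping variable
# and one column per summary statistic.
by_group <- df %>%
  group_by(g) %>%
  summarise(mean_x = mean(x), n = n())

# Ungrouped: a single row summarising all observations.
overall <- df %>%
  summarise(mean_x = mean(x), n = n())
```

Here `by_group` has columns `g`, `mean_x` and `n`, while `overall` has only the two summary columns.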

836 questions
0
votes
3 answers

What is the correct way to calculate Mean across all variables by grouping

A <- data %>% group_by(Agent) %>% summarise(across(EP:Yt.ha),mean) The Error message is Error: Problem with summarise() input ..2. x Input ..2 must be a vector, not a function. i Input ..2 is mean. i The error occured in group 1: Agent…
Avon
  • 13
  • 2
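
One likely cause of the error quoted in this excerpt is the misplaced parenthesis: `mean` is passed to summarise() as a second argument rather than to across(). A minimal sketch of the corrected call, using a small made-up stand-in for the asker's `data` (only the column names Agent, EP and Yt.ha come from the question):

```r
library(dplyr)

# Hypothetical stand-in for the asker's data.
data <- tibble(
  Agent = c("x", "x", "y"),
  EP    = c(1, 3, 5),
  Yt.ha = c(2, 4, 6)
)

# The function goes inside across(), not after it.
A <- data %>%
  group_by(Agent) %>%
  summarise(across(EP:Yt.ha, mean))
```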
0
votes
2 answers

R dplyr How to aggregate summarise information by keeping information from a particular record

I want to aggregate information (strings and numerics) by keeping the value from a particular record of my dataset. Here is an example: data <- tibble(id1 = c(1, 1, 2, 1, 2, 3), id2 = c('a', 'a', 'a', 'c', 'a', 'a'), id3 = c(1, 2, 3, 4, 5, 6),…
John E.
  • 137
  • 2
  • 10
0
votes
0 answers

Why does a group_by + summarise of thousands of observations yield a 1x1 dataframe?

Please, consider the following: edx2 is a dataframe with 9000055 obs. of 8 variables; movieId is a column of edx2 with more than 10,000 levels; rating is a # column of edx2; mean(edx2$rating) is a valid number. naive is a # constant; My objective…
0
votes
1 answer

How do I sum and label the total of "Confirmed" Cases per x-axis "Date"

I want to label the sum of the column of "Confirmed" Cases for each "Date" ggplot(filter(COVID1, COUNTY %in% c("Kent")))+ geom_col(aes(x = Date, y = Cases, fill = CASE_STATUS), position = position_stack(reverse = TRUE), width = .88)+ dp(COUNTY …
John
  • 45
  • 6
0
votes
1 answer

Summing values when merging rows in a data set in R

So I have a large data set (50,000 rows and 500 columns). I merged the rows I wanted to by this code: Similarities <- Home %>% group_by_at(c(1,2,5,9,70,26)) %>% summarize_all(.funs = function(x) paste(unique(x), collapse = ',')) In this code,…
Anna
  • 3
  • 2
0
votes
1 answer

How to search across multiple excels for values and summarize to one excel workbook

I have 1000 excel workbooks and I have to summarize data in one excel workbook. Each workbook consists of data of one property (id of property, region, market value etc.) In the summary workbook I want to insert in a column the id of property and…
Tsorts
  • 31
  • 3
0
votes
1 answer

Finding the range based on minimum values that increases

I have a dataset with multiple stations, depths and concentrations. I am trying to find the difference in depth (or the thickness) based on where the minimum concentration increases by 0.1. For example: At station 1, the maximum depth is 14m. There is…
L55
  • 117
  • 8
0
votes
2 answers

Summarizing one way, then another for what's left

Using iris as an example. After grouping by Species, I want to summarize Sepal.Length by its mean, then summarize all the remaining columns by last; (without calling out the remaining columns individually.) Wanting the result # A tibble: 3 x…
David T
  • 1,993
  • 10
  • 18
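
One possible way to get the result described in this excerpt, sketched here as an assumption rather than as the accepted answer, is to combine a negative selection in across() with a named summary for Sepal.Length:

```r
library(dplyr)

res <- iris %>%
  group_by(Species) %>%
  summarise(
    # Every non-grouping column except Sepal.Length: keep the last value.
    across(-Sepal.Length, last),
    # Sepal.Length: take the group mean.
    Sepal.Length = mean(Sepal.Length)
  )
```

The remaining columns are never named individually, so the same call keeps working if more columns are added.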
0
votes
1 answer

How to generate a var to capture count total number with if condition in r

I have a data set that looks like this: library(data.table) dt <- data.table(id = c("A", "A", "A", "B", "B", "B", "C", "C", "C"), Complete = c("Yes","No","Yes","Yes","No","Yes","Yes","Yes","Yes")) > dt id Complete 1: A Yes 2: A No 3: A …
Stataq
  • 2,237
  • 6
  • 14
0
votes
2 answers

R/dplyr: Summarize data without grouping it

I have a data frame like this: ID V1 V2 A 2 June B 3 May A 2 January F 4 December I want to add V3 that gives me the number of entries by earliest V2 within each ID: ID V1 V2 V3 A 2 June January B 3 May May A 2 …
questionmark
  • 335
  • 1
  • 13
0
votes
2 answers

How to create a vector of variables using summarise?

Here is a beginner's problem: I like using summarise(), but sometimes I find it difficult storing the results. For example, I know I can store 1 value in the following manner: stdv <- Data %>% filter(x == 1) %>% summarise(stdv = sd(y)) But I get…
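
summarise() always returns a data frame, so the object stored in the excerpt's example is a one-row tibble rather than a plain number. A minimal sketch of one way to extract the value as a vector, using pull(); the contents of `Data` are invented here:

```r
library(dplyr)

# Hypothetical data with the column names used in the question.
Data <- tibble(
  x = c(1, 1, 1, 2),
  y = c(2, 4, 6, 100)
)

stdv <- Data %>%
  filter(x == 1) %>%
  summarise(stdv = sd(y)) %>%
  pull(stdv)  # extract the column as a plain numeric vector
```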
0
votes
1 answer

Aggregate (Summarize) multiple Time Series Data by Month in .r

I have hundreds of daily weather data files with a .txt extension, with comma (",") as the separator, in a common folder. Each file has the same data structure with different file names. The following is an example of the data structure: $ year : int …
0
votes
1 answer

group_by, using different functions for different columns based on their name

I have a large df with many columns (+100), some of which have a name that ends in "_e" and others in "_se". I want to summarize these variables using the label of the column sign. For those columns that end in "_e", I would like to have the sum of…
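
A pattern that may fit this kind of question is one across() call per name suffix, selected with ends_with(). In the sketch below the data, the grouping column `grp` and the helper `combine_se` are all hypothetical; the excerpt is cut off before it says how the "_se" columns should be combined, so the root-sum-of-squares rule is only a placeholder assumption.

```r
library(dplyr)

# Hypothetical data: grp and the measurement columns are invented here.
df <- tibble(
  grp  = c("a", "a", "b"),
  x_e  = c(1, 2, 3),
  y_e  = c(10, 20, 30),
  x_se = c(3, 4, 1),
  y_se = c(6, 8, 2)
)

# Assumed rule for the "_se" columns (root sum of squares); swap in your own.
combine_se <- function(x) sqrt(sum(x^2))

res <- df %>%
  group_by(grp) %>%
  summarise(
    across(ends_with("_e"), sum),         # columns ending in "_e": sum
    across(ends_with("_se"), combine_se)  # columns ending in "_se": placeholder rule
  )
```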
0
votes
1 answer

Why is the summarize() function in the "srvyr" package outputting a tibble of lists?

I'm trying to use the "srvyr" package in R to analyze American Community Survey Public Use Microdata (PUMS). I'm using a script an old colleague gave me as a base and trying to piece my way forward. Because I've seen the script in action before--and…
paul_r
  • 1
0
votes
1 answer

Why won't summarize_if work with bind_rows in this example?

I am trying to use bind_rows and summarize_if to add a total bottom row in a sample data set. There are different postings that are related to this type of question, but not exactly my issue. Additionally, some posted questions have so much other…
DaveM
  • 664
  • 6
  • 19