Questions tagged [summarize]

A dplyr function (actually named summarise()) that creates a new data frame by summarising data according to the given grouping variables. Use this tag along with the dplyr version being used. Mind the spelling of the function name.

summarise() creates a new data frame. It will have one (or more) rows for each combination of grouping variables; if there are no grouping variables, the output will have a single row (or more, as of 1.0.0) summarising all observations in the input. It will contain one column for each grouping variable and one column for each of the summary statistics that you have specified.
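As a rough illustration of the behaviour described above, here is a minimal sketch. The `sales` data frame and its columns are hypothetical and are not taken from any question on this page. It assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical toy data, made up for illustration only
sales <- data.frame(
  region = c("East", "East", "West", "West", "West"),
  qty    = c(120, 110, 90, 100, 80)
)

# No grouping variables: a single row summarising all observations
overall <- sales %>%
  summarise(total = sum(qty), mean_qty = mean(qty))

# One grouping variable: one row per region, with one column for the
# grouping variable and one column for each summary statistic
by_region <- sales %>%
  group_by(region) %>%
  summarise(total = sum(qty), n = n())

print(overall)
print(by_region)
```

Here `overall` has one row and two summary columns, while `by_region` has one row per distinct value of `region`.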

836 questions
1
vote
2 answers

TOTAL VALUE IN POWERBI TABLE AFTER GROUP BY

I have a table like this: I would like to obtain a single row for each combination of product, category and price (I'm not interested in the aggregation per sub product). All the products in each category have the same price. I tried to do a group by…
1
vote
0 answers

DAX -- Cannot get TOPN formula to give me sum of the top n

I've spent hours poring over documentation on SUMMARIZE, SUMMARIZECOLUMNS, ADDCOLUMNS and TOPN, and I just cannot get this simple calculation to come out correctly. I've even looked at results with DAX Studio, and it's always wrong. I'm trying to…
Freond
1
vote
1 answer

Binding a list of summary data to a data.frame creates an unknown column in R

I have a large df (100k+ rows, see snapshot of data below) in which I'm trying to summarize (min, mean, median, max, etc.) a variable (salinity) by group (species) using tapply, but if I use the whole dataset (which contains a few NAs, but…
Nate
1
vote
2 answers

More efficient way of using group_by > mutate > slice

I have a dataframe that looks like this: df <- data.frame("Month" = c("April","April","May","May","June","June","June"), "ID" = c(11, 11, 12, 10, 11, 11, 11), "Region" = c("East", "West", "North", "East", "North" ,"East", "West"), "Qty" = c(120, 110,…
FinRC
1
vote
1 answer

How can I pivot wider and summarise the duplicates (R)?

I want to transpose my table, creating new columns by client. The rows would be by idMarket and Section, and the other columns would give the Score of each client in those Markets and Sections. I want to summarise each column if there is a…
Clara
1
vote
1 answer

Data with multiple rows per observations with variables populated in some but not other rows

So I have this data frame: dat1 <- data.frame(id=1:n, group=rep(LETTERS[1:2], n/2), age=sample(18:30, n, replace=TRUE), type=NA, op=factor(paste0("op", 1:n)), …
1
vote
6 answers

How to summarise values in a column with non-exact match in R?

I have a data.table with over ten thousand rows. I want to count how many times a value appears in one column, but I want to use non-exact matching. The data looks like this: dt1 <- data.table (place = c("a north", "a south", "b south", "a…
Besz15
1
vote
3 answers

How to sum ID within a DF by area in R?

I have a dataframe of crash statistics called crashes_TA. The dataframe looks like the following but on a much larger scale, with each row representing a crash. The dataframe is called…
1
vote
1 answer

Summary code that counts the number of values less than specified numbers

I have a simple dataset (that I've titled 'summary') that includes a numeric column of values. I want to create code to summarize the number of rows less than specific values, such as 5, 10, 20, 30, etc. Here is some of the…
new2data
1
vote
2 answers

Summarise and group_by not working with factor variables

I'm currently using the tidyverse package version 1.3.1, and when I run the following code: data <- data.frame(gender = c(1,2,1,2,2,2,2,1,2,1), age = c(18,20,21,24,25,24,24,25,22,21)) data <- data%>% mutate(gender = factor(gender, levels =…
1
vote
0 answers

Can I export tab, summarize() table from Stata to LaTeX?

I was wondering if there is a way to export a tab, summarize() table to LaTeX. I have tried eststo col: estpost tab and tabout, but an error says option summarize() not allowed.
l_sq
1
vote
1 answer

How to combine summarize_at and custom function that requires input from multiple columns in R?

I have a list of employees' actual capacity (which changes each month) and their scheduled capacity (which is constant every month). I want to use summarize_at to tell what percentage they are over (or under) their allocation. However, I can't figure…
J.Sabree
1
vote
2 answers

Count unique occurrences of factor levels and numeric values with dplyr, on data in a long format

I have data on repeated measurements of 8 patients, each with a varying number of repeated measurements on the same variables. The measured variables are sex, blood pressure (sys_bp), and how many CT scans a person…
tcvdb1992
1
vote
1 answer

R underperforming on dplyr summarize with multiple joins and filters

I have the following example dataset containing 3 dataframes: base_pop_ex <- structure( list( anon_id = c( "0003ff12-03b1-42b9-86cf-4b7c05e3e3a7", "0003ff12-03b1-42b9-86cf-4b7c05e3e3a7" ), session_number =…
agustin
1
vote
2 answers

How to use dplyr::summarize multiple times in a single command in R dplyr/ tidyr?

I have a community of species 1, 2, 3, and 4. I am trying to compute the covariance between species i and the combined abundances of the reciprocal species using dplyr. I want to do this for each species combination. The dplyr code works fine for just one species,…
Rspacer