Questions tagged [summarize]

A dplyr function (actually named summarise()) that creates a new data frame by summarising data according to given grouping variables. Use this tag along with the tag for the dplyr version being used. Mind the spelling of the function name.

summarise() creates a new data frame. It will have one (or more) rows for each combination of grouping variables; if there are no grouping variables, the output will have a single row (or more, as of 1.0.0) summarising all observations in the input. It will contain one column for each grouping variable and one column for each of the summary statistics that you have specified.
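The behaviour described above can be sketched with a small example. This is only an illustration: the data frame df and its columns group and value are hypothetical names, not taken from any question on this page, and the dplyr package is assumed to be installed.

```r
library(dplyr)

# Hypothetical toy data (names invented for illustration only)
df <- tibble(
  group = c("a", "a", "b"),
  value = c(1, 3, 10)
)

# With a grouping variable: one row per group, one column for the
# grouping variable and one per summary statistic
by_group <- df %>%
  group_by(group) %>%
  summarise(mean_value = mean(value), n = n())

# Without grouping variables: a single row summarising all observations
overall <- df %>%
  summarise(mean_value = mean(value), n = n())

print(by_group)
print(overall)
```

Here by_group should have two rows (one for each value of group) and overall a single row; summarize() is an alias for summarise() in dplyr and could be used in its place.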

836 questions
0 votes, 0 answers

Using the weighted.mean function, inside an lapply function, with data.table

I have the following dataset: # A tibble: 450 x 546 matchcode idstd year country wt region income industry sector ownership exporter c201 c202 c203a c203b c203c c203d c2041 c2042 c205a c205b1 c205b2 c205b3 c205b4 c205b5 c205b6 c205b7 …
Tom
  • 2,173
  • 1
  • 17
  • 44
0 votes, 1 answer

(How) can I use ddply to summarize a dataframe grouped by two factors?

Short version of question: How can I use ddply to summarize my dataframe grouped by several variables? I currently use this code to summarize by Condition: ddply(ExampleData, .(Condition), summarize, Average=mean(Var1, na.rm=TRUE),…
Kastany
  • 427
  • 1
  • 5
  • 16
0 votes, 1 answer

Creating summary statistics (summarise_all) for a large factor dataset, retaining factor info

I have a large dataset with observational survey data which I would like to aggregate to country-year level (also for factors), in order to use the data as country-level data in another dataset. One df that I would like to aggregate has the…
Tom
  • 2,173
  • 1
  • 17
  • 44
0 votes, 1 answer

How does the Graphite summarize function with avg work?

I'm trying to figure out how the Graphite summarize function works. I have the following data points, where the X-axis represents time and the Y-axis duration in ms. +-------+------+ | X | Y | +-------+------+ | 10:20 | 0 | | 10:30 | 1585 | | 10:40…
Abhijit Sarkar
  • 21,927
  • 20
  • 110
  • 219
0 votes, 1 answer

ifelse() nested statements in summarize function in dplyr R

I am trying to summarise a dataframe based on grouping by label column. I want to obtain means based on the following conditions: - if all numbers are NA - then I want to return NA - if mean of all the numbers is 1 or lower - I want to return 1 - if…
MIH
  • 1,083
  • 3
  • 14
  • 26
0 votes, 3 answers

Merge two tables based on 2 conditions and output the average as result column

I have the following two tables: Table_1 ID Interval 1 10 1 11 2 11 and Table_2 ID Interval Rating 1 10 0.5 1 10 0.3 1 11 0.1 2 11 0.1 2 11 …
choufrise
  • 87
  • 8
0 votes, 0 answers

summarize is not taking into account group_by R

What's wrong with this code? I'm trying to get grouped means by Parameter name, which holds the names of the chemicals. There are multiple arithmetic means for each chemical in the column; the means are numeric and the names are characters. Q2<-daily_SPEC…
user9768042
0 votes, 0 answers

How to do Summation of a vector product in language R

I would like to do the summation of product of all the elements of two vectors in R language, but something goes wrong. This is my data definition: > alpha <- 1/24 > a <- c(-5, -2, 1) > b <- c(alpha*3, alpha*2, 1-5*alpha) Then I'm trying: > result…
glc78
  • 439
  • 1
  • 8
  • 20
0 votes, 2 answers

Calculate product of frequencies in each column

I have a data frame with 3 columns, each containing a small number of values: > df # A tibble: 364 x 3 A B C 0. 1. 0.100 0. 1. 0.200 0. 1. 0.300 0. 1. 0.500 0. 2. 0.100 0. 2. 0.200 0. …
Omry Atia
  • 2,411
  • 2
  • 14
  • 27
0 votes, 2 answers

Conditional calculation of mean per month per year dplyr

I have a large data set of stream chemistry for several streams over long periods of time (7-20 years' worth of data). I want to obtain a monthly TOC value for every year for each site, but there are times when there is only 1 TOC value for a given month…
BRC
  • 13
  • 3
0 votes, 0 answers

Avoid automatic round up when using mean in summarize

How can I avoid automatic rounding up in mean(price) when using summarize? I'd like to get a result like Results1. > data(diamonds, package="ggplot2") > head(diamonds) # A tibble: 6 x 10 carat cut color clarity depth table price x …
JennyY
  • 1
  • 1
0 votes, 4 answers

Improving a simple C# program

I have just started to learn to program in C# and I have created a very simple program that is supposed to summarize all positive numbers that are inside an int array. The program looks something like this: static void Main(string[] args) { int[]…
anderssinho
  • 298
  • 2
  • 7
  • 21
0 votes, 1 answer

Adding a proportion with respect to one factor in R summarized data frame

I have created a summarized data frame using R's 'summarize' function, including two factors - "Size of Firm" & "Case Status" - and number of records (n) for each combination of "Size of Firm" and "Case Status". There are three levels for size of…
J. Staak
  • 11
  • 2
0 votes, 1 answer

Summarize with character type conditions in dplyr

I would like to count the number of times a country is listed alone and the number of times it is listed with some other country. This is a section of MY DATASET: address_countries2 name_countries n_countries China 1 …
Amleto
  • 584
  • 1
  • 7
  • 25
0 votes, 1 answer

How to create a variable from summarize command in Stata?

summarize X , detail gen un_p5 = di r(p5) gen un_p10 = di r(p10) gen un_p25 = di r(p25) gen un_p50 = di r(p50) gen un_p75 = di r(p75) gen un_p90 = di r(p90) gen un_p95 = di r(p95) gen un_p99 = di r(p99) I want to summarize, detail another…
user123456
  • 67
  • 1
  • 1
  • 10