Questions tagged [summarize]

A dplyr function (named summarise(), with summarize() as a synonym) that creates a new data frame by collapsing the input to summary rows for the given grouping variables. When using this tag, state the dplyr version you are using. Mind the spelling of the function name.

summarise() creates a new data frame. It will have one (or more) rows for each combination of grouping variables; if there are no grouping variables, the output will have a single row (or more, as of 1.0.0) summarising all observations in the input. It will contain one column for each grouping variable and one column for each of the summary statistics that you have specified.
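As a rough illustration of the row-per-group behaviour described above, here is a minimal sketch. It assumes dplyr is installed and uses R's built-in mtcars data; the column names mean_mpg and n are chosen only for this example.

```r
library(dplyr)

# Ungrouped: the whole input collapses to a single summary row.
overall <- mtcars %>%
  summarise(mean_mpg = mean(mpg), n = n())

# Grouped: one row per distinct value of cyl (4, 6 and 8 in mtcars),
# with the grouping column followed by the summary columns.
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg), n = n())
```

Here overall has one row, while by_cyl has three rows and the columns cyl, mean_mpg and n.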

836 questions
0 votes, 1 answer

Why does a mutate following a group_by(year, month) seem to miss a row?

I have a data frame of daily periodicity that I am converting to monthly periodicity, including a simple transformation based on the summarized values: tibble( date = ymd("2002-12-31") + c(0:60), index = 406 * exp(cumsum(rnorm(61,0,0.01))) ) %>%…
0 votes, 2 answers

Summarise a column and thereby remove unwanted NAs in others

Once again I'm a little stuck and reaching out for help. I hope to be able to give this help back one day... Anyway, I have a tibble that looks like this: # A tibble: 20 x 6 # Groups: tipologia [6] tipologia …
Robin Kohrs
0 votes, 0 answers

Summarize after group_by dplyr returns single value

I'm new to the dplyr package, so hopefully the question is not too silly. Take the data.frame test= data.frame(aoi_id = c(15651,19975,15998,15842, 15651,19975,15998,15842), ge_id = c(1, 1, 1, 1, 2, 2, 2, 2), ADJSTK = c(50, 54, 56, 50)) I want to…
Greg
0 votes, 1 answer

str_extract() and summarise() give me an NA row

This should be pretty straightforward, as I think I'm just looking for verification about what I'm seeing. I'm trying to use str_extract() to pull areas of interest out of a column in my data frame, and then count how often each word appears. I'm…
pkpto39
0 votes, 1 answer

summarize from two different rows

This is my starting df test <- data.frame(year = c(2018,2018,2018,2018,2018), source = c("file1", "file1", "file1", "file1", "file1"), area = c("000", "000", "800", "800", "800"), cult2 =…
krifur
0 votes, 2 answers

R - group by and summarise categorical vars (top 2 with count)

I need to group a data.frame by field A and summarise categorical var B, keeping its top 2 values with their respective counts. There are duplicate values for B. Example data: ## double to have duplicate values mtcars2 <- rbind(mtcars, mtcars) Example…
0 votes, 1 answer

dplyr: group_by, sum various columns, and apply a function based on grouped row sums?

I'm trying to use dplyr to summarize a dataframe of bird species abundance in forests which are fragmented to some degree. The first column, percent_cover, has 4 possible values: 10, 25, 50, 75. Then there are ten columns of bird species counts:…
0 votes, 1 answer

Summarise(across(where)) in R

I have the following data file called "data_2": Person Weight Height A 55 155 B 65 165 C 75 175 I want to use the Summarise(across(where)) command to generate the total weight and the weight for each person. This…
John Doe
0 votes, 1 answer

My group_by and mutate functions are not working properly

I want to use the max function for each Item in my dataframe, but when I use group_by, it outputs the Items for each Area. I should just have a distinct set of Items though, not duplicate Items. When I replace mutate with summarize, the output is…
dumjo
0 votes, 1 answer

Performing operation among levels of grouped variable in R/dplyr

I want to perform a calculation among levels of a grouping variable and fit this into a dplyr/tidyverse style workflow. I know this is confusing wording, but I hope the example below helps to clarify. Below, I want to find the difference between levels…
Kodiakflds
0 votes, 1 answer

R check and count Strings in a vector, group_by, considering order of appearance of the strings

The data is in the following format, where I have to group it by Date using group_by. For convenience I have shown it as numbers. Msg <- c("Errors","Errors", "Start","Stop","Start","Stop","Errors","Errors","Start","Stop", "Stop"…
Ram
0 votes, 1 answer

R Help! Calculate the proportion per subgroup

I have the following dataset, called GrossExp3, covering the bilateral exports (in 1000 USD) of 15 reporter countries to all available partner countries for all years from 1998 to 2018. It covers the following four variables: Year, ReporterName (=…
Melike
0 votes, 2 answers

Java stream Collectors without mapper?

I want to call Collectors.summarizingInt on a set of Integers. The examples I have seen so far are on a Set of (say) Employees and are then called as collect(Collectors.summarizingInt(Employee::getWage)). For the bare Integers summarizingInt needs…
dr jerry
0 votes, 1 answer

summarize(n()) and count() difficulties in R

This problem is driving me crazy and I can't figure it out. Here is a subset of my dataframe (df) to make things easier. I want to group_by sex and count the total. Simple? df %>% group_by(sex) %>% count() This code returns the following…
p.habermanm
0 votes, 1 answer

New variable, within groups, sum for each observation

I have trade data for year, months, commodity, and quantity. I want to create the total quantity, x_total, per commodity, per month, per year and have it appear as a new variable with the same number for each observation within that group. For…
hraw45