Is there a way to summarize values grouped by years while keeping the index?

Question

I tried to summarize values of different years which are assigned to specific IDs.

I used dplyr to summarize it but did not find a way to keep the index.

My data looks something like this:

year <- c(2015, 2015, 2015, 2016, 2016, 2017, 2017, 2018, 2018, 2018, 2018, 2019, 2019)
index <- c(1,1,1,1,1,1,1,2,2,2,2,2,2)
value <- c(5,7,3, NA,9,14, 15, 8, NA, 9, 10, 6, 4)
df1 <- data.frame(year, index, value)

This is how I summarized the data:

library(dplyr)

sum1 <-
  df1 %>%
  group_by(year) %>%
  summarise(value = sum(value, na.rm = TRUE))

I'd like to get an outcome like:

year1 <- c(2015, 2016, 2017, 2018, 2019)
index1 <- c(1, 1, 1, 2, 2)
value1 <- c(15, 9, 29, 27, 10)
df2 <- data.frame(year1, index1, value1)

Thanks, I really appreciate your help!

Group by both "index" and "year" instead of just "year". - Ajay, Apr 04 '23 at 08:06

GKi · Accepted Answer · 2023-04-04T08:17:35.780

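You can use aggregate from base R. In the formula value ~ ., the dot stands for every other column of df1, so the sum is grouped by both year and index: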

aggregate(value ~ ., df1, sum)
#  year index value
#1 2015     1    15
#2 2016     1     9
#3 2017     1    29
#4 2018     2    27
#5 2019     2    10
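A caveat, offered as a sketch rather than as part of the original answer: the formula method of aggregate drops rows containing NA by default (na.action = na.omit), so a year whose values are all NA would disappear from the result. Assuming you want such years kept, one option is to pass na.action = na.pass together with na.rm = TRUE. The data frame df_na and its extra 2020 row are hypothetical, added only to show the difference.

```r
# Same data as in the question, plus one hypothetical year (2020) with only NA
df_na <- data.frame(
  year  = c(2015, 2015, 2015, 2016, 2016, 2017, 2017,
            2018, 2018, 2018, 2018, 2019, 2019, 2020),
  index = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2),
  value = c(5, 7, 3, NA, 9, 14, 15, 8, NA, 9, 10, 6, 4, NA)
)

# Default: rows with NA are removed before grouping, so 2020 is missing
res_default <- aggregate(value ~ year + index, data = df_na, FUN = sum)

# Keep the NA rows and let sum() skip them instead; 2020 appears with value 0
res_keep <- aggregate(value ~ year + index, data = df_na, FUN = sum,
                      na.rm = TRUE, na.action = na.pass)

nrow(res_default)  # 5
nrow(res_keep)     # 6
```

With the data from the question both calls give the same five rows, because every year has at least one non-missing value.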

Or keep your dplyr code and add index to group_by:

library(dplyr)

df1 %>%
  group_by(year, index) %>%
  summarise(value = sum(value, na.rm = TRUE))
## A tibble: 5 × 3
## Groups:   year [5]
#   year index value
#  <dbl> <dbl> <dbl>
#1  2015     1    15
#2  2016     1     9
#3  2017     1    29
#4  2018     2    27
#5  2019     2    10
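Note the "Groups: year [5]" line in the output above: summarise only peels off the last grouping variable, so the result is still grouped by year. If that is not wanted, a minimal sketch (assuming dplyr 1.0.0 or later, where the .groups argument is available) is to drop the grouping in the same call. The name sum2 is only illustrative.

```r
library(dplyr)

df1 <- data.frame(
  year  = c(2015, 2015, 2015, 2016, 2016, 2017, 2017,
            2018, 2018, 2018, 2018, 2019, 2019),
  index = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
  value = c(5, 7, 3, NA, 9, 14, 15, 8, NA, 9, 10, 6, 4)
)

# .groups = "drop" returns an ungrouped tibble and silences the grouping message
sum2 <- df1 %>%
  group_by(year, index) %>%
  summarise(value = sum(value, na.rm = TRUE), .groups = "drop")

is_grouped_df(sum2)  # FALSE
```

Calling ungroup() after summarise gives the same ungrouped result; .groups = "drop" just does it in one step.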
