
I have a dataset with donations made to different politicians where each row is a specific donation.

donor.sector <- c("sector A", "sector B", "sector X", "sector A", "sector B")
total <- c(100, 100, 150, 125, 500)
year <- c(2006, 2006, 2007, 2007, 2007)
state <- c("CA", "CA", "CA", "NY", "WA")
target_specific <- c("politician A", "politician A", "politician A", "politician B", "politician C")
# data.frame() combines the vectors into columns; as.data.frame() would not
dat <- data.frame(donor.sector, total, year, target_specific, state)

I'm trying to get the yearly mean of donations for each politician, and I'm able to do so with the following:

library(dplyr)
new.df <- dat %>%
  group_by(target_specific, year) %>%
  summarise(mean = mean(total))

My issue is that, because of the grouping, the result has only three variables: mean, year, and target_specific. Is there a way to do this and produce a new data frame that keeps the politician-level variables, such as state?

Many thanks!

AntVal

2 Answers


There are two ways to do that.

Include the additional variables in group_by:

library(dplyr)

dat %>%
  group_by(target_specific, year, state) %>%
  summarise(mean = mean(total))

#  target_specific  year state  mean
#  <chr>           <dbl> <chr> <dbl>
#1 politician A     2006 CA      100
#2 politician A     2007 CA      150
#3 politician B     2007 NY      125
#4 politician C     2007 WA      500

Or, keeping the same group_by structure, you can take the first value of each additional variable within the group:

dat %>%
  group_by(target_specific, year) %>%
  summarise(mean = mean(total), state = first(state))
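Taking first(state) is only safe if state really is constant for each politician. Below is a minimal sketch of one way to check that, together with a third option: summarise first, then join the politician-level columns back on. It assumes dplyr 1.0.0 or later (for the `.groups` argument) and rebuilds the example data so it runs on its own; the names state_check, politician_info and yearly_means are illustrative, not part of the original question.

```r
library(dplyr)

# Example data from the question, rebuilt so the sketch is self-contained
dat <- data.frame(
  donor.sector = c("sector A", "sector B", "sector X", "sector A", "sector B"),
  total = c(100, 100, 150, 125, 500),
  year = c(2006, 2006, 2007, 2007, 2007),
  state = c("CA", "CA", "CA", "NY", "WA"),
  target_specific = c("politician A", "politician A", "politician A",
                      "politician B", "politician C")
)

# Count distinct states per politician; a value above 1 would mean
# first(state) silently discards information
state_check <- dat %>%
  group_by(target_specific) %>%
  summarise(n_states = n_distinct(state), .groups = "drop")

# One row per politician holding the politician-level variables
politician_info <- dat %>%
  distinct(target_specific, state)

# Summarise by politician and year, then attach the politician-level columns
yearly_means <- dat %>%
  group_by(target_specific, year) %>%
  summarise(mean = mean(total), .groups = "drop") %>%
  left_join(politician_info, by = "target_specific")
```

The join version scales more easily when there are many politician-level columns, since they only need to be listed once in distinct(). If a politician had more than one state, the join would produce one row per state per year, so the check is worth running first.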
Ronak Shah

In base R, we can use aggregate:

aggregate(total ~ ., subset(dat, select = -donor.sector), mean)
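In the formula, the dot stands for every column other than total, so all remaining columns become grouping variables; that is why donor.sector is dropped first. As a rough sketch, the same call can be written with the grouping columns named explicitly. The object names by_dot and by_name are illustrative, and the example data is rebuilt here so the snippet runs on its own.

```r
# Example data from the question, rebuilt so the sketch is self-contained
dat <- data.frame(
  donor.sector = c("sector A", "sector B", "sector X", "sector A", "sector B"),
  total = c(100, 100, 150, 125, 500),
  year = c(2006, 2006, 2007, 2007, 2007),
  state = c("CA", "CA", "CA", "NY", "WA"),
  target_specific = c("politician A", "politician A", "politician A",
                      "politician B", "politician C")
)

# Dot form: group by every column left after dropping donor.sector
by_dot <- aggregate(total ~ ., data = subset(dat, select = -donor.sector), FUN = mean)

# Explicit form: name the grouping columns, no subset() needed
by_name <- aggregate(total ~ target_specific + year + state, data = dat, FUN = mean)
```

Both results contain the same groups and means, though column and row order may differ between them. The explicit form is a little longer but does not change meaning if columns are later added to dat.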
akrun