I am doing some analysis on the mpg data set, which ships with ggplot2 and is loaded along with tidyverse. I am trying to take the large data set and combine rows so that the result looks like the smaller table below.

I have tried using summarise to combine rows that share the same model and year to get the small table below, but I cannot figure out how to do this cleanly with cty as an average.

library(tidyverse)

mpg %>%
    group_by(manufacturer, model, year, cty) %>%
    select(manufacturer, model, year, cty) %>%
    summarise(n_model = n()) %>%
    print(n = 15)
# Output
# A tibble: 172 x 5
# Groups:   manufacturer, model, year [76]
   manufacturer model       year   cty n_model
   <chr>        <chr>      <int> <int>   <int>
 1 audi         a4          1999    16       1
 2 audi         a4          1999    18       2
 3 audi         a4          1999    21       1
 4 audi         a4          2008    18       1
 5 audi         a4          2008    20       1
 6 audi         a4          2008    21       1
 7 audi         a4 quattro  1999    15       1
 8 audi         a4 quattro  1999    16       1
 9 audi         a4 quattro  1999    17       1
10 audi         a4 quattro  1999    18       1
11 audi         a4 quattro  2008    15       1
12 audi         a4 quattro  2008    17       1
13 audi         a4 quattro  2008    19       1
14 audi         a4 quattro  2008    20       1
15 audi         a6 quattro  1999    15       1
# … with 157 more rows
# Looking for this
manufacturer model  year  avg_cty  n
audi         a4     1999 18.25000  4
audi         a4     2008 19.66667  3

The expected result is the smaller table, which combines rows that share the same model and year and reports the average city gas mileage (avg_cty) along with a count (n). Any help is appreciated!

PageSim

1 Answer

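Using data.table, group by manufacturer, model, and year (leaving cty out of the grouping), then take the mean of cty and the row count per group: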

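library(data.table)
# mean(cty) is the average city mileage per group; .N is the number of rows in the group.
# as.data.table() works on a copy; setDT(mpg) converts in place and can fail if the binding is locked.
as.data.table(mpg)[, .(city = mean(cty), n_model = .N), by = .(manufacturer, model, year)]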
akrun
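
For comparison, here is a minimal dplyr sketch of the same idea, staying close to the pipeline in the question. It is not part of the answer above. It assumes dplyr 1.0.0 or later (for the .groups argument), and the names mpg_summary, avg_cty, and n are only illustrative choices that mirror the desired table. The main change from the attempt in the question is that cty is dropped from group_by(), so each manufacturer/model/year group keeps all of its cty values to average.

```r
library(dplyr)
library(ggplot2)  # provides the mpg data set

mpg_summary <- mpg %>%
  # Group only by the columns that identify a row in the desired table;
  # grouping by cty as well would split each model/year by mileage value.
  group_by(manufacturer, model, year) %>%
  summarise(
    avg_cty = mean(cty),  # average city mileage within the group
    n = n(),              # number of rows that were combined
    .groups = "drop"      # return an ungrouped tibble
  )

print(mpg_summary, n = 15)
```

The select() call from the original attempt is not needed here, since summarise() keeps only the grouping columns and the newly computed ones.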