
I have a data frame, pop.subset:

state  location   pop
WA     Seattle    100
WA     Kent       20
OR     foo        30
CA     foo2       80

I need the city in each state with the lowest population stored in a data.frame. I have:

result <- pop.subset %>% 
          group_by(state) %>%
          summarise(min = min(pop))

This returns the data.frame:

state   min
WA      20
...    .... etc

But I need the city too. I tried adding location to the grouping, group_by(state, location), but this returns the minimum for each state and city pair rather than the single lowest-population city in each state:

state location pop
WA    Seattle  100
WA    Kent     20
foo   foo      foo

Is there a simple solution I am missing? I want my result to look like this:

state location pop
WA    Kent     20
...   ...      ... etc.
neilfws
sushi
  • Can you edit this question so that the code and data match? Currently you have `State`, `Location` and `Pop` in the data frame, but `state` (lower-case 's'), `location` (lower-case 'l') and `both_sexes_2012` (how does that relate to `Pop`?) in the code. – neilfws Oct 24 '17 at 04:11
  • Oh sorry I copied some old code, fixed it! – sushi Oct 24 '17 at 04:16

3 Answers


I think you want to group by state, then filter for min(pop):

pop.subset %>% 
  group_by(state) %>% 
  filter(pop == min(pop)) %>%
  ungroup()

# A tibble: 3 x 3
  state location   pop
  <chr>    <chr> <int>
1    WA     Kent    20
2    OR      foo    30
3    CA     foo2    80
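
As a possible alternative, assuming dplyr 1.0.0 or later is installed, `slice_min()` expresses the same "row with the smallest value per group" idea in a single call. This is only a sketch: the data frame below is rebuilt by hand from the question's example.

```r
library(dplyr)

# Example data rebuilt from the question (assumed to match the real pop.subset).
pop.subset <- data.frame(
  state    = c("WA", "WA", "OR", "CA"),
  location = c("Seattle", "Kent", "foo", "foo2"),
  pop      = c(100, 20, 30, 80)
)

# Within each state, keep the row(s) with the smallest pop.
result <- pop.subset %>%
  group_by(state) %>%
  slice_min(pop, n = 1) %>%
  ungroup()

print(result)
```

By default `slice_min()` keeps tied rows; passing `with_ties = FALSE` would keep exactly one row per state.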
neilfws
  • you're totally right! thanks I am not sure why I didn't see that. Tried to over complicate it with the summarise function. – sushi Oct 24 '17 at 04:34
  • It would seem intuitive to use `summarise`, the key thing is the grouping. If you group on A + B then the summary is for A + B, not for B. – neilfws Oct 24 '17 at 04:38
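
The point about grouping granularity can be checked directly. The sketch below, which rebuilds the question's example data by hand and assumes dplyr 1.0.0 or later (for the `.groups` argument), compares the number of summary rows produced by each grouping.

```r
library(dplyr)

# Example data rebuilt from the question (assumed to match the real pop.subset).
pop.subset <- data.frame(
  state    = c("WA", "WA", "OR", "CA"),
  location = c("Seattle", "Kent", "foo", "foo2"),
  pop      = c(100, 20, 30, 80)
)

# Grouping by state only: one group per state, so one minimum per state.
by_state <- pop.subset %>%
  group_by(state) %>%
  summarise(min = min(pop))

# Grouping by state and location: every (state, location) pair is its own
# group, so each row's pop is trivially the minimum of its group.
by_state_location <- pop.subset %>%
  group_by(state, location) %>%
  summarise(min = min(pop), .groups = "drop")

nrow(by_state)           # one row per state
nrow(by_state_location)  # one row per (state, location) pair
```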

Did you try something like this?

result <- pop.subset %>% 
  group_by(state, location) %>%
  summarise(min = min(pop))
Rolando Tamayo
  • I tried that, but it pairs states and cities together instead of returning the minimum city within each state. E.g. it returns WA, Seattle, 100 and WA, Kent, 20 instead of just WA, Kent, 20. – sushi Oct 24 '17 at 04:19

I understand, this solves it:

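library(tibble)
library(dplyr)

# Example data from the question
data <- tribble(
  ~state, ~location, ~pop,
  "WA",   "Seattle", 100,
  "WA",   "Kent",    20,
  "OR",   "foo",     30,
  "CA",   "foo2",    80
)

# Per state: the location at the position of the smallest pop, plus that minimum
data %>%
  group_by(state) %>%
  summarise(location = location[which.min(pop)],
            min = min(pop))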


# A tibble: 3 x 3
  state location   min
  <chr>    <chr> <dbl>
1    CA     foo2    80
2    OR      foo    30
3    WA     Kent    20
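
One difference between this approach and the `filter(pop == min(pop))` approach shows up when a state has tied minimums: `which.min()` returns only the first matching position, while `filter()` keeps every matching row. A minimal sketch with hypothetical data (the Renton row is invented purely to create a tie):

```r
library(dplyr)

# Hypothetical data: two WA cities share the lowest population.
tied <- data.frame(
  state    = c("WA", "WA", "WA"),
  location = c("Seattle", "Kent", "Renton"),
  pop      = c(100, 20, 20)
)

# which.min() picks the first minimum only, so one row per state.
first_only <- tied %>%
  group_by(state) %>%
  summarise(location = location[which.min(pop)],
            min = min(pop))

# filter() keeps every row whose pop equals the group minimum.
all_ties <- tied %>%
  group_by(state) %>%
  filter(pop == min(pop)) %>%
  ungroup()

nrow(first_only)  # rows kept by the which.min() approach
nrow(all_ties)    # rows kept by the filter() approach
```

Which behaviour is preferable depends on whether ties should be reported or collapsed to a single city.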
Rolando Tamayo