0

That title might be confusing so it might be easier to show what I'm trying to do. This is what the first data frame looks like:

city code population
dallas NA 18
chicago NA 24
new york NA 13
dallas NA 90

And the second data frame looks like this:

city code feet
dallas 1 1000
austin 2 1300
cupertino 3 908
chicago 4 2132
las vegas 5 1128
new york 6 1133

Ultimately I want to input the code values from the second data frame into the first data frame based on the cities, so the first data frame should ultimately look like this:

city code population
dallas 1 18
chicago 4 24
new york 6 13

How do I do that without for loops? I know how to do it with for loops, but ultimately that is really inefficient and I'm looking for an efficient way that is a line or two long. For further clarification this is not my actual data but I figured it would be the simplest way to convey my issue.

  • 2
    Try `merge(df1[-2], df2)[c(1, 3, 2)]`. In `df1` you have two rows for `dallas`, but only one in the expected result. What is the criterion to choose it? – Rui Barradas Aug 13 '22 at 06:34
  • I want to do it automatically, so df1 will find when the city has a match in df2 then input the code from df2 into df1. df2 should just be used as a lookup table for df1. – Nithin Reddy Aug 13 '22 at 07:06

1 Answers1

0

This will give exactly your desired output:

library(dplyr)
library(tidyr)
full_join(df1 %>%
            mutate(id = row_number()), df2) %>% 
  group_by(city, code, population, id) %>% 
  summarise() %>% 
  group_by(city) %>% 
  fill(code, .direction="down") %>% 
  slice(2) %>% 
  ungroup() %>% 
  arrange(id) %>% 
  select(-id)
  city      code population
  <chr>    <int>      <int>
1 dallas       1         18
2 chicago      4         24
3 new york     6         13
TarJae
  • 72,363
  • 6
  • 19
  • 66