Replacing multiple columns from different dataframe using dplyr

Question

I have two dataframes, one of which contains a subset of IDs and columns of the other (but has different values).

ds1 <- data.frame(id = c(1:4),
                      d1 = "A",
                      d2 = "B",
                      d3 = "C")


ds2 <- data.frame(id = c(1,2),
                     d1 = "W",
                     d2 = "X")

I am hoping to use dplyr on d1 to find the shared columns, and replace their values with those found in d2, matching on ID. I can mutate them one at a time like this:

ds1 %>% 
  mutate(d1 = ifelse(id %in% ds2$id, ds2$d1[ds2$id==id],d1),
         d2 = ifelse(id %in% ds2$id, ds2$d2[ds2$id==id],d2))

In my real situation, I am needing to do this 47 times, however. With the robustness of across(), I feel there is a better way. I am open to non-dplyr solutions as well.

AnilGoyal · Answer 1 · 2021-05-10T15:41:26.747

You may perhaps need this using dplyr and stringr (can be done without stringr also)


library(tidyverse)

ds1 %>% left_join(ds2, by = 'id') %>%
  mutate(across(ends_with('.y'), ~ coalesce(., get(str_replace(cur_column(), '.y', '.x'))))) %>%
  select(!ends_with('.x')) %>%
  rename_with(~str_remove(., '.y'), ends_with('.y'))

#>   id d3 d1 d2
#> 1  1  C  W  X
#> 2  2  C  W  X
#> 3  3  C  A  B
#> 4  4  C  A  B

^{Created on 2021-05-10 by the reprex package (v2.0.0)}

score 2 · Answer 2 · answered May 10 '21 at 16:16

This is somewhat similar to the one posted by my friend dear @AnilGoyal and also a little bit verbose comparing to yours you can use it for larger data sets:

library(dplyr)
library(stringr)


ds1 %>%
  left_join(ds2, by = "id") %>% 
  mutate(across(ends_with(".x"), ~ ifelse(!is.na(get(str_replace(cur_column(), ".x", ".y"))), 
                                          get(str_replace(cur_column(), ".x", ".y")), 
                                          .x))) %>%
  select(!ends_with(".y")) %>%
  rename_with(~ str_remove(., ".x"), ends_with(".x"))


  id d1 d2 d3
1  1  W  X  C
2  2  W  X  C
3  3  A  B  C
4  4  A  B  C

score 2 · Accepted Answer · answered May 11 '21 at 10:32

using rows_update

library(tidyverse)
ds1 <- data.frame(id = c(1:4),
                  d1 = "A",
                  d2 = "B",
                  d3 = "C")


ds2 <- data.frame(id = c(1,2),
                  d1 = "W",
                  d2 = "X")

rows_update(x = ds1, y = ds2, by = "id")
#>   id d1 d2 d3
#> 1  1  W  X  C
#> 2  2  W  X  C
#> 3  3  A  B  C
#> 4  4  A  B  C

^{Created on 2021-05-11 by the reprex package (v2.0.0)}

Replacing multiple columns from different dataframe using dplyr

3 Answers3

Related