1

How do I create a new column from columns whose names are contained in a character vector?

Given these two variables:

data <- tibble(numbers = 1:10, letters = letters[1:10])
columns <- c("numbers","letters")

What command would produce this output?

# A tibble: 10 x 3
   numbers letters combined
     <int> <chr>   <chr>   
 1       1 a       1-a     
 2       2 b       2-b     
 3       3 c       3-c     
 4       4 d       4-d     
 5       5 e       5-e     
 6       6 f       6-f     
 7       7 g       7-g     
 8       8 h       8-h     
 9       9 i       9-i     
10      10 j       10-j  

My first thought was mutate(data, combined = paste(!!columns, sep = "-")) but this does not work.

Error: Problem with `mutate()` input `combined`.
x Input `combined` can't be recycled to size 10.
ℹ Input `combined` is `paste(c("numbers", "letters"))`.
ℹ Input `combined` must be size 10 or 1, not 2.
AndrewGB
Michael Henry

3 Answers

2

Not the prettiest, but this should work:

do.call(
  paste,
  c(data[, columns], list(sep = '-'))
)
MichaelChirico
  • This worked: data$combined <- do.call(paste, c(data[, columns], list(sep = '-'))). Call me a snob, but I might wait to see if there is a more tidy solution ;) – Michael Henry Feb 11 '22 at 05:14
  • @MichaelHenry the key is the `do.call` part, which outputs a column vector. You could throw that into `mutate()` if you'd like: `data <- mutate(data, combined = do.call( ... ))` – MichaelChirico Feb 11 '22 at 06:05
  • @MichaelHenry FYI, `tidyr::unite()` is basically doing my approach "under the hood": `print(tidyr:::unite.data.frame)`. There we can also notice a subtle but maybe important detail: `tidyr::unite()` has an `na.rm` argument. With the default `na.rm = FALSE`, `NA` values are pasted as the string "NA"; they are skipped only with `na.rm = TRUE`. Make sure you pick the behavior you want. – MichaelChirico Feb 11 '22 at 08:38
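The `mutate()` variant suggested in the comments could look roughly like the sketch below. It reuses `data` and `columns` from the question; the name `result` is only illustrative. Because `do.call()` hands each selected column to `paste()` as a separate argument, this should work for any number of names in `columns`.

```r
library(tibble)
library(dplyr)

data <- tibble(numbers = 1:10, letters = letters[1:10])
columns <- c("numbers", "letters")

# c() flattens the selected columns into a plain list and appends `sep`,
# so do.call() effectively runs paste(numbers, letters, sep = "-").
# `result` is an illustrative name, not part of the original answer.
result <- mutate(
  data,
  combined = do.call(paste, c(data[columns], list(sep = "-")))
)

print(result)
```

Note that `paste()` converts missing values to the string "NA", so rows containing `NA` end up with "NA" inside `combined`.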
2

A tidyverse approach is to use unite, where you can pass the columns vector directly into the function without having to use !!.

library(tidyverse)

data %>% 
  tidyr::unite("combined", columns, sep = "-", remove = FALSE) %>% 
  dplyr::relocate(combined, .after = tidyselect::last_col())

Output

   numbers letters combined
     <int> <chr>   <chr>   
 1       1 a       1-a     
 2       2 b       2-b     
 3       3 c       3-c     
 4       4 d       4-d     
 5       5 e       5-e     
 6       6 f       6-f     
 7       7 g       7-g     
 8       8 h       8-h     
 9       9 i       9-i     
10      10 j       10-j 
MichaelChirico
AndrewGB
  • The order of columns wasn't important but thank you for providing such a precise answer! It might be because I'm using an older version of tidyr but I got a warning to wrap `columns` in `all_of()`. Final solution: `unite(data, combined, all_of(columns), sep = "-", remove = FALSE)` – Michael Henry Feb 11 '22 at 05:27
  • @MichaelHenry I wasn't sure if you wanted it in the original order, so I just always include it. Interesting; I am running version `1.3.1` of `tidyverse`. But at least they usually have helpful suggestions in the error/warning messages! – AndrewGB Feb 11 '22 at 06:37
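A small sketch of the `all_of()` form mentioned in the comments, using a shortened version of the data with some missing values added (the `NA` entries and the names `united_default` and `united_na_rm` are assumptions for illustration, not from the question). It also shows how the `na.rm` argument of `unite()` changes the result.

```r
library(tibble)
library(tidyr)

# Illustrative data with missing values; not the data from the question.
data <- tibble(numbers = c(1L, NA, 3L), letters = c("a", "b", NA))
columns <- c("numbers", "letters")

# all_of() marks `columns` as an external character vector of column names,
# which avoids the tidyselect ambiguity warning.
united_default <- unite(data, "combined", all_of(columns),
                        sep = "-", remove = FALSE)

# With na.rm = TRUE, missing values are dropped before the values are joined.
united_na_rm <- unite(data, "combined", all_of(columns),
                      sep = "-", remove = FALSE, na.rm = TRUE)

print(united_default)
print(united_na_rm)
```

With the default `na.rm = FALSE`, missing values appear as the string "NA" in `combined`, so it is worth checking which behavior you need before choosing between the two calls.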
0
cbind(data, combined = paste(data[[columns[1]]], data[[columns[2]]], sep = "-"))
IRTFM
  • Unfortunately the number of columns is variable so a solution that assumes two columns won't work. The `unite()` solution above works. Thank you! – Michael Henry Feb 11 '22 at 06:04
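For the variable-length case raised in this comment, the `mutate()` idea from the question can probably be made to work by splicing symbols instead of injecting the character vector itself. This is a sketch that assumes the `rlang` package is available; `result` is an illustrative name. `!!columns` inserts the strings "numbers" and "letters" as data, whereas `!!!syms(columns)` inserts the column references `numbers` and `letters` as separate arguments.

```r
library(tibble)
library(dplyr)
library(rlang)

data <- tibble(numbers = 1:10, letters = letters[1:10])
columns <- c("numbers", "letters")

# syms() converts each name to a symbol; !!! splices the symbols into the
# call, so mutate() evaluates paste(numbers, letters, sep = "-").
# `result` is an illustrative name.
result <- mutate(data, combined = paste(!!!syms(columns), sep = "-"))

print(result)
```

Adding or removing names in `columns` changes how many arguments are spliced into `paste()`, so no fixed number of columns is assumed.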