0

What is the idiomatic way to do the following string concatenation in R?

Given two vectors of strings, such as the following,

titles <- c("A", "B")
sub.titles <- c("x", "y", "z")

I want to produce the vector

full.titles <- c("A_x", "A_y", "A_z", "B_x", "B_y", "B_z")

Obviously, this could be done with two for-loops. However, I would like to know what an “idiomatic” (i.e., elegant and natural) solution would be in R.

In Python, an idiomatic solution might look like this:

titles = ['A', 'B']
subtitles = ['x', 'y', 'z']
full_titles = ['_'.join([title, subtitle])
               for title in titles for subtitle in subtitles]

Does R allow for a similar degree of expressiveness?

Remark

The consensus among the solutions proposed thus far is that the idiomatic way to do this in R is, basically,

full.titles <- c(t(outer(titles, sub.titles, paste, sep = "_")))

Interestingly, this has an (almost) literal translation in Python:

full_titles = map('_'.join, product(titles, subtitles))

where product is the Cartesian-product function from the itertools module. However, in Python, such a use of map is considered more convoluted (that is, less expressive) than the equivalent list comprehension above.
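To see why the transpose is needed in the R one-liner, here is a minimal sketch using only base R and the vectors from the question (the names m, by.column and by.row are just illustrative). outer returns a matrix with one row per title, and c() flattens a matrix column by column.

```r
titles <- c("A", "B")
sub.titles <- c("x", "y", "z")

# outer() builds a 2 x 3 character matrix: rows follow titles, columns follow sub.titles
m <- outer(titles, sub.titles, paste, sep = "_")

# c() flattens column by column, so the titles alternate
by.column <- c(m)
# "A_x" "B_x" "A_y" "B_y" "A_z" "B_z"

# transposing first keeps each title's sub-titles together
by.row <- c(t(m))
# "A_x" "A_y" "A_z" "B_x" "B_y" "B_z"
```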

egnha
  • 1
    @brittenb that does not produce the vector as required in the question. – zacdav Apr 19 '16 at 11:23
  • @zacdav Yeah, I'm looking at the output now and am confused as to why it's not producing the expected output. I'll delete the comment. – tblznbits Apr 19 '16 at 11:24
  • Somewhat direct translation: mapply(function(x,y) sprintf("%s_%s", x, y), rep(titles, each=length(sub.titles)), sub.titles) – chinsoon12 Apr 19 '16 at 11:43
  • R is more "colorful" than Python in this example...wonder how many diff ways everyone can think of... – chinsoon12 Apr 19 '16 at 11:45
  • 1
    http://stackoverflow.com/questions/16143700/pasting-two-vectors-with-combinations-of-all-vectors-elements – Scott Warchal Apr 19 '16 at 11:46
  • for the order, just add sort. – Colonel Beauvel Apr 19 '16 at 12:16
  • @Colonel Beauvel: Or just transpose any of the solutions proposed below. :) – egnha Apr 19 '16 at 12:20
  • @chinsoon12: I would say R is more “convoluted” than Python. ;) (But also more concise, when it comes to working with tabular data.) I updated the question with a remark showing how to be “colorful” (but not necessarily more readable or more efficient) in Python. – egnha Apr 20 '16 at 07:46

6 Answers

5

There are a couple of ways to go about this: either use outer() to apply paste to every pair of elements from the two vectors (a generalized outer product), along the lines of:

outer(titles, sub.titles, paste, sep='_')

and then wrangling it from a matrix into a vector, or converting your input to a data frame of all combinations with expand.grid():

do.call(paste, c(expand.grid(titles, sub.titles, stringsAsFactors=FALSE), sep='_'))
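expand.grid varies its first argument fastest, so pasting its columns in the order given produces "A_x" "B_x" "A_y" and so on. One possible way to get the order asked for in the question, sketched here with illustrative column names (sub and title are not from the original answer), is to pass sub.titles first and paste the columns in reverse:

```r
titles <- c("A", "B")
sub.titles <- c("x", "y", "z")

# expand.grid() varies its first argument fastest, so put sub.titles first
grid <- expand.grid(sub = sub.titles, title = titles, stringsAsFactors = FALSE)

# paste the columns in the desired left-to-right order
full.titles <- paste(grid$title, grid$sub, sep = "_")
# "A_x" "A_y" "A_z" "B_x" "B_y" "B_z"
```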

Miff
  • 1
    You can probably wrap it into `c` as in `c(outer(titles, sub.titles, paste, sep='_'))` – David Arenburg Apr 19 '16 at 11:49
  • 1
    Elegant. This almost produces the correct output. Unfortunately, the resulting components are in the wrong order. (I've updated the question to clarify this.) A transpose fixes this: `c(t(outer(...)))` – egnha Apr 19 '16 at 12:15
3

Using do.call combined with paste and expand.grid

sort(do.call(paste, c(sep='_', expand.grid(titles, sub.titles))))
#[1] "A_x" "A_y" "A_z" "B_x" "B_y" "B_z"

Or using tidyr::unite combined with expand.grid (after library(tidyr)):

unite(expand.grid(titles, sub.titles), Res, everything()) %>% .$Res
David Arenburg
Colonel Beauvel
2
apply(expand.grid(titles, sub.titles), 1, paste, collapse = "_")

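expand.grid creates a data frame of all combinations of titles and sub.titles.
apply then goes through that data frame row by row (MARGIN = 1) and pastes the two entries of each row together. Because expand.grid varies its first argument fastest, the result is ordered "A_x" "B_x" "A_y" "B_y" "A_z" "B_z".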
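The two steps can be looked at separately. This is a minimal sketch; the names combos and pasted are only illustrative, and stringsAsFactors = FALSE is added here so that the columns stay plain character vectors.

```r
titles <- c("A", "B")
sub.titles <- c("x", "y", "z")

# one row per (title, sub-title) pair; the columns are named Var1 and Var2 by default
combos <- expand.grid(titles, sub.titles, stringsAsFactors = FALSE)

# MARGIN = 1 applies the function over rows; each row arrives as a
# character vector of length 2, which paste() collapses into one string
pasted <- apply(combos, 1, paste, collapse = "_")
```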

Scott Warchal
1

Try this code:

unlist(lapply(seq_along(titles), function(x){paste(titles[x], sub.titles, sep="_")}))

J_F
1

This code also works: as.vector(outer(titles, sub.titles, FUN=paste, sep="_"))

outer applies a function to every pair of elements drawn from the two vectors: it takes each element of titles and combines it with each element of sub.titles. The default function is multiplication, but we override that default by passing paste to the FUN parameter. Any further arguments, such as sep here, are passed on to that function. So we're telling R to paste each element of titles together with each element of sub.titles, separated by "_". The result is a matrix with one row per title, and as.vector reads it column by column, so the output is ordered "A_x" "B_x" "A_y" "B_y" "A_z" "B_z"; transpose the matrix with t() first to get the order asked for in the question.
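As a small illustration of the default function and of how extra arguments are forwarded, here is a sketch in base R (the names products and pasted.matrix are only illustrative):

```r
titles <- c("A", "B")
sub.titles <- c("x", "y", "z")

# with no FUN given, outer() multiplies every pair: a 2 x 3 multiplication table
products <- outer(1:2, 1:3)

# with FUN = paste, the extra argument sep is forwarded to paste()
pasted.matrix <- outer(titles, sub.titles, FUN = paste, sep = "_")

# entry [i, j] combines titles[i] with sub.titles[j]
pasted.matrix[1, 2]
# "A_y"
```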

tblznbits
1
grid <- expand.grid(titles, sub.titles)
full.titles <- paste0(grid$Var1, '_', grid$Var2)
full.titles
#[1] "A_x" "B_x" "A_y" "B_y" "A_z" "B_z"
Zahiro Mor