
Here's a fun one. I'm trying to do exactly what this post is doing. That is, repeating and grouping words.

The catch with this question is that I'd like to do it purely with stringr's word() function with a paste0 wrapper. Take the following sentence:

sentence <- "Jane saw a cat and then Jane sat down."

The exact result would be

[1] "Jane saw, saw a, a cat, cat and, and then, then Jane, Jane sat, sat down."

I've gotten this far, but word() leaves an extra "" at the end of the result, probably because of the way I've written the start and end arguments, since word() doesn't otherwise return an empty string.

> library(stringr)
> len <- length(strsplit(sentence, " ")[[1]])
> paste0(word(sentence, c(1, 2:len), c(2, 3:len)), collapse = ", ")
[1] "Jane saw, saw a, a cat, cat and, and then, then Jane, Jane sat, sat down., "

Can this be done without the trailing ", " using only the word() function?

Rich Scriven
  • possible duplicate of [String transformation in R | Grouping words of a string](http://stackoverflow.com/questions/25441766/string-transformation-in-r-grouping-words-of-a-string) – bartektartanus Aug 23 '14 at 23:07
  • `paste0(word(sentence, -c(len:2), -c((len-1):1)), collapse = ', ')`? – jdharrison Aug 23 '14 at 23:08
  • @bartektartanus Are you serious? I linked that post in this question. – Rich Scriven Aug 23 '14 at 23:08
  • `paste0(word(sentence, c(2:len-1), c(2:len)), collapse = ", ")` – bartektartanus Aug 23 '14 at 23:20
  • I'll leave that here just in case you agree. The fact you are already using `strsplit` to parse the words and get their count makes the use of `word()` a little redundant. You could just stick to base R and do: `words <- unlist(strsplit(sentence, ' ')); paste(head(words, -1), tail(words, -1), collapse = ", ")`. – flodel Aug 23 '14 at 23:31
  • hmm, yes...very good point :-). I'm really trying to get a better feel for `word()` as it seems like it can be useful in many situations. – Rich Scriven Aug 23 '14 at 23:43

1 Answer


I think your start and end arguments to word() need to be the same length (otherwise recycling occurs), so

paste0(word(sentence, c(1:(len-1)), c(2:len)), collapse = ", ")

or

paste0(word(sentence, -c(len:2), -c((len-1):1)), collapse = ', ')

would do the trick
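
To make the length mismatch concrete, here is a minimal sketch that reuses `sentence` and `len` from the question. The names `start_orig`, `end_orig`, `starts`, `ends`, and `bigrams` are only illustrative and not part of the original code. It first compares the lengths of the two index vectors from the question, then builds the pairs from equal-length vectors.

```r
library(stringr)

sentence <- "Jane saw a cat and then Jane sat down."
len <- length(strsplit(sentence, " ")[[1]])  # number of space-separated words

# Index vectors from the question: one more start position than end positions
start_orig <- c(1, 2:len)
end_orig   <- c(2, 3:len)
length(start_orig)  # 9
length(end_orig)    # 8

# Equal-length vectors: pair word i with word i + 1 (illustrative names)
starts <- 1:(len - 1)
ends   <- 2:len

bigrams <- word(sentence, starts, ends)
result  <- paste0(bigrams, collapse = ", ")
result
```

With `starts` and `ends` both of length `len - 1`, every start position has its own end position, so no extra element is produced at the end.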

jdharrison
  • Thanks. That can be a pretty handy function in string manipulation. I just didn't think to reverse the lengths. Cheers! – Rich Scriven Aug 23 '14 at 23:23