Consider

text <- "who let the dogs out"
fooo <- strsplit(text, " ")
fooo
[[1]]
[1] "who"  "let"  "the"  "dogs" "out" 

The output of strsplit is a list whose first element is a character vector containing the words above.

Why does the function behave that way? Is there any case in which it would return a list with more than one element?

I can access the words using

fooo[[1]][1]
[1] "who"

but is there no simpler way?

lawyeR
FooBar

1 Answer

To your first question, one reason that comes to mind is that strsplit is vectorized over its argument x, so returning a list lets it keep result vectors of different lengths in the same object:

text <- "who let the dogs out"
vtext <- c(text, "who let the")
##
> strsplit(text, " ")
[[1]]
[1] "who"  "let"  "the"  "dogs" "out" 

> strsplit(vtext, " ")
[[1]]
[1] "who"  "let"  "the"  "dogs" "out" 

[[2]]
[1] "who" "let" "the"

If this were returned as a data.frame, matrix, etc. instead of a list, the shorter results would have to be padded with additional elements.
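As a rough illustration of that padding, here is a minimal base R sketch. The names `words`, `parts`, `n` and `padded` are made up for this example, and filling with `NA` is just one possible choice of padding value. It also shows that, for a single input string, the vector can be pulled out of the list once with `[[1]]` and then indexed normally.

```r
text  <- "who let the dogs out"
vtext <- c(text, "who let the")

# Single input string: extract the character vector from the list once.
# unlist(strsplit(text, " ")) gives the same vector in this case.
words <- strsplit(text, " ")[[1]]
words[1]        # "who"

# Several input strings: one list element per input, lengths may differ.
parts <- strsplit(vtext, " ")
lengths(parts)  # 5 3

# Forcing the result into a matrix means padding the shorter rows,
# here with NA (an arbitrary choice for this sketch).
n <- max(lengths(parts))
padded <- t(sapply(parts, function(p) c(p, rep(NA_character_, n - length(p)))))
padded
```

The second row of `padded` ends in two `NA` entries, which is exactly the kind of filler a list return value avoids.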

nrussell
  • Right, I thought about a vector in `y`, not in `x`. Great. But I'm so tempted to change your `vtext` to `...c(text, 'who who who')` – FooBar Nov 28 '14 at 21:20
  • Also, there are functions in other packages such as `stringr` and `stringi` that have the capability of returning something other than a `list`, such as a character matrix (provided the resulting vectors are the same length, presumably). I haven't had a chance to spend much time using `stringi` yet, but it seems to have several string splitting functions that would be potentially useful to you (check the [see also section here](http://docs.rexamine.com/R-man/stringi/stri_split.html)) – nrussell Nov 28 '14 at 21:26
  • `stringi::stri_list2matrix` is awesome – Rich Scriven Nov 28 '14 at 21:43