I've been trying to understand how to deal with the output of strsplit
a bit better. I often have data such as this that I wish to split:
mydata <- c("144/4/5", "154/2", "146/3/5", "142", "143/4", "DNB", "90")
#[1] "144/4/5" "154/2" "146/3/5" "142" "143/4" "DNB" "90"
After splitting, the results are as follows:
strsplit(mydata, "/")
#[[1]]
#[1] "144" "4" "5"
#[[2]]
#[1] "154" "2"
#[[3]]
#[1] "146" "3" "5"
#[[4]]
#[1] "142"
#[[5]]
#[1] "143" "4"
#[[6]]
#[1] "DNB"
#[[7]]
#[1] "90"
I know from the strsplit help page that final empty strings are not produced. Therefore, each of my results will have 1, 2 or 3 elements, depending on how many "/" separators the string contains.
Getting the first element is trivial:
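As a quick check of that assumption, here is a minimal sketch that counts the pieces per result. It assumes lengths() is available (R 3.2.0 or later); the names parts and n_parts are only illustrative.

```r
mydata <- c("144/4/5", "154/2", "146/3/5", "142", "143/4", "DNB", "90")

# Split once and keep the list ('parts' is an illustrative name)
parts <- strsplit(mydata, "/")

# Number of pieces in each list element
n_parts <- lengths(parts)
n_parts
#[1] 3 2 3 1 2 1 1

# A trailing separator does not add an empty final piece
strsplit("144/4/", "/")[[1]]
#[1] "144" "4"
```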
sapply(strsplit(mydata, "/"), "[[", 1)
#[1] "144" "154" "146" "142" "143" "DNB" "90"
But I am not sure how to get the 2nd, 3rd, ... elements when the results have unequal numbers of elements:
sapply(strsplit(mydata, "/"), "[[", 2)
# Error in FUN(X[[4L]], ...) : subscript out of bounds
A working solution would ideally return the following for the 2nd element:
#[1] "4" "2" "3" "NA" "4" "NA" "NA"
This is a relatively small example. I could easily write a for loop for these data, but my real data have thousands of observations to run strsplit on, each producing dozens of elements, so I was hoping to find a more generalizable solution.