colsplit in package reshape2 can be used to split character data:

library(reshape2)
colsplit(c("A_1", "A_2", "A_3"), pattern="_", c("Letter", "Number"))
  Letter Number
1      A      1
2      A      2
3      A      3

In his paper "Reshaping Data with the reshape Package", Hadley Wickham gives an example of using colsplit to split data into individual characters. His example should produce the output above from the data c("A1", "A2", "A3"), which he does by omitting the pattern argument. When I run it, however, it throws an error.

The documentation for str_split_fixed, which colsplit calls, says that setting pattern="" will split into individual characters, but this does not work either.

Is there any way to use colsplit so that it splits into individual characters?

This is R 3.1.1 and packages are up to date.

A5C1D2H2I1M1N2O1R2T1
peter2108

2 Answers

The problem is that you are referring to an article about "reshape" but are using "reshape2". The two are not the same and they don't work the same:

library(reshape)
library(reshape2)

reshape:::colsplit(c("A1", "A2", "A3"), "", c("V1", "V2"))
#   V1 V2
# 1  A  1
# 2  A  2
# 3  A  3
reshape2:::colsplit(c("A1", "A2", "A3"), "", c("V1", "V2"))
#   V1 V2
# 1 NA A1
# 2 NA A2
# 3 NA A3

If you don't have to go the colsplit way, there are other options:

do.call(rbind, strsplit(c("A1", "A2", "A3"), "", fixed = TRUE))
#      [,1] [,2]
# [1,] "A"  "1" 
# [2,] "A"  "2" 
# [3,] "A"  "3"

Or, a more general approach (for example, letters followed by digits, not necessarily one character each):

do.call(rbind, strsplit(c("A1", "A2", "A3"), 
                        split = "(?<=[a-zA-Z])(?=[0-9])", 
                        perl = TRUE))
#      [,1] [,2]
# [1,] "A"  "1" 
# [2,] "A"  "2" 
# [3,] "A"  "3"
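
Both alternatives above return a character matrix, whereas colsplit returns a data frame with named columns. If that shape is what you need, here is a minimal base-R sketch that wraps the first strsplit approach. The helper name split_chars is hypothetical (it is not part of reshape or reshape2), and it assumes every string has exactly as many characters as there are column names:

```r
# Hypothetical helper, not part of reshape or reshape2: split each string
# into single characters and return a data frame with the given column names.
split_chars <- function(x, names) {
  pieces <- strsplit(x, "", fixed = TRUE)
  # Assumption: every string has exactly length(names) characters
  stopifnot(all(vapply(pieces, length, integer(1)) == length(names)))
  out <- as.data.frame(do.call(rbind, pieces), stringsAsFactors = FALSE)
  names(out) <- names
  # Convert columns that look numeric ("1" becomes 1L); others stay character
  out[] <- lapply(out, type.convert, as.is = TRUE)
  out
}

split_chars(c("A1", "A2", "A3"), c("Letter", "Number"))
#   Letter Number
# 1      A      1
# 2      A      2
# 3      A      3
```

The type.convert step is optional; drop it if you want every column to stay character.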
A5C1D2H2I1M1N2O1R2T1

Using qdap:

library(qdap)
colSplit(c("A1", "A2", "A3"), "")

##   X1 X2
## 1  A  1
## 2  A  2
## 3  A  3
Tyler Rinker