I want to split a matrix into two parts. I used the following code:

x <- matrix(rnorm(15),5,3)
idx <- rbinom(5,1,0.5)
split(x,idx)

However, I got two vectors instead of two matrices. I know that if I convert x to a data.frame, I get what I want, i.e.

x <- as.data.frame(matrix(rnorm(15),5,3))
idx <- rbinom(5,1,0.5)
split(x,idx)

Is there any way to do this without converting the matrix into a data frame, so that the result is still in matrix format? And why does this happen?

David Lee
  • relevant: [What is the algorithm behind R core's `split` function?](https://stackoverflow.com/q/52158589/4891738) – Zheyuan Li Sep 04 '18 at 12:56

2 Answers

You could try split.data.frame(x,idx). Calling the method directly forces the split operation to treat your matrix row by row, like a data.frame, instead of as a vector with dimensions (which is essentially what a matrix is).

Example showing that it gives essentially the same result, but returns matrices instead of data.frames:

set.seed(1)
x <- matrix(rnorm(15),5,3)
idx <- rbinom(5,1,0.5)
split.data.frame(x,idx)
#$`0`
#           [,1]       [,2]       [,3]
#[1,] -0.6264538 -0.8204684  1.5117812
#[2,] -0.8356286  0.7383247 -0.6212406
#[3,]  1.5952808  0.5757814 -2.2146999
#
#$`1`
#          [,1]       [,2]      [,3]
#[1,] 0.1836433  0.4874291 0.3898432
#[2,] 0.3295078 -0.3053884 1.1249309

split(data.frame(x),idx)
#$`0`
#          X1         X2         X3
#1 -0.6264538 -0.8204684  1.5117812
#3 -0.8356286  0.7383247 -0.6212406
#4  1.5952808  0.5757814 -2.2146999
#
#$`1`
#         X1         X2        X3
#2 0.1836433  0.4874291 0.3898432
#5 0.3295078 -0.3053884 1.1249309
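
As for why plain split(x,idx) returns vectors, here is a minimal sketch. It assumes that, with no split method defined for matrices, the call falls through to split.default(), which handles x as a plain vector of 15 values and recycles idx along it. The names by_element, by_row, is_matrix_default and is_matrix_rows are only illustrative.

```r
set.seed(1)
x <- matrix(rnorm(15), 5, 3)
idx <- rbinom(5, 1, 0.5)

# For a matrix, split() treats x as a plain vector of 15 values and
# recycles the 5-element idx along it, so each group is a vector.
by_element <- split(x, idx)
is_matrix_default <- sapply(by_element, is.matrix)  # FALSE for every group

# Calling the data.frame method directly splits the row indices instead
# and subsets x by rows, so each piece keeps its matrix shape.
by_row <- split.data.frame(x, idx)
is_matrix_rows <- sapply(by_row, is.matrix)         # TRUE for every group
```

The group sizes differ between the two calls: the first one groups all 15 elements, while the second groups the 5 rows.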
thelatemail
  • Thanks for your solution. It is my fault that I did not make my purpose clear. I still want the split result stored as matrices, for efficiency reasons. split.data.frame() will still make each element of the list a data.frame, which is what I do not want. – David Lee Oct 31 '16 at 02:39
  • @DavidLee - `sapply(split.data.frame(x,idx), is.matrix)` returns `TRUE, TRUE` - the results are both matrices. There is also no internal conversion to `data.frame` at any stage from what I can determine by looking at the source code. – thelatemail Oct 31 '16 at 02:41
  • Hi @thelatemail, it is my fault. You are correct; split.data.frame() works for me. – David Lee Oct 31 '16 at 02:49

You can do this via row subsetting. Using your provided x and idx:

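# drop = FALSE keeps each part a matrix even if only one row matches
split_x <- list(x_a = x[idx == 1, , drop = FALSE],
                x_b = x[idx != 1, , drop = FALSE])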
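
The same idea extends to any number of groups. This is a rough sketch, not the only way to do it: split the row numbers by idx, then subset x with each set of rows. The names row_groups and split_x are only illustrative, and drop = FALSE is there so that a group with a single row stays a 1 x 3 matrix rather than collapsing to a vector.

```r
set.seed(1)
x <- matrix(rnorm(15), 5, 3)
idx <- rbinom(5, 1, 0.5)

# Split the row numbers (not the values) by group.
row_groups <- split(seq_len(nrow(x)), idx)

# Subset x by each group of rows; drop = FALSE preserves the matrix shape.
split_x <- lapply(row_groups, function(rows) x[rows, , drop = FALSE])
```

Each element of split_x is a matrix with the same three columns as x, named after the corresponding value of idx.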
alexwhitworth