How can I select the rows of one matrix that don't match the rows of another matrix? I want to train a model on a sample of my data and validate it on the remaining part. Thanks in advance.
-
One option is to convert the matrices to data.frames and use `?anti_join` from `library(dplyr)` – akrun Apr 03 '15 at 20:53
-
If you're creating the initial sample of rows, then you can simply use that to isolate two mutually exclusive matrices in the first place and avoid this problem entirely. – Thomas Apr 03 '15 at 20:58
-
Or add a column to your data that is either "test" or "train" (or 1 or 0) and just feed subsets to your model. – Gregor Thomas Apr 03 '15 at 21:25
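
The suggestions in the comments can be sketched in base R. This is only an illustration under stated assumptions: the names `full`, `train`, `row_key` and `role` are hypothetical, and the row-matching part is a base-R stand-in for an anti-join, not the `dplyr` function itself. It compares rows through pasted string keys, so it assumes that rows which should count as equal hold exactly the same values.

```r
# Hypothetical data: `full` holds every row, `train` is a subset of those rows
set.seed(1)
full <- matrix(rnorm(20), nrow = 10)
train <- full[c(2, 5, 7), , drop = FALSE]

# Row matching: build one string key per row, then keep the rows of `full`
# whose key does not occur among the keys of `train`
row_key <- function(m) apply(m, 1, paste, collapse = "|")
validation <- full[!(row_key(full) %in% row_key(train)), , drop = FALSE]

# Labelling: mark each row as "train" or "test" and subset on the label
role <- ifelse(seq_len(nrow(full)) %in% c(2, 5, 7), "train", "test")
train.rows <- full[role == "train", , drop = FALSE]
test.rows <- full[role == "test", , drop = FALSE]
```

Both routes should give the same split here. The labelling version avoids comparing row contents at all, which is usually simpler when you draw the sample yourself.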
1 Answer
You can use indexing for that (as hinted by Thomas). Say you have a 2000-row matrix and want to randomly select half of its rows:
# Create the matrix (2000 rows, 2 columns)
my.matrix <- matrix(rnorm(4000), nrow = 2000)
# Create a vector of 1000 row numbers, sampled without replacement
selection <- sample(1:2000, size = 1000)
# Create the 2 mutually exclusive matrices
matrix.1 <- my.matrix[selection, ]
matrix.2 <- my.matrix[-selection, ]
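
The same idea can be written so that it does not depend on the hard-coded row count. The sketch below is an assumption-laden variant, not part of the original answer: the seed, the `train.fraction` value and the names `train` and `validation` are all hypothetical choices made for illustration.

```r
set.seed(42)  # assumed seed, only so the split can be reproduced

my.matrix <- matrix(rnorm(4000), nrow = 2000)
n <- nrow(my.matrix)

# Hypothetical training share; sample() draws without replacement by default
train.fraction <- 0.5
selection <- sample(seq_len(n), size = floor(train.fraction * n))

# drop = FALSE keeps the result a matrix even if only one row is selected
train <- my.matrix[selection, , drop = FALSE]
validation <- my.matrix[-selection, , drop = FALSE]

# Sanity checks: every row lands in exactly one of the two matrices
stopifnot(nrow(train) + nrow(validation) == n)
stopifnot(!anyDuplicated(selection))
```

One caveat with negative indexing: if `selection` were empty, `my.matrix[-selection, ]` would return zero rows rather than all of them, so it is worth guarding against a sample size of 0.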

Dominic Comtois