I'm new to R (and to Stack Overflow) and would appreciate your help. I would like to count the number of occurrences of each unique column in a matrix. I have written the following code, but it is extremely slow:
frequencyofequalcolumnsinmatrix = function(matrixM){
  # Returns a matrix that contains each distinct column of matrixM, with the
  # frequency of that column appended as the last row.
  # Hence if the last row is c(3,5,3,2), then matrixM has 3+5+3+2=13 columns;
  # there are 4 distinct columns; the first distinct column appears 3 times,
  # the second distinct column appears 5 times, etc.
  columnswithfrequencyofmtxM = c()
  while (ncol(matrixM) > 0){
    # Columns equal to the first column: their difference with it is all zeros.
    # all(x == 0) is used so that integer matrices are handled as well.
    indexzero = which(apply(matrixM - matrixM[,1], 2, function(x) all(x == 0)))
    indexnotzero = setdiff(seq_len(ncol(matrixM)), indexzero)
    # Vector of length nrow(matrixM)+1: coordinates 1 to nrow(matrixM) hold the
    # distinct column, and coordinate nrow(matrixM)+1 holds its frequency.
    frequencyofgivencolumn = c(matrixM[,1], length(indexzero))
    columnswithfrequencyofmtxM = cbind(columnswithfrequencyofmtxM, frequencyofgivencolumn, deparse.level=0)
    # drop=FALSE keeps matrixM a matrix even when a single column remains
    matrixM = matrixM[, indexnotzero, drop=FALSE]
  }
  return(columnswithfrequencyofmtxM)
}
Applying it to the matrix testmtx gives:
> testmtx = matrix(c(1,2,4,0,1,1,1,2,1,1,2,4,0,1,1,0,1,1), nrow=3, ncol=6)
> frequencyofequalcolumnsinmatrix(testmtx)
     [,1] [,2] [,3]
[1,]    1    0    1
[2,]    2    1    2
[3,]    4    1    1
[4,]    2    3    1
where the last row contains the number of occurrences of the column above.
Unhappy with my code, I searched Stack Overflow and found the following question:
Fastest way to count occurrences of each unique element
According to the answers there, the fastest way to count the occurrences of each unique element of a vector is to use the data.table package. Here is the code:
library(data.table)
f6 <- function(x){
  # .N is the number of rows in each group; keyby = x groups by value and sorts
  data.table(x)[, .N, keyby = x]
}
When we run it we obtain:
> vtr = c(1,2,3,1,1,2,4,2,4)
> f6(vtr)
   x N
1: 1 3
2: 2 3
3: 3 1
4: 4 2
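For readers without data.table installed, the same per-element counts can be reproduced in base R with table(). This is only a sketch to show what f6 returns; it makes no claim about speed, and the name result is just an illustrative choice.

```r
# Same example vector as above
vtr <- c(1, 2, 3, 1, 1, 2, 4, 2, 4)

# table() counts how often each distinct value occurs (sorted by value)
counts <- table(vtr)

# Put the counts in the same two-column layout as the f6 output
result <- data.frame(x = as.numeric(names(counts)), N = as.vector(counts))
print(result)
```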
I have tried to adapt this code to my case. That would require building vtr as a vector in which each element is itself a vector, but I haven't been able to do that (most likely because, in R, c(c(1,2),c(3,4)) is the same as c(1,2,3,4)).
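As a small illustration of that flattening, and of one possible workaround: list() keeps each vector as a single element, and collapsing every column into one string turns the matrix into an ordinary vector with one element per column, which can then be counted. This is only a sketch; the string-key idea and the names keys, counts and result are assumptions, not taken from the linked question. It also assumes that paste() gives distinct strings for distinct columns, which holds for integer-valued entries but may not hold for doubles that differ only beyond about 15 significant digits.

```r
# c() flattens nested vectors, so the column structure is lost
flat <- c(c(1, 2), c(3, 4))
length(flat)      # 4 elements, not 2

# list() keeps each vector as one element
nested <- list(c(1, 2), c(3, 4))
length(nested)    # 2 elements

# Workaround sketch: one string key per column
testmtx <- matrix(c(1,2,4,0,1,1,1,2,1,1,2,4,0,1,1,0,1,1), nrow = 3, ncol = 6)
keys <- apply(testmtx, 2, paste, collapse = "_")

# Count each key, keeping the order of first appearance
counts <- table(factor(keys, levels = unique(keys)))

# Distinct columns on top, their frequencies in the last row
result <- rbind(testmtx[, !duplicated(keys), drop = FALSE],
                as.vector(counts), deparse.level = 0)
print(result)
```

The vector keys could also be passed to f6 instead of table(), since it is now a plain character vector.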
Should I try to modify the function f6? If so, how?
Or should I take a completely different approach? If so, which one?
Thank you!