I have a matrix with n rows of observations. Observations are frequency distributions of the features. I would like to transform the frequency distributions to probability distributions where the sum of each row is 1. Therefore each element in the matrix should be divided by the sum of the row of the element.
I wrote the following R function that does the work but it is very slow with large matrices:
prob_dist <- function(x) {
row_prob_dist <- function(row) {
return (t(lapply(row, function(x,y=sum(row)) x/y)))
}
for (i in 1:nrow(x)) {
if (i==1) p_dist <- row_prob_dist(x[i,])
else p_dist <- rbind(p_dist, row_prob_dist(x[i,]))
}
return(p_dist)
}
B = matrix(c(2, 4, 3, 1, 5, 7), nrow=3, ncol=2)
B
[,1] [,2]
[1,] 2 1
[2,] 4 5
[3,] 3 7
prob_dist(B)
[,1] [,2]
[1,] 0.6666667 0.3333333
[2,] 0.4444444 0.5555556
[3,] 0.3 0.7
Could you suggest R function that does the job and/or tell me how can I optimise my function to perform faster?