I am using the computeSVD() method of Spark's IndexedRowMatrix class (in Scala). I have noticed it has no setSeed() method.
I am getting slightly different results across multiple runs on the same input matrix, possibly due to the internal algorithm used by Spark. Although Spark also implements an approximate, scalable SVD algorithm, judging from the source code, computeSVD() from IndexedRowMatrix applies the exact version rather than the approximate one.
Since I am doing recommendations with the SVD results, and the user and item latent factor matrices differ between runs, I am actually getting different recommendation lists: in some runs roughly the same items appear in a different order, and in others a few new items enter the list while some drop out. This happens because the predicted ratings are often almost tied after imputing the missing entries of the input ratings matrix that is passed to computeSVD().
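As an illustration of the near-tie issue, here is a minimal plain-Scala sketch (no Spark involved) that ranks items after rounding the predicted ratings and breaks exact ties by item id. The names DeterministicTopN and topN and the decimals parameter are hypothetical, chosen only for this example. It only hides small numerical differences at the ranking step and does not make computeSVD() itself deterministic.

```scala
object DeterministicTopN {
  // Hypothetical helper: `scores` maps itemId -> predicted rating.
  // Ratings are rounded to `decimals` places, so differences below that
  // precision count as ties; ties are broken by ascending item id.
  def topN(scores: Map[Int, Double], n: Int, decimals: Int = 6): Seq[Int] = {
    val scale = math.pow(10, decimals)
    scores.toSeq
      .map { case (item, score) => (item, math.round(score * scale)) }
      // Sort by rounded score descending, then by item id ascending.
      .sortBy { case (item, rounded) => (-rounded, item) }
      .take(n)
      .map(_._1)
  }
}
```

Two runs whose predicted ratings differ only beyond the chosen precision would then yield the same list. Ratings that happen to straddle a rounding boundary can still land in different buckets, so this reduces the instability rather than removing it.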
Has anyone else had this problem? Is there a way to make this fully deterministic, or am I missing something?
Thanks