In order to use Spark's machine learning capabilities, I converted my training data to Spark vectors (DenseVector or SparseVector). I have to do some arithmetic (addition, multiplication with a scalar, dot product) on that data before I can feed it into Spark's fit function.
Spark's own vector classes don't seem to offer any arithmetic functions.
Spark allows converting its own vectors to Breeze (a Scala numerical processing library), which has all the bells and whistles, but it doesn't allow Breeze vectors to be converted back to Spark vectors.
Are there functions for doing arithmetic with Spark's vectors, or is there an easy and efficient way to convert Breeze vectors to Spark's vectors?
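To make concrete which operations I mean, here is a minimal sketch on plain Array[Double] values, which stand in for the contents of dense Spark vectors. The object and helper names (VectorArithmetic, add, scale, dot) are hypothetical and not part of Spark or Breeze. Copying through arrays like this would presumably work for dense data, but it discards sparsity, which is why I'm looking for something more direct.

```scala
// Hypothetical helpers, not part of Spark or Breeze:
// plain arrays stand in for the values held by dense Spark vectors.
object VectorArithmetic {
  // Element-wise addition; both inputs must have the same length.
  def add(a: Array[Double], b: Array[Double]): Array[Double] = {
    require(a.length == b.length, "vectors must have the same length")
    a.zip(b).map { case (x, y) => x + y }
  }

  // Multiplication of every element by a scalar.
  def scale(a: Array[Double], s: Double): Array[Double] = a.map(_ * s)

  // Dot product: the sum of the element-wise products.
  def dot(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same length")
    a.zip(b).map { case (x, y) => x * y }.sum
  }
}
```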
Update:
There's also a vector implementation in org.apache.spark.util that does support arithmetic, but it seems to be completely disconnected from the implementation in org.apache.spark.mllib.linalg, which is the one I'm interested in.