Actually, MatrixUDT has been part of o.a.s.mllib.linalg since 1.4 and was only recently copied to o.a.s.ml.linalg. Since it has never been public, you cannot even declare a correct schema, so I seriously doubt it is intended for general applications. Not to mention that the API is arguably too limited to be useful in practice.
Nevertheless, basic conversions work just fine, so all you need is an RDD or Seq of product types (once again, it is not possible to define the schema explicitly) and you're good to go:
// Outside spark-shell you also need: import spark.implicits._
import org.apache.spark.ml.linalg.Matrices
Seq((1, Matrices.dense(2, 2, Array(1, 2, 3, 4)))).toDF
// org.apache.spark.sql.DataFrame = [_1: int, _2: matrix]
Seq((1, Matrices.dense(2, 2, Array(1, 2, 3, 4)))).toDS
// org.apache.spark.sql.Dataset[(Int, org.apache.spark.ml.linalg.Matrix)]
// = [_1: int, _2: matrix]
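For completeness, here is a minimal standalone sketch of the same conversion outside spark-shell, including reading the matrix back out of a Row. The object name, the column names "id" and "m", and the local[1] master are my own assumptions for illustration, not anything the API requires.

```scala
import org.apache.spark.ml.linalg.{Matrices, Matrix}
import org.apache.spark.sql.SparkSession

object MatrixColumnSketch {
  // Builds a one-row DataFrame with a matrix column and reads the matrix back.
  def roundTrip(spark: SparkSession): Matrix = {
    import spark.implicits._ // provides toDF on a Seq of product types

    // The schema is inferred from the tuple type, so MatrixUDT is never referenced directly.
    // Column names "id" and "m" are hypothetical.
    val df = Seq((1, Matrices.dense(2, 2, Array(1.0, 2.0, 3.0, 4.0)))).toDF("id", "m")
    df.printSchema()

    // Extract the value as a plain ml.linalg.Matrix.
    df.first().getAs[Matrix]("m")
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("matrix-column-sketch")
      .getOrCreate()
    try {
      val m = roundTrip(spark)
      // Matrices.dense is column-major, so values fill column 0 first, then column 1.
      println(s"${m.numRows} x ${m.numCols}, m(0, 1) = ${m(0, 1)}")
    } finally {
      spark.stop()
    }
  }
}
```

Once the value is back as a Matrix you can work with it through the regular ml.linalg API, which sidesteps the limited column-level support.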