Using a UDF implies that each factor c1, c2, c3 must be passed as a separate parameter. Is there a more flexible solution, e.g. a way to pass a sequence of these factors to the UDF?
import org.apache.spark.sql.functions.udf

val myFunction = udf {
  (userBias: Float, productBias: Float, productBiases: Map[Long, Float],
   userFactors: Seq[Float], productFactors: Seq[Float], c1: String, c2: String, c3: String) =>
    var result = Float.NaN
    // result calculation
    result
}
And then I call this function the following way (dataset is a DataFrame):
myFunction(userBias("bias"),
  productBias("bias"),
  productBias("biases"),
  userFactors("features"),
  productFactors("features"),
  dataset(factors(0)), dataset(factors(1)), dataset(factors(2)))
If I do something like this, then the compiler says "Not applicable":
val myFactors = dataset.select(factors.head, factors.tail: _*)
myFunction(userBias("bias"),
productBias("bias"),
productBias("biases"),
userFactors("features"),
productFactors("features"),
myFactors)
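The second call presumably fails because myFactors is a DataFrame, while a UDF is applied to Column arguments. One possible direction, shown here only as a minimal sketch, is to wrap the factor columns in a single array column with array(...) and let the UDF take a Seq[String]. The object name SeqUdfSketch, the helper countFactors, and the one-row sample data are hypothetical placeholders for illustration; the real result calculation would go where countFactors is called. This assumes all factor columns have the same type (String here), since array needs a common element type.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{array, col, udf}

object SeqUdfSketch {
  // Hypothetical stand-in for the real calculation: it only counts the factors.
  def countFactors(cs: Seq[String]): Int = cs.length

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("seq-udf-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data with three string factor columns.
    val dataset = Seq(("a", "b", "c")).toDF("c1", "c2", "c3")
    val factors = Seq("c1", "c2", "c3")

    // The UDF receives all factors as one sequence instead of c1, c2, c3.
    val myFunction = udf { (cs: Seq[String]) => countFactors(cs) }

    // array(...) packs the selected columns into a single array column.
    val factorColumn = array(factors.map(name => col(name)): _*)
    dataset.withColumn("result", myFunction(factorColumn)).show()

    spark.stop()
  }
}
```

With this shape, adding or removing a factor only changes the factors sequence, not the UDF signature.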