I am trying to perform a Scala operation on Shark. I am creating an RDD as follows:
val tmp: shark.api.TableRDD = sc.sql2rdd("select duration from test")
I need to convert it to an RDD[Array[Double]]. I tried toArray, but it doesn't seem to work. I also tried converting it to Array[String] and then converting it using map, as follows:
val tmp_2 = tmp.map(row => row.getString(0))
val tmp_3 = tmp_2.map { row =>
val features = Array[Double] (row(0))
}
But this gives me a Spark RDD[Unit], which cannot be used in the function. Is there any other way to proceed with this type conversion?
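A note on where the Unit most likely comes from: in Scala, a block whose last statement is a val definition evaluates to Unit, so map infers Unit as the element type. The sketch below reproduces this with a plain Seq[String] standing in for the RDD (an assumption made so the example runs without Shark or Spark); the object and value names are hypothetical.

```scala
// Plain Scala collections stand in for the RDD here (assumption for illustration);
// map infers its result element type from the block in the same way.
object UnitBlockDemo {
  val rows: Seq[String] = Seq("296.98567", "230.84363")

  // The block ends with a val definition, so its value is Unit.
  val asUnit: Seq[Unit] = rows.map { row =>
    val features = Array[Double](row.toDouble)
  }

  // Ending the block with an expression makes the array the block's value.
  val asArrays: Seq[Array[Double]] = rows.map { row =>
    val features = Array[Double](row.toDouble)
    features
  }

  def main(args: Array[String]): Unit = {
    println(asUnit)
    asArrays.foreach(a => println(a.mkString("[", ", ", "]")))
  }
}
```

Returning the array as the last expression of the block, or dropping the intermediate val entirely, is enough to change the inferred element type.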
Edit: I also tried using toDouble, but this gives me an RDD[Double], not an RDD[Array[Double]]:
val tmp_5 = tmp_2.map(_.toDouble)
Edit 2:
I managed to do this as follows:
A sample of the data:
296.98567000000003
230.84362999999999
212.89751000000001
914.02404000000001
305.55383
A Shark TableRDD was created first:
val tmp = sc.sql2rdd("select duration from test")
I used getString to read each row's value as a String, converted it with toDouble, and wrapped the result in an Array[Double], which gives an RDD[Array[Double]]:
val duration = tmp.map(row => Array[Double](row.getString(0).toDouble))
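The same conversion can be tried without a Shark context. This is a minimal sketch that assumes each column value is a well-formed decimal string like the sample above; toFeatures and DurationConversion are hypothetical names, and a plain Seq[String] stands in for the values returned by getString(0).

```scala
object DurationConversion {
  // Hypothetical helper: parse one column value into a one-element feature array.
  // Throws NumberFormatException if the value is not a valid number.
  def toFeatures(value: String): Array[Double] =
    Array(value.trim.toDouble)

  def main(args: Array[String]): Unit = {
    // Values taken from the sample data above.
    val sample = Seq("296.98567000000003", "230.84362999999999", "212.89751000000001")

    // Each string becomes its own Array[Double] with a single element.
    val features: Seq[Array[Double]] = sample.map(toFeatures)

    features.foreach(a => println(a.mkString("[", ", ", "]")))
  }
}
```

With the TableRDD, the helper would be applied as tmp.map(row => toFeatures(row.getString(0))), which is equivalent to the one-liner above.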