I have a DataFrame:
val DF = {spark.read.option("header", value = true).option("delimiter", ";").csv(path_file)}
val cord = DF.select("time","longitude", "latitude","speed")
I want to calculate the z-score, (x - mean) / std, for each row of the speed column. I have already calculated the mean and standard deviation:
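import org.apache.spark.sql.functions.col
import spark.implicits._ // provides the encoder needed by .as[Double]
val std = DF.select(col("speed").cast("double")).as[Double].rdd.stdev() // population standard deviation
val mean = DF.select(col("speed").cast("double")).as[Double].rdd.mean()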
How can I calculate the z-score for each row of the speed column and obtain a result like this:
+----------------+----------+-----+-------+
|A               |B         |speed|z score|
+----------------+----------+-----+-------+
|17/02/2020 00:06|-7.1732833|50   |z score|
|17/02/2020 00:16|-7.1732833|40   |z score|
|17/02/2020 00:26|-7.1732833|30   |z score|
+----------------+----------+-----+-------+
How can I compute it for each row?
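For reference, here is a minimal sketch of one way this could be done with a column expression; it is a possible approach, not necessarily the best one. The sample rows, the object name `ZScoreSketch`, and the output column name `z_score` are assumptions made for illustration; the real data would come from the CSV. It uses `stddev_pop`, which should correspond to the population standard deviation that `rdd.stdev()` returns (`stddev` in `functions` is the sample version).

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{avg, col, stddev_pop}

object ZScoreSketch {
  def main(args: Array[String]): Unit = {
    // Local session for the sketch only; in spark-shell `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("zscore-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical rows standing in for the CSV. Values are strings,
    // which is what the CSV reader produces without schema inference.
    val cord = Seq(
      ("17/02/2020 00:06", "-7.1732833", "50"),
      ("17/02/2020 00:16", "-7.1732833", "40"),
      ("17/02/2020 00:26", "-7.1732833", "30")
    ).toDF("time", "longitude", "speed")

    // Cast once so the arithmetic is done on doubles, not strings.
    val speed = col("speed").cast("double")

    // Compute both statistics in a single aggregation.
    val stats = cord.agg(avg(speed), stddev_pop(speed)).first()
    val mean = stats.getDouble(0)
    val std = stats.getDouble(1)

    // (x - mean) / std, evaluated per row as a column expression.
    val withZ = cord.withColumn("z_score", (speed - mean) / std)
    withZ.show(false)

    spark.stop()
  }
}
```

If `mean` and `std` have already been computed as in the question, only the `withColumn` line is needed. Note that the sketch does not guard against an empty DataFrame or a standard deviation of zero, where the z-score would be undefined.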