Hi all, I'm new to Spark with Scala. I have a DataFrame (DF) with two columns, "UG" and "Counts", and I would like to obtain a third column, as shown in this list.
DF: UG, Counts, CUG (the columns)
- of 12 4
- of 23 4
- the 134 3
- love 68 2
- pain 3 1
- the 18 3
- love 100 2
- of 23 4
- the 12 3
- of 11 4
I need to add a new column called "CUG" (the third one shown above), where CUG(i) is the number of times that the string UG(i) appears in the whole "UG" column.
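To make the expected result concrete, here is a minimal sketch of the same computation on plain Scala collections, without Spark. The object name CugSketch and the value names are only illustrative; the rows are copied from the list above.

```scala
// Hypothetical names, used only to illustrate the expected CUG values.
object CugSketch {
  // (UG, Counts) rows copied from the list above.
  val rows: Seq[(String, Int)] = Seq(
    ("of", 12), ("of", 23), ("the", 134), ("love", 68), ("pain", 3),
    ("the", 18), ("love", 100), ("of", 23), ("the", 12), ("of", 11)
  )

  // How many rows carry each distinct UG value.
  val occurrences: Map[String, Int] =
    rows.groupBy(_._1).map { case (ug, group) => ug -> group.size }

  // Attach that number to every row, giving (UG, Counts, CUG).
  val withCug: Seq[(String, Int, Int)] =
    rows.map { case (ug, counts) => (ug, counts, occurrences(ug)) }
}
```

For example, every "of" row should end up with CUG = 4 and the single "pain" row with CUG = 1, which matches the third column in the list.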
I tried the following approach.
With the table above loaded in df, I wrote a SQL UDF to count the number of times that a string appears in the column "UG", that is:
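// Counts the rows of df whose "UG" matches w1 with its first and last characters dropped
val NW1: String => Long = (w1: String) => {
  df.filter($"UG".like(w1.substring(1, w1.length - 1))).count()
}
val sqlfunc = udf(NW1)
val df2 = df.withColumn("CUG", sqlfunc(col("UG")))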
But when I tried it, it didn't work: I got a NullPointerException. The function works when called on its own, but not when applied as a UDF within the DF. What can I do to obtain the desired result using DataFrames?
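One likely explanation, offered as an assumption rather than a confirmed diagnosis: the UDF body refers to df, and a DataFrame can only be used on the driver, not inside a function that Spark runs on the executors. A common way to avoid the UDF altogether is a window count partitioned by "UG". The sketch below assumes a local SparkSession and uses hypothetical names (CugWindowSketch, addCug); the sample rows are the ones from the list above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, count}

// Hypothetical object and method names, used only for this sketch.
object CugWindowSketch {

  // Adds "CUG": the number of rows that share the same "UG" value.
  def addCug(df: DataFrame): DataFrame =
    df.withColumn("CUG", count(col("UG")).over(Window.partitionBy("UG")))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, just to try the sketch out.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("cug-sketch")
      .getOrCreate()
    import spark.implicits._

    // Sample rows copied from the list above.
    val df = Seq(
      ("of", 12), ("of", 23), ("the", 134), ("love", 68), ("pain", 3),
      ("the", 18), ("love", 100), ("of", 23), ("the", 12), ("of", 11)
    ).toDF("UG", "Counts")

    addCug(df).show()
    spark.stop()
  }
}
```

The window keeps every original row and only appends the count, but the row order of the result is not guaranteed. Another option along the same lines would be df.groupBy("UG").count() joined back to df on "UG".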
Thanks in advance. jm3