I'm new to Spark with Scala. I would really appreciate it if someone could help me here. I have a DataFrame called df.
df.printSchema()
root
|-- tab: string (nullable = true)
|-- cust: string (nullable = true)
|-- date: string (nullable = true)
|-- uniqIds: string (nullable = true)
df.show()
+-------+----+----------+--------------------+
| tab|cust| date| uniqIds|
+-------+----+----------+--------------------+
|t_users| abc|2018050918|[123, 1234, 22123] |
|t_users| def|2018050918|[1sdf23, 12f34] |
+-------+----+----------+--------------------+
Now I want to loop through each record and do some processing, that is, kick off another function/process based on the first 3 columns. If that process succeeds, I want to store all the values from the uniqIds column into a DataFrame. Once I have all the uniqIds from the successful processes, I will write them to a file.
var uq = Seq((lit(e))).toDF("unique_id")
df.foreach { row =>
  val uniqIds: Array[String] = row(3).toString.replace("[", "").replace("]", "").replace(" ", "").split(",")
  uniqIds.foreach { e =>
    var df2 = Seq((lit(e))).toDF("unique_id")
    uq.union(df2)
  }
}
But when I try doing that, I get an error message:
ERROR Executor:91 - Exception in task 1.0 in stage 11.0 (TID 23) java.lang.NullPointerException
Has anyone run into the same problem? How can I overcome it? Thanks in advance.
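For reference, here is a driver-side sketch of what I'm ultimately trying to do. `processRecord` is a placeholder for my processing function (I'm assuming it takes the first three columns and returns a Boolean for success), and the output path is made up. Would something along these lines be more idiomatic than building DataFrames inside foreach?

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.getOrCreate()
import spark.implicits._

// Placeholder for my processing function; assumed to return true on success.
def processRecord(tab: String, cust: String, date: String): Boolean = ???

// Collect the rows to the driver (df is small in my case), run the
// process per row, and keep the uniqIds of the successful rows only.
val successIds: Seq[String] = df.collect().toSeq
  .filter(row => processRecord(row.getString(0), row.getString(1), row.getString(2)))
  .flatMap(row =>
    row.getString(3).stripPrefix("[").stripSuffix("]").split(",").map(_.trim))

// Turn the collected ids into a single-column DataFrame and write them out.
val uq = successIds.toDF("unique_id")
uq.write.text("/tmp/unique_ids_out")
```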