
I am trying to filter a column of a DataFrame read from Oracle, as shown below:

import org.apache.spark.sql.functions.{col, lit, when}

val df0  =  df_org.filter(col("fiscal_year").isNotNull())

When I do, I get the following error:

java.lang.RuntimeException: Unsupported literal type class scala.runtime.BoxedUnit ()
at org.apache.spark.sql.catalyst.expressions.Literal$.apply(literals.scala:77)
at org.apache.spark.sql.catalyst.expressions.Literal$$anonfun$create$2.apply(literals.scala:163)
at org.apache.spark.sql.catalyst.expressions.Literal$$anonfun$create$2.apply(literals.scala:163)
at scala.util.Try.getOrElse(Try.scala:79)
at org.apache.spark.sql.catalyst.expressions.Literal$.create(literals.scala:162)
at org.apache.spark.sql.functions$.typedLit(functions.scala:113)
at org.apache.spark.sql.functions$.lit(functions.scala:96)
at org.apache.spark.sql.Column.apply(Column.scala:212)
at com.snp.processors.BenchmarkModelValsProcessor2.process(BenchmarkModelValsProcessor2.scala:80)
at com.snp.utils.Utils$$anonfun$getAllDefinedProcessors$1.apply(Utils.scala:30)
at com.snp.utils.Utils$$anonfun$getAllDefinedProcessors$1.apply(Utils.scala:30)
at com.sp.MigrationDriver$$anonfun$main$6$$anonfun$apply$2.apply(MigrationDriver.scala:140)
at com.sp.MigrationDriver$$anonfun$main$6$$anonfun$apply$2.apply(MigrationDriver.scala:140)
at scala.Option.map(Option.scala:146)
at com.sp.MigrationDriver$$anonfun$main$6.apply(MigrationDriver.scala:138)
at com.sp.MigrationDriver$$anonfun$main$6.apply(MigrationDriver.scala:135)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at scala.collection.MapLike$DefaultKeySet.foreach(MapLike.scala:174)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
at com.sp.MigrationDriver$.main(MigrationDriver.scala:135)
at com.sp.MigrationDriver.main(MigrationDriver.scala)

Any idea what I am doing wrong here and how to fix it?

hasumedic
BdEngineer
  • please add more information about versions of Spark, and Spark Cassandra connector... – Alex Ott Nov 19 '18 at 13:19
  • @AlexOtt , sir Here are the version details : scala - 2.11 spark - 2.3.1 cassandra - 3.11.1 – BdEngineer Nov 19 '18 at 13:38
  • and spark-cassandra-connector version? – Alex Ott Nov 19 '18 at 13:58
  • what do you mean by "trying to filter a column of a dataframe"? can you elaborate that? – Ramesh Maharjan Nov 19 '18 at 15:08
  • @RameshMaharjan , column "fiscal_year" seems to have some null values , hence failing to load into cassandra ...so from dataframe filtering out those records. – BdEngineer Nov 19 '18 at 16:55
  • @AlexOtt sir, its spark-cassandra-connector_2.11 2.3.0 – BdEngineer Nov 19 '18 at 16:58
  • 1
    check this https://stackoverflow.com/questions/39727742/how-to-filter-out-a-null-value-from-spark-dataframe for filtering and you can check my answer too https://stackoverflow.com/questions/50478512/filter-null-value-in-dataframe-column-of-spark-scala – Ramesh Maharjan Nov 20 '18 at 04:22
  • @RameshMaharjan I am getting similar error while filtering ...how to fix it .... result_df.filter( col("indicator") === lit('N')) .... ERROR ::: RuntimeException: Unsupported literal type class java.lang.Character N – BdEngineer Dec 11 '18 at 13:14
  • isn't the error message clear enough? @user3252097 ? character is not supported in lit function – Ramesh Maharjan Dec 11 '18 at 15:21
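Regarding the java.lang.Character error mentioned in the last comments: a Scala single-quoted literal such as 'N' is a Char, which boxes to java.lang.Character, and Spark's lit has no case for that type; a String literal works. A minimal sketch of the distinction (the result_df and indicator names are taken from the comment above):

```scala
// 'N' is a Char, boxed to java.lang.Character -- a type Spark's lit()
// does not handle, hence:
//   RuntimeException: Unsupported literal type class java.lang.Character N
// Use a String literal instead, e.g.:
//   result_df.filter(col("indicator") === lit("N"))
val asChar: Any = 'N'   // boxed to java.lang.Character
val asString: Any = "N" // java.lang.String, a supported literal type
println(asChar.getClass.getName)
println(asString.getClass.getName)
```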

2 Answers


Just remove the parentheses from the method call:

from:
val df0 = df_org.filter(col("fiscal_year").isNotNull())
to:
val df0 = df_org.filter(col("fiscal_year").isNotNull)
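The reason: isNotNull is a parameterless method that returns a Column. With the extra (), the compiler adapts the call to Column.apply(()), and Column.apply passes its argument through lit, which has no case for Unit (BoxedUnit), hence the exception and the Column.apply / lit frames in the stack trace. A minimal, simplified model of this mechanism (FakeColumn is a hypothetical stand-in for Spark's Column, not the real class):

```scala
// Hypothetical, simplified stand-in for Spark's Column, illustrating why
// the extra () blows up. Not the real Spark implementation.
class FakeColumn(val expr: String) {
  // Parameterless method, like Column.isNotNull
  def isNotNull: FakeColumn = new FakeColumn(s"($expr IS NOT NULL)")

  // Like Column.apply(extraction: Any), which wraps its argument in lit()
  def apply(extraction: Any): FakeColumn = extraction match {
    case _: Unit => throw new RuntimeException(
      "Unsupported literal type class scala.runtime.BoxedUnit ()")
    case other => new FakeColumn(s"$expr[$other]")
  }
}

val c = new FakeColumn("fiscal_year")
c.isNotNull                // fine: just returns the predicate column
// c.isNotNull() is adapted by the compiler to c.isNotNull.apply(()),
// i.e. apply called with Unit, which is what throws:
// c.isNotNull.apply(())   // RuntimeException: Unsupported literal type ...
```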

Mpizos Dimitris

Try removing the () from isNotNull in your filter.

Vlad
enoh