
I have the following dataframe. I want to compare the value in a column to a string and use the result as the condition in an if-else statement.

dataframe:

id   type    name
1    fruit   apple
2    toy     football
3    fruit   orange

What I am hoping to achieve is:

if(df("type") == "fruit"){
    //do something to the df
}else if ( df("type") == "toy"){
    //something else to the df
}

I tried to use val type= df.select("type").collectAsList().getString(0) but that's not correct. Could someone please help? Many thanks.

I don't think this is a duplicate of "Spark: Add column to dataframe conditionally", as I do not want to add a new column and I do not wish to use withColumn.

user3735871
  • df("type") is a column... you can't compare it to a string. E.g. if you compare it to "fruit", should it be true or false? Rows 1 and 3 return true while row 2 returns false. – mck Mar 26 '21 at 07:06
  • The question is what you want to do with the df in those if-else statements. – Matt Mar 26 '21 at 09:47
  • Anything you do to df inside an if..else statement happens to the entire df, not only to the rows with fruit or toy, unless you first filter those rows into a new df. – Mohd Avais Mar 26 '21 at 12:28

1 Answer


An implicit class on DataFrame should do the trick; see the code below (ignore my crazy imports).

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions._
import spark.implicits._ // needs a SparkSession named spark in scope, e.g. in spark-shell

implicit class typeTransform(df: DataFrame) {
    // Branches on whether any row of df has the given type.
    // Note: each count triggers a Spark job over df.
    def transformByType(typeCol: Column): DataFrame = {
      if (df.filter(typeCol === lit("fruit")).count > 0) df // do something to df
      //........ more if statements
      else df // do something else with df
    }
}

Usage can be something like this:

val someDf = Seq(("fruit", "1"), ("toy", "2")).toDF("type", "id").transformByType(col("type"))
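
If a plain if-else on the driver is what you are after, another option might be to collect the distinct values of the type column first and branch on that set. The following is only a minimal sketch: it assumes a local SparkSession, a non-null string column called "type" with few distinct values, and the names BranchOnType, distinctTypes and branch are made up for illustration. The question doesn't say what "do something" is, so each branch simply keeps the matching rows as a placeholder.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object BranchOnType {
  // Collects the distinct values of the "type" column to the driver.
  // Assumption: the column holds non-null strings and has few distinct values.
  def distinctTypes(df: DataFrame): Set[String] =
    df.select("type").distinct().collect().map(_.getString(0)).toSet

  // Hypothetical placeholder branches: each one just keeps the matching rows.
  def branch(df: DataFrame): DataFrame = {
    val types = distinctTypes(df)
    if (types.contains("fruit")) df.filter(col("type") === "fruit")
    else if (types.contains("toy")) df.filter(col("type") === "toy")
    else df
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, for demonstration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("branch-on-type")
      .getOrCreate()
    import spark.implicits._

    val df = Seq((1, "fruit", "apple"), (2, "toy", "football"), (3, "fruit", "orange"))
      .toDF("id", "type", "name")

    branch(df).show()
    spark.stop()
  }
}
```

Because the condition is a Scala Set rather than a Column, the if-else behaves like an ordinary boolean test. Note that it answers "does any row have this type?" for the whole dataframe; to treat fruit rows and toy rows differently, filter each subset into its own dataframe first.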
Sathvik