I have a question about Spark in Scala.

val spark = SparkSession.builder
    .master("local[1]")
    .appName("SparkByExamples.com")
    .getOrCreate()

val data = Seq(("James","Smith","USA","CA"),
  ("Michael","Rose","USA","NY"),
  ("Robert","Williams","USA","CA"),
  ("Maria","Jones","USA","FL")
  )

val columns = Seq("firstname","lastname","country","state")
import spark.implicits._
val df = data.toDF(columns:_*)

df.show(false)
firstname  lastname  country  state
James      Smith     USA      CA
Michael    Rose      USA      NY
Robert     Williams  USA      CA
Maria      Jones     USA      FL

I would like to filter this DataFrame on state, keeping only the rows for CA and FL. Is it possible to filter with a Seq? I'm looking for something like the following, with this expected result. Thanks!

df.filter("state" === Seq("CA", "FL"))
firstname  lastname  country  state
James      Smith     USA      CA
Robert     Williams  USA      CA
Maria      Jones     USA      FL
KKK

1 Answer

Got it. Column.isin takes varargs, so the Seq has to be expanded with :_* when it is passed in:

val selected = Seq("CA","FL") // the states asked for in the question
df.filter($"state".isin(selected:_*)).show
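
For comparison, here is a minimal sketch of three ways to express the same filter, written script-style for spark-shell like the snippets above. The names wanted, viaIsin, viaCollection, viaSql and the app name are made up for illustration. isInCollection is assumed to be available, which requires Spark 2.4 or later, and the SQL-string variant quotes values naively, so it is only reasonable for trusted, hard-coded values.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder
  .master("local[1]")
  .appName("FilterBySeq") // hypothetical app name
  .getOrCreate()
import spark.implicits._

// Same sample data as in the question
val df = Seq(
  ("James", "Smith", "USA", "CA"),
  ("Michael", "Rose", "USA", "NY"),
  ("Robert", "Williams", "USA", "CA"),
  ("Maria", "Jones", "USA", "FL")
).toDF("firstname", "lastname", "country", "state")

val wanted = Seq("CA", "FL")

// 1. isin takes varargs, so the Seq is expanded with : _*
val viaIsin = df.filter(col("state").isin(wanted: _*))

// 2. isInCollection accepts the collection directly (assumes Spark 2.4+)
val viaCollection = df.filter(col("state").isInCollection(wanted))

// 3. SQL expression string; the quoting here is naive, trusted values only
val inList = wanted.map(s => s"'$s'").mkString(", ")
val viaSql = df.filter(s"state IN ($inList)")

viaIsin.show(false)
```

All three should keep the two CA rows and the FL row. The Column-based forms are usually the safer choice because the values never pass through SQL string parsing.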
KKK