I have a question about Spark in Scala.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .master("local[1]")
  .appName("SparkByExamples.com")
  .getOrCreate()

// Sample rows: first name, last name, country, state
val data = Seq(
  ("James", "Smith", "USA", "CA"),
  ("Michael", "Rose", "USA", "NY"),
  ("Robert", "Williams", "USA", "CA"),
  ("Maria", "Jones", "USA", "FL")
)
val columns = Seq("firstname", "lastname", "country", "state")

// Needed for the toDF conversion on a Seq of tuples
import spark.implicits._
val df = data.toDF(columns: _*)
df.show(false)
firstname | lastname | country | state |
---|---|---|---|
James | Smith | USA | CA |
Michael | Rose | USA | NY |
Robert | Williams | USA | CA |
Maria | Jones | USA | FL |
I would like to filter this DataFrame on the state column, keeping only the rows where state is CA or FL. Is it possible to filter with a Seq, something like the line below, so that I get the result shown after it? Any ideas? Thanks.
df.filter("state" === Seq("CA", "FL"))
firstname | lastname | country | state |
---|---|---|---|
James | Smith | USA | CA |
Robert | Williams | USA | CA |
Maria | Jones | USA | FL |
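One approach that should give this result is `Column.isin`, which accepts a variable number of values, so a Seq can be passed by expanding it with `: _*`. The snippet below is a minimal sketch rather than the only way to do it: the app name `FilterBySeq` and the variable names `states` and `filtered` are made up for illustration, and the data simply repeats the rows from the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder
  .master("local[1]")
  .appName("FilterBySeq") // hypothetical app name, for illustration only
  .getOrCreate()
import spark.implicits._

// Same rows as in the question
val df = Seq(
  ("James", "Smith", "USA", "CA"),
  ("Michael", "Rose", "USA", "NY"),
  ("Robert", "Williams", "USA", "CA"),
  ("Maria", "Jones", "USA", "FL")
).toDF("firstname", "lastname", "country", "state")

// The values to keep (hypothetical variable name)
val states = Seq("CA", "FL")

// isin takes varargs, so the Seq is expanded with `: _*`
val filtered = df.filter(col("state").isin(states: _*))
filtered.show(false)
```

Note that a plain string such as `"state"` is not a column reference; it has to be wrapped with `col("state")` or written as `$"state"` (the latter needs `import spark.implicits._`). On Spark 2.4 or later, `col("state").isInCollection(states)` should also work and takes the Seq directly.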