I am new to Spark programming. I am trying to extract values from an RDD whose contents currently look like this:

(CBI10006,(Some(Himanshu Vasani),None))
(CBI10004,(Some(Sonam Petro),Some(8500)))
(CBI10003,(None,Some(3000)))

I want to turn those values into the following:

(CBI10006,Himanshu Vasani,'')
(CBI10004,Sonam Petro,8500)
(CBI10003,'',3000)

I tried the flatMap approach below:

joined.flatMap{case(f1,f2) => (f1,(f2._1,f2._2))}

but I get the following error:

type mismatch;
 found   : (String, (Option[String], Option[String]))
 required: TraversableOnce[?]
    joined.flatMap{case(f1,f2) => (f1,(f2._1,f2._2))}
Gabio

1 Answer

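Use map() instead of flatMap(). flatMap() expects the function to return a collection (a TraversableOnce) that it then flattens, which is why a plain tuple is rejected. Here each record produces exactly one output, so map() fits: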

// Sample data in the same shape as the joined RDD: (id, (Option[name], Option[amount]))
val data = Seq(
  ("CBI10006", (Some("Himanshu Vasani"), None)),
  ("CBI10004", (Some("Sonam Petro"), Some(8500))),
  ("CBI10003", (None, Some(3000)))
)

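// Assumes `spark` is an existing SparkSession, as in spark-shell
spark.sparkContext
  .parallelize(data)
  // Unwrap each Option, substituting an empty string for None
  .map { case (x, y) => (x, y._1.getOrElse(""), y._2.getOrElse("")) }
  .foreach(println)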

// output (line order may vary across partitions):
// (CBI10006,Himanshu Vasani,)
// (CBI10004,Sonam Petro,8500)
// (CBI10003,,3000)
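
For comparison, here is a minimal sketch of how map and flatMap differ on this data. It uses a plain Scala Seq in place of the RDD so that it runs without Spark; that substitution, the object name ExtractOptions, and the helper flatten are assumptions made for illustration only. The element type Option[Int] for the amount follows the sample data above; the error message in the question suggests both fields may be Option[String] in the real RDD, in which case the .map(_.toString) step would be unnecessary.

```scala
// Plain Scala collections stand in for the RDD (an assumption for
// illustration); map and flatMap have the same shape on an RDD.
object ExtractOptions {
  type Row = (String, (Option[String], Option[Int]))

  val joined: Seq[Row] = Seq(
    ("CBI10006", (Some("Himanshu Vasani"), None)),
    ("CBI10004", (Some("Sonam Petro"), Some(8500))),
    ("CBI10003", (None, Some(3000)))
  )

  // Hypothetical helper: unwrap both Options into Strings,
  // using "" where the value is None.
  def flatten(row: Row): (String, String, String) = row match {
    case (id, (name, amount)) =>
      (id, name.getOrElse(""), amount.map(_.toString).getOrElse(""))
  }

  // One output per input record: map is the natural choice.
  val viaMap: Seq[(String, String, String)] = joined.map(flatten)

  // flatMap compiles only when the function returns a collection,
  // e.g. a one-element Seq, which flatMap then flattens away.
  val viaFlatMap: Seq[(String, String, String)] =
    joined.flatMap(row => Seq(flatten(row)))

  def main(args: Array[String]): Unit = viaMap.foreach(println)
}
```

Converting the amount to a String keeps the result typed as (String, String, String) rather than (String, String, Any), which is what getOrElse("") on an Option[Int] would infer.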
Gabio