
My function returns finalDF, a sequence of DataFrames. In the loop shown below, map returns a Seq[DataFrame], which is stored in finalDF so that it can be returned to the caller. In some cases there is further processing, so I would like to keep the filtered DataFrame from each iteration and pass it on to the next step.

How do I do that? If I try to assign it to a temporary val, the compiler reports that an expression of type Seq[Unit] does not conform to the expected type Seq[DataFrame].

var finalDF: Seq[DataFrame] = null

for (i <- 0 until stop) {
  finalDF = strataCount(i).map(x => {
    df.filter(df(cols(i)) === x)

    // how to get the above data frame to pass on to the next computation?
  })
}

Regards

チーズパン
Garipaso

1 Answer


Maybe this is helpful:

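val finalDF: Seq[DataFrame] = (0 until stop).flatMap(i => strataCount(i).map(x => df.filter(df(cols(i)) === x))).toSeq

flatMap flattens the resulting Seq[Seq[DataFrame]] into a single Seq[DataFrame].

(0 until stop) iterates from 0 to stop - 1, as the loop in the question does (0 to stop would also include stop itself), and flatMap flattens the inner collections, like: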

scala> (0 to 20).flatMap(i => List(i))
res0: scala.collection.immutable.IndexedSeq[Int] = Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
scala> (0 to 20).map(i => List(i)).flatten
res1: scala.collection.immutable.IndexedSeq[Int] = Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
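
The REPL example above wraps each number in a one-element List, so the flattening is hard to see. Here is a minimal sketch with plain Scala collections that makes the difference between map and flatMap visible. The `strata` value is a hypothetical stand-in for strataCount, and strings stand in for the filtered DataFrames:

```scala
object FlatMapDemo {
  // Hypothetical stand-in for strataCount: one inner Seq per column index.
  val strata: Seq[Seq[String]] = Seq(Seq("a", "b"), Seq("c"), Seq("d", "e"))

  // map keeps the nesting: one inner Seq per index.
  val nested: Seq[Seq[String]] =
    strata.indices.map(i => strata(i).map(x => s"$i:$x"))

  // flatMap concatenates the inner Seqs into one flat Seq.
  val flat: Seq[String] =
    strata.indices.flatMap(i => strata(i).map(x => s"$i:$x"))

  def main(args: Array[String]): Unit = {
    println(nested)
    println(flat)
  }
}
```

With map alone the result type is Seq[Seq[String]]; flatMap is what brings it down to Seq[String], and `nested.flatten` gives the same elements as `flat`.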

For two counters, maybe you can do it like this:

(0 until stop).flatMap(j => {
  (0 until stop).flatMap(i => strataCount(i).map(x => df.filter(df(cols(i)) === x)))
}).toSeq
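
On the Seq[Unit] error from the question: a Scala block whose last statement is a val definition has type Unit, so a lambda that only assigns the filtered result to a val makes map return Seq[Unit]. One way around it is to bind the intermediate result to a val and end the block with the expression you want to return. The sketch below uses plain collections so it runs without Spark. `Row`, the sample data, and `nextStep` are hypothetical stand-ins for the DataFrame, its contents, and the follow-up computation:

```scala
object NextStepDemo {
  // Hypothetical stand-ins: a Seq of Maps plays the role of the DataFrame.
  type Row = Map[String, String]

  val df: Seq[Row] = Seq(
    Map("region" -> "north", "tier" -> "gold"),
    Map("region" -> "south", "tier" -> "gold"),
    Map("region" -> "north", "tier" -> "basic")
  )
  val cols: Seq[String] = Seq("region", "tier")
  val strataCount: Seq[Seq[String]] =
    Seq(Seq("north", "south"), Seq("gold", "basic"))

  // Hypothetical follow-up computation applied to each filtered subset.
  def nextStep(subset: Seq[Row]): Seq[Row] = subset.take(1)

  val finalDF: Seq[Seq[Row]] = cols.indices.flatMap { i =>
    strataCount(i).map { x =>
      // Bind the intermediate result so it can be reused.
      val filtered = df.filter(row => row(cols(i)) == x)
      // The last expression is the value the lambda returns.
      nextStep(filtered)
    }
  }

  def main(args: Array[String]): Unit =
    finalDF.foreach(println)
}
```

If the block stopped after `val filtered = ...`, its type would be Unit and the compiler would report the mismatch described in the question.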

Or try a for/yield comprehension; see: Scala for/yield syntax
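
As a rough illustration of the for/yield form, the sketch below writes the same traversal twice, once as a flatMap/map chain and once as a for comprehension. It again uses plain collections. The sample `strataCount` is hypothetical, and the tuples stand in for the filtered DataFrames:

```scala
object ForYieldDemo {
  // Hypothetical sample data in place of the real strataCount.
  val strataCount: Seq[Seq[String]] = Seq(Seq("a", "b"), Seq("c"))

  // Explicit flatMap/map chain.
  val viaFlatMap: Seq[(Int, String)] =
    strataCount.indices.flatMap(i => strataCount(i).map(x => (i, x)))

  // Same traversal as a for comprehension: every generator except the
  // last is translated to flatMap, and the last one to map.
  val viaFor: Seq[(Int, String)] =
    for {
      i <- strataCount.indices
      x <- strataCount(i)
    } yield (i, x)

  def main(args: Array[String]): Unit = {
    println(viaFlatMap)
    println(viaFor)
  }
}
```

Both values hold the same elements in the same order. The for form tends to read better once there are several generators, since each "counter" becomes one line inside the braces.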

Community
chengpohi
Thanks, this has been really helpful. If you don't mind, could you please explain this syntax? One more thing: I would like to apply another loop over all the values in the sequence finalDF. Is it possible to specify two counters here? Thanks – Garipaso Nov 29 '16 at 07:42