I get the following compile error:

`Error: Type mismatch. Required: (sql.DataFrame, String) => sql.DataFrame, found: (sql.DataFrame, String) => Any`

I am trying to traverse all the columns of a DataFrame, so I used `foldLeft`. I need to replace data based on the column type: for example, if the column is of Integer type, perform one operation, and if it is of another type, perform a different one. However, I get the type mismatch error above when I use conditions inside `foldLeft`. Could someone please help?

```
val actualDF = nonullDF
    .columns
    .foldLeft(nonullDF) { (memoDF, colName) =>
      if (memoDF.schema("colName").dataType == IntegerType) {
        memoDF.withColumn(
          colName,
          when(col("colName") === "?",
            (memoDF.select(avg("colName")).head().getInt(0)))
            .otherwise(col("colName")))
      }
      else if (memoDF.schema("colName").dataType == DoubleType) {
        memoDF.withColumn(
          colName,
          when(col("colName") === "?",
            (memoDF.select(avg("colName")).head().getDouble(0)))
            .otherwise(col("colName")))
      }
      else if (memoDF.schema("colName").dataType == StringType) {
        memoDF.withColumn(
          colName,
          when(col("colName") === "?", memoDF.groupBy(col("colName")).count().orderBy(desc("count")).first()(0))
            .otherwise(col("colName")))
      }
    }
```

1 Answer

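That's because the `if`/`else if` chain has no final `else` block. When none of the conditions match, the expression evaluates to `Unit`, so the compiler infers the result type of the function passed to `foldLeft` as `Any` (the common supertype of `DataFrame` and `Unit`) rather than `DataFrame`.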
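
The effect of the missing `else` can be reproduced without Spark. The following is a minimal sketch in plain Scala; `Frame`, `MissingElseDemo`, `withoutElse` and `withElse` are hypothetical names made up for illustration, with `Frame` standing in for `DataFrame`.

```scala
// Hypothetical stand-in for DataFrame, so the example runs without Spark.
final case class Frame(label: String)

object MissingElseDemo {
  // No final `else`: the compiler supplies `else ()`, so the branches are
  // Frame, Frame and Unit, and the only common supertype is Any.
  def withoutElse(f: Frame, kind: String): Any =
    if (kind == "int") Frame(f.label + ":int")
    else if (kind == "string") Frame(f.label + ":string")

  // Final `else` returns the input unchanged, so every branch is a Frame.
  def withElse(f: Frame, kind: String): Frame =
    if (kind == "int") Frame(f.label + ":int")
    else if (kind == "string") Frame(f.label + ":string")
    else f

  def main(args: Array[String]): Unit = {
    val kinds = List("int", "date", "string")
    // foldLeft needs (Frame, String) => Frame; only withElse has that type.
    val result = kinds.foldLeft(Frame("df"))(withElse)
    println(result) // prints Frame(df:int:string)
  }
}
```

Passing `withoutElse` to the same `foldLeft` call would be rejected by the compiler with a type mismatch of the same kind as the one in the question.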

You can rewrite your code using pattern matching as shown below:

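  nonullDF.columns
    .foldLeft(nonullDF) { (memoDF, colName) =>
      // Every case must evaluate to a DataFrame, otherwise the type mismatch returns.
      memoDF.schema(colName).dataType match {
        case _: StringType  => memoDF // replace with the String transformation
        case _: IntegerType => memoDF // replace with the Integer transformation
        case _: DoubleType  => memoDF // replace with the Double transformation
        case _              => memoDF // leave columns of any other type unchanged
      }
    }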
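
For reference, here is one way the branches might be filled in. This is a sketch under stated assumptions, not a tested solution for the asker's data. `ReplacePlaceholders` and `replacePlaceholders` are hypothetical names. String columns get `"?"` replaced with the most frequent value, as in the question. An `IntegerType` or `DoubleType` column cannot contain the string `"?"`, so the sketch assumes missing numeric values are `null` and fills those with the column mean; that rule is an assumption, not something the question states. The sketch also passes the `colName` variable rather than the string literal `"colName"`, and assumes no column is itself named `count`.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{avg, col, desc, lit, when}
import org.apache.spark.sql.types.{DoubleType, IntegerType, StringType}

object ReplacePlaceholders {

  // Hypothetical helper; every branch of the match returns a DataFrame.
  def replacePlaceholders(df: DataFrame): DataFrame =
    df.columns.foldLeft(df) { (memoDF, colName) =>
      val dataType = memoDF.schema(colName).dataType
      dataType match {
        case _: StringType =>
          // Most frequent value of the column, if the DataFrame has any rows.
          val mostFrequent = memoDF
            .groupBy(col(colName))
            .count()
            .orderBy(desc("count"))
            .take(1)
            .headOption
          mostFrequent match {
            case Some(row) if !row.isNullAt(0) =>
              memoDF.withColumn(
                colName,
                when(col(colName) === "?", lit(row.getString(0)))
                  .otherwise(col(colName)))
            case _ => memoDF // empty DataFrame, or null is the most frequent value
          }

        case _: IntegerType | _: DoubleType =>
          // avg yields a Double here, or null when the column has no non-null values.
          val meanRow = memoDF.select(avg(col(colName))).head()
          if (meanRow.isNullAt(0)) memoDF
          else
            memoDF.withColumn(
              colName,
              // Assumption: missing numeric values are null. The cast restores
              // the original type, truncating the mean for Integer columns.
              when(col(colName).isNull, lit(meanRow.getDouble(0)))
                .otherwise(col(colName))
                .cast(dataType))

        case _ => memoDF // other column types pass through unchanged
      }
    }
}
```

Note that each numeric or String column triggers its own Spark job to compute the mean or the most frequent value, so this may be slow on wide DataFrames. If `"?"` is itself the most frequent value of a String column, the replacement leaves that column unchanged.
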
Mohana B C