Using Spark 2.0.2.6 and Scala 2.11.8, I keep running into the exception java.lang.IllegalArgumentException: spark.sql.execution.id is already set
when calling .collect() or .show() on a DataFrame from inside a Future. A minimal example:
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.{Await, Future}

def printTable(query: String): Unit = {
  try {
    spark.sql(query).show
  } catch {
    case e: Exception => println(e)
  }
}

Future {
  printTable("SELECT key1, key2 FROM schema.table1 LIMIT 1")
}
Future {
  printTable("SELECT key1, key2 FROM schema.table2 LIMIT 1")
}
Future {
  printTable("SELECT key1, key2 FROM schema.table3 LIMIT 1")
}
Also, the error is intermittent: sometimes the same code prints the correct result.
What's going on here and how do I fix it?
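One thing that may be worth trying, as a sketch and not a confirmed fix: the global ExecutionContext is backed by a ForkJoinPool, and a commonly reported explanation for this symptom on Spark 2.0.x is that a pool thread already running one query picks up another while the first is blocked, so the thread-local execution id is still set. Under that assumption, running the futures on a dedicated fixed thread pool might avoid the clash. In the sketch below, FixedPoolSketch, runAll, and the pool size are hypothetical names chosen for illustration, and printTable is a stand-in that only echoes the query so the snippet runs without a SparkSession; in the real code it would call spark.sql(query).show.

```scala
import java.util.concurrent.Executors
import scala.concurrent.duration._
import scala.concurrent.{Await, ExecutionContext, Future}

object FixedPoolSketch {
  // Hypothetical stand-in for the printTable above: it only echoes the query.
  // In the real code this would be spark.sql(query).show.
  def printTable(query: String): String = {
    val msg = s"ran: $query"
    println(msg)
    msg
  }

  // Runs each query in its own Future on a dedicated fixed thread pool
  // (plain threads, not the global ForkJoinPool) and waits for all of them.
  def runAll(queries: Seq[String], threads: Int = 3): Seq[String] = {
    val pool = Executors.newFixedThreadPool(threads)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      val futures = queries.map(q => Future(printTable(q)))
      // Future.sequence keeps the results in the same order as the input queries.
      Await.result(Future.sequence(futures), 1.minute)
    } finally {
      pool.shutdown() // release the pool threads once all futures are done
    }
  }
}
```

Calling FixedPoolSketch.runAll with the three SELECT statements above would take the place of the three bare Future blocks. Whether this actually removes the exception depends on the assumption about the ForkJoinPool being the cause, which I have not verified.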