I have 10 DataFrames with the same schema that I'd like to combine into one DataFrame. Each DataFrame is constructed using sqlContext.sql("select ... from ...").cache(), which means that technically the DataFrames are not actually computed until it's time to use them.
So, if I run:
val df_final = df1.unionAll(df2).unionAll(df3).unionAll(df4) ...
will Spark calculate all these DataFrames in parallel or one by one (due to the dot operator)?
And also, while we're here - is there a more elegant way to perform a unionAll on several DataFrames than the one I listed above?
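
To make the second question concrete, here is a minimal sketch of the kind of thing I have in mind, assuming the DataFrames can be put into a Seq first. The helper name unionAllOf is made up for illustration, and as far as I can tell this only builds the same chain of unionAll calls as above, so it should not by itself change when or how Spark computes the inputs.

```scala
import org.apache.spark.sql.DataFrame

// Hypothetical helper (not part of Spark): folds a sequence of
// same-schema DataFrames into one by applying unionAll pairwise.
// unionAll is a lazy transformation, so this only builds the plan.
def unionAllOf(dfs: Seq[DataFrame]): DataFrame = {
  require(dfs.nonEmpty, "need at least one DataFrame")
  dfs.reduce(_ unionAll _)
}

// Usage, assuming df1 ... df4 are defined as described above:
// val df_final = unionAllOf(Seq(df1, df2, df3, df4))
```

With all 10 DataFrames in one Seq, the call site stays a single line regardless of how many inputs there are.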