How do I merge 3 DataFrames in Spark-Scala? I have no idea how to do this, and I couldn't find a similar example on Stack Overflow.

I have 3 similar DataFrames. They have the same column names and the same number of columns. Only the row values differ.

DataFrame1:

+----+------+----+---+
|type| Model|Name|ID |
+----+------+----+---+
|  1 |wdasd |xyzd|111|
|  1 |wd    |zdfd|112|
|  1 |bdp   |2gfs|113|
+----+------+----+---+

DataFrame2:

+----+------+----+---+
|type| Model|Name|ID |
+----+------+----+---+
|  2 |wdasd |xyzd|221|
|  2 |wd    |zdfd|222|
|  2 |bdp   |2gfs|223|
+----+------+----+---+

DataFrame3:

+----+------+----+---+
|type| Model|Name|ID |
+----+------+----+---+
|  3 |AAAA  |N_AM|331|
|  3 |BBBB  |NA_M|332|
|  3 |CCCC  |MA_N|333|
+----+------+----+---+

I want to combine them into a single DataFrame like this:

MergeDataFrame:

+----+------+----+---+
|type| Model|Name|ID |
+----+------+----+---+
|  1 |wdasd |xyzd|111|
|  1 |wd    |zdfd|112|
|  1 |bdp   |2gfs|113|
|  2 |wdasd |xyzd|221|
|  2 |wd    |zdfd|222|
|  2 |bdp   |2gfs|223|
|  3 |AAAA  |N_AM|331|
|  3 |BBBB  |NA_M|332|
|  3 |CCCC  |MA_N|333|
+----+------+----+---+
vindev
Svs

1 Answer

Spark provides both union and unionAll. unionAll was deprecated in Spark 2.0 in favor of union, so I would use union as below:

dataFrame1.union(dataFrame2).union(dataFrame3)

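Note that union matches columns by position, not by name, so the data frames must have the same number of columns, with compatible types, in the same order. If the column names match but the order differs, unionByName (available since Spark 2.3) matches columns by name instead.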
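For illustration, here is a minimal sketch of the whole thing, using the sample rows from the question. The object name MergeDataFrames, the helper mergeAll, and the local[*] master are assumptions made only for this example; they are not part of the original question or of Spark itself.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object MergeDataFrames {

  // Hypothetical helper: unions any non-empty sequence of DataFrames.
  // Columns are matched by position, so all inputs must share the same layout.
  def mergeAll(frames: Seq[DataFrame]): DataFrame =
    frames.reduce(_ union _)

  def main(args: Array[String]): Unit = {
    // Local session for experimenting; on a cluster the master is set elsewhere.
    val spark = SparkSession.builder()
      .appName("merge-dataframes")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val cols = Seq("type", "Model", "Name", "ID")

    // Sample data copied from the question.
    val dataFrame1 = Seq((1, "wdasd", "xyzd", 111), (1, "wd", "zdfd", 112), (1, "bdp", "2gfs", 113)).toDF(cols: _*)
    val dataFrame2 = Seq((2, "wdasd", "xyzd", 221), (2, "wd", "zdfd", 222), (2, "bdp", "2gfs", 223)).toDF(cols: _*)
    val dataFrame3 = Seq((3, "AAAA", "N_AM", 331), (3, "BBBB", "NA_M", 332), (3, "CCCC", "MA_N", 333)).toDF(cols: _*)

    // Equivalent to dataFrame1.union(dataFrame2).union(dataFrame3).
    val merged = mergeAll(Seq(dataFrame1, dataFrame2, dataFrame3))
    merged.show()

    spark.stop()
  }
}
```

The reduce form is handy when the number of DataFrames is not fixed. Like SQL's UNION ALL, union keeps duplicate rows; call distinct() on the result if you need them removed.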

See the spark docs here

Steve Robinson