I have two tab-separated data files like the ones below:
file 1:
number type data_present
1 a yes
2 b no
file 2:
type group number recorded
d aa 10 true
c cc 20 false
I want to merge these two files so that the output file looks like this:
number type data_present group recorded
1 a yes NULL NULL
2 b no NULL NULL
10 d NULL aa true
20 c NULL cc false
As you can see, where a column is not present in the other file, I'm filling its place with NULL.
Any ideas on how to do this in Scala/Spark?
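One possible approach, as a minimal sketch rather than a definitive answer: if Spark 3.1 or later is available, `unionByName` with `allowMissingColumns = true` aligns the two DataFrames by column name and fills columns missing on either side with null. The file paths (`file1.tsv`, `file2.tsv`, `merged_out`), the object name `MergeTsv`, and the helper `readTsv` are assumptions made up for illustration. All columns are read as strings here, so the two shared columns (`number`, `type`) have matching types.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object MergeTsv {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("merge-tsv")
      .master("local[*]") // assumption: local run; drop this when submitting to a cluster
      .getOrCreate()

    // Hypothetical helper: read a tab-separated file with a header row.
    // Without inferSchema, every column is read as a string.
    def readTsv(path: String): DataFrame =
      spark.read
        .option("header", "true")
        .option("sep", "\t")
        .csv(path)

    val df1 = readTsv("file1.tsv") // number, type, data_present
    val df2 = readTsv("file2.tsv") // type, group, number, recorded

    // Align by column name; columns absent from one side become null.
    // Requires Spark 3.1+ for the allowMissingColumns parameter.
    val merged = df1.unionByName(df2, allowMissingColumns = true)

    // Write nulls out as the literal text NULL.
    merged.write
      .option("header", "true")
      .option("sep", "\t")
      .option("nullValue", "NULL")
      .csv("merged_out")

    spark.stop()
  }
}
```

The resulting column order should follow the first DataFrame, with the extra columns of the second appended (number, type, data_present, group, recorded). Note that Spark writes `merged_out` as a directory of part files rather than a single file. On Spark versions before 3.1, one alternative is to add the missing columns to each side by hand with `withColumn(name, lit(null))` and then call the plain `unionByName`.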