I am trying to create a third table from two existing tables using Spark SQL or PySpark (without using pandas).
Dataframe One:
+---------+---------+------------+-----------+
| NAME | NAME_ID | CLIENT | CLIENT_ID |
+---------+---------+------------+-----------+
| RISHABH | 1 | SINGH | 5 |
| RISHABH | 1 | PATHAK | 3 |
| RISHABH | 1 | KUMAR | 2 |
| KEDAR | 2 | PATHAK | 3 |
| KEDAR | 2 | JADHAV | 1 |
| ANKIT | 3 | SRIVASTAVA | 6 |
| ANKIT | 3 | KUMAR | 2 |
| SUMIT | 4 | SINGH | 5 |
| SUMIT | 4 | SHARMA | 4 |
+---------+---------+------------+-----------+
Dataframe Two:
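+---------+---------+------------+-----------+
| NAME | NAME_ID | CLIENT | CLIENT_ID |
+---------+---------+------------+-----------+
| RISHABH | _____ | SRIVASTAVA | _____ |
| KEDAR | _____ | KUMAR | _____ |
| RISHABH | _____ | SINGH | _____ |
| KEDAR | _____ | PATHAK | _____ |
+---------+---------+------------+-----------+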
Required Dataframe Output:
+---------+---------+------------+-----------+
| NAME | NAME_ID | CLIENT | CLIENT_ID |
+---------+---------+------------+-----------+
| RISHABH | 1 | SRIVASTAVA | 6 |
| KEDAR | 2 | KUMAR | 2 |
| RISHABH | 1 | SINGH | 5 |
| KEDAR | 2 | PATHAK | 3 |
+---------+---------+------------+-----------+
I would like to do this with Spark SQL or the Spark DataFrame API.
I tried df1.join(df2, df1.NAME == df2.NAME, "left"), but it does not give the required output.