I'm new to PySpark and need to compare two files based on Col1 alone, then populate a new column at the end of File1 depending on whether each row's Col1 value also appears in File2.

New column values: 1 = matching record, 0 = unmatched record

File1:

Col1 Col2 ... ColN
1 abc ... Xxxx
2 abc ... Xxxx
3 abc ... Xxxx

File2:

Col1 Col2 ... ColN
1 abc ... Xxxx
2 abc ... Xxxx

Expected output:

Col1 Col2 ... ColN Newcol
1 abc ... Xxxx 1
2 abc ... Xxxx 1
3 abc ... Xxxx 0
Naji
  • Learn how `join` works on dataframes. This is really basic question. – ZygD Aug 05 '22 at 13:54
  • Does this answer your question? [Check if values of column pyspark df exist in other column pyspark df](https://stackoverflow.com/questions/65031917/check-if-values-of-column-pyspark-df-exist-in-other-column-pyspark-df) – ZygD Aug 05 '22 at 13:54
