I have two tables that I'm working with in PySpark.
File 1:
Schema: CustomerName:STRING, DOB:STRING, UIN:STRING, MailID:STRING, PhoneNumber:LONG, City:STRING, State:STRING, LivingStatus:STRING, PinCode:STRING, LoanAmount:LONG
Sample data:
Sakshi, 22-03-86, UIN0043, Sakshi@mail.com, 3344990876, Ahmedabad, Gujarat, BPL, 380001, 23000
Shivani, 22-02-83, UIN0044, Shivani@mail.com, 3344990876, Thiruvananthpuram, Kerala, APL, 695001, 24500
File 2:
Schema: CustomerName:STRING, DOB:STRING, UIN:STRING, City:STRING, State:STRING, PinCode:LONG, CibilScore:LONG, DefaulterFlag:STRING
Sample data:
Shubham, 23-08-86, UIN0007, Thiruvananthpuram, Kerala, 695001, 3530, N
Anushka, 25-08-82, UIN0008, Thiruvananthpuram, Kerala, 695001, 1530, Y
I need to derive a status of "Approved" when the client is not a defaulter (DefaulterFlag = 'N') and the credit score is more than 800, using both the PySpark DataFrame API ("core") and Spark SQL.
I'm new to this and tried solving it with the DataFrame API first, but I was getting wrong results. When I load the same datasets into a MySQL database and solve it with SQL, I get the correct output; I just can't reproduce that result with PySpark core.