I have a dataset that I've created from merging 2 df's together on the "NAME" column and now I have a larger dataset. To finish the DF, I want to perform some logic to it to clean it up.
Requirements: I want to select the unique 'NAME' but I want to match the name with the highest Sales row, and if after going though the Sales column, all rows are less than 10, then move to the Calls column and select highest the row with the highest Call, and if all calls in the 'CALLS' are less than 10 then move to the Target Column select the highest Target. No rows are summed.
Here's my DF:
NAME CUSTOMER_SUPPLIER_NUMBER Sales Calls Target
0 OFFICE 1 2222277 84 170 265
1 OFFICE 1 2222278 26 103 287
2 OFFICE 1 2222278 97 167 288
3 OFFICE 2 2222289 7 167 288
4 OFFICE 2 2222289 3 130 295
5 OFFICE 2 2222289 9 195 257
6 OFFICE 3 1111111 1 2 286
7 OFFICE 3 1111111 5 2 287
8 OFFICE 3 1111112 9 7 230
9 OFFICE 4 1111171 95 193 299
10 OFFICE 5 1111191 9 193 298
Here's what I want to show in the final DF:
NAME CUSTOMER_SUPPLIER_NUMBER Sales Calls Target
0 OFFICE 1 2222277 97 167 288
5 OFFICE 2 2222289 9 195 257
7 OFFICE 3 1111111 5 2 287
9 OFFICE 4 1111171 95 193 299
10 OFFICE 5 1111191 9 193 298
I was thinking of solving this by using df.itterows()
Here's what I've tried:
for n, v in df.iterrows():
if int(v['Sales']) > 10:
calls = df.loc[(v['NAME'] == v) & (int(v['Calls'].max()))]
if int(calls['Calls']) > 10:
target = df.loc[(v['NAME'] == v) & (int(v['Target'].max()))]
else:
print("No match found")
else:
sales = df.loc[(v['NAME'] == v) & (int(v['Sales'].max())]
However, I keep getting KeyError: False
error messages. Any thoughts on what I'm doing wrong?