I have a dataframe with different columns like :
1) cust mobile phone no
2) cust home phone
3) cust nextkin phone
4) cust fax
5) cust id
In my output dataframe I would like to have columns like :
1) cust id
2) cust phone 1
3) cust phone 2
4) cust phone 3
5) cust phone 4
The mapping between input and output phone numbers is as follows (but there is also a priority logic):
cust phone 1 = cust mobile phone no
cust phone 2 = cust home phone
cust phone 3 = cust nextkin phone
cust phone 4 = cust fax
Note that any of these could be blank in input dataframe. The priority logic says that if one is blank , then next available phone number should assigned to that phone column. So if cust phone 2 is blank but cust phone 3 is available, then cust phone 2 should be assigned the value and so on. Also, cust phone 1 to cust phone 4 should all be unique (no duplicates).
Since the dataframe is large, iterating through rows is not an option.
Here is a sample data frame:
df = pd.DataFrame({'cust mobile no': ['1', '2', '3'],
'cust home phone': [np.nan, '2', 'x'],
'cust nextkin phone': ['1', '2', 'g'],
'cust fax': [np.nan, '4', '5'],
'cust id': ['001', '002', '003']})
cust mobile no cust home phone cust nextkin phone cust fax cust id
0 1 NaN 1 NaN 001
1 2 2 2 4 002
2 3 x g 5 003
Expected output :
cust id cust phone 1 cust phone 2 cust phone 3 cust phone 4
0 001 1 NaN NaN NaN
1 002 2 4 NaN NaN
2 003 3 x g 5