I have the following dataset:
Name Address Bank_Account Ph_NO IP_Address Chargeoff
AJ 12 ABC Street 1234 369 12.12.34 0
CK 12 ABC Street 1234 450 12.12.34 1
DN 15 JMP Street 3431 569 13.8.09 1
MO 39 link street 8421 450 05.67.89 1
LN 12 ABC Street 1234 340 14.75.06 1
ST 15 JMP Street 8421 569 13.8.09 0`
Using this dataset I want to create the below view in SAS:
Name CountOFAddr CountBankacct CountofPhone CountOfIP CountCharegeoff
AJ 3 3 1 2 2
CK 3 3 2 2 3
DN 2 1 2 2 1
MO 1 2 2 1 2
LN 3 3 1 1 2
ST 2 2 2 2 2
The output variables indicates as follows :
-CountOfAddr : For AJ countOFAddr is 3 which means that AJ Shares its address with itself, CK and LN
-CountBankAcct : For MO count of BankAcct is 2 which means that MO Shares its bank account number with itself and ST.Similarly for variables CountofPhone and CountOfIP.
-CountChargeoff: This one is a little tricky it basically implies that AJ is Linked to CK And LN through address...and both CK and LN have been charged off so the countChargeoff for AJ is 2.
For CK
the countChargeOff
is 3
because it is linked with itself, MO
through Bank Account, and LN/AJ
through street address...so total chargeoff
in CK's
Network is 3
(CO count of AJ+CO
count of CK+CO
Count of MO+CO
count of LN
)
I currently work as a Risk Analyst in a Financial Service Firm and the code for this problem may help us to significantly reduce funding of fraudulent accounts.
Thanks.