I’m trying to write a function that reads in a large base table (example below) and checks whether any of the unique entities (id) can be linked through any of 15+ attributes (bank account, phone number, email, zip code, etc.). Exact matches only; no fuzzy matching is required this time.
df <- data.frame( id = c('01','02','03','04','05','06','07','08','09','10'),
bank_acc=c('66201','66202','66203','66204','66205','66205','66205','66206','66207','66208'),
phone_num=c('10151','10150','10152','10150','10153','10150','10154','10155','10156','10157'))
I need the output in edge-list format (example below) so I can feed it into igraph. I plan to use the “Method” column to colour-code the edges. Thanks in advance.
ID Linked_ID Method
05 06 bank_acc
05 07 bank_acc
06 07 bank_acc
02 04 phone_num
02 06 phone_num
04 06 phone_num
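For reference, here is a minimal base-R sketch of one way the edge list could be built. The helper name `build_edgelist` and its arguments `id_col` and `attr_cols` are made up for illustration, and it assumes every column other than `id` is a linking attribute and that `NA` or empty values should never link two entities. For each attribute it groups the ids by value, keeps groups with more than one id, and expands each group into all of its pairs with `combn()`. On the example data this also returns the pair 04/06 for phone_num, since 02, 04 and 06 all share 10150.

```r
# Example table from the question
df <- data.frame(
  id        = c('01','02','03','04','05','06','07','08','09','10'),
  bank_acc  = c('66201','66202','66203','66204','66205','66205','66205','66206','66207','66208'),
  phone_num = c('10151','10150','10152','10150','10153','10150','10154','10155','10156','10157'),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not from any package): returns one row per pair of ids
# that share an exact value in one of the attribute columns.
build_edgelist <- function(data, id_col = "id",
                           attr_cols = setdiff(names(data), id_col)) {
  empty <- data.frame(ID = character(0), Linked_ID = character(0),
                      Method = character(0), stringsAsFactors = FALSE)

  per_attr <- lapply(attr_cols, function(col) {
    # Assumption: missing or empty values are not treated as a match
    keep <- !is.na(data[[col]]) & data[[col]] != ""

    # Group ids by attribute value, keep only values shared by 2+ ids
    groups <- split(as.character(data[[id_col]][keep]), data[[col]][keep])
    groups <- lapply(groups, function(g) sort(unique(g)))
    groups <- unname(groups[lengths(groups) > 1])
    if (length(groups) == 0) return(empty)

    # Expand every group into all of its id pairs (one pair per row)
    pairs <- do.call(rbind, lapply(groups, function(ids) t(combn(ids, 2))))
    data.frame(ID = pairs[, 1], Linked_ID = pairs[, 2], Method = col,
               stringsAsFactors = FALSE)
  })

  out <- do.call(rbind, c(list(empty), per_attr))
  rownames(out) <- NULL
  out
}

edges <- build_edgelist(df)
print(edges)
```

The result should be usable with `igraph::graph_from_data_frame(edges, directed = FALSE)`, which keeps extra columns such as `Method` as edge attributes. One caveat for a large table: a value shared by n ids produces n * (n - 1) / 2 rows, so very common values (a shared zip code, say) can make the edge list grow quickly.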