I am new to R and want to compare two strings (addresses), where:
Word order can differ, except for numbers (consecutive numbers must appear in the same order).
Words are sometimes abbreviated, e.g. "Street" could be "St." and "North West" could be "North W."
One string could contain one or two extra words (the rest of the words would be the same).
One of the strings sometimes has a space inside a word, e.g. Pitampura -> Pitam pura.
For example:
S1 = QU 23/24 Shalimar Bagh, Pitampura, Street no. 22, delhi
S2 = QU Flat 23/24 Pitam Pura, St. No. 22, Shalimar Bagh, Delhi
So far, I have removed the special characters, extra whitespace, and redundant words from the addresses.
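A minimal sketch of that kind of cleanup, using only base R. The helper name `normalize_address` and the `drop` word list are hypothetical and only illustrate the idea; they are not the exact code used.

```r
# Hypothetical helper: lower-case the address, replace punctuation with spaces,
# split into tokens, and drop an illustrative list of redundant words.
normalize_address <- function(x, drop = c("no", "flat")) {
  x <- tolower(x)
  x <- gsub("[^a-z0-9/ ]", " ", x)   # keep letters, digits, "/" and spaces
  x <- gsub("^ +| +$", "", x)        # trim leading and trailing spaces
  tokens <- strsplit(x, " +")[[1]]   # split on runs of spaces
  tokens[!tokens %in% drop]          # remove the assumed redundant words
}

normalize_address("QU 23/24 Shalimar Bagh, Pitampura, Street no. 22, delhi")
```

Keeping "/" means "23/24" survives as a single token, so the order of consecutive numbers is preserved.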
Would a distance measure such as cosine or Levenshtein distance be a good choice? If so, how can I apply it in R without using any package?
I don't have the liberty to install any external packages.
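To make the question concrete, here is a rough sketch of the kind of order-insensitive comparison meant, assuming that `adist()` from the `utils` package (which ships with every R installation) does not count as an external package. The helper name `token_match_score` and the `max_dist` threshold are hypothetical, and the token vectors are the two example addresses after cleaning.

```r
# Hypothetical helper: share of tokens in `a` that have a close match in `b`,
# regardless of word order. utils::adist() returns a matrix of Levenshtein
# distances between every element of `a` and every element of `b`.
token_match_score <- function(a, b, max_dist = 1) {
  d <- utils::adist(a, b)              # rows: tokens of a, columns: tokens of b
  mean(apply(d, 1, min) <= max_dist)   # fraction of a's tokens matched within max_dist
}

s1 <- c("qu", "23/24", "shalimar", "bagh", "pitampura", "street", "22", "delhi")
s2 <- c("qu", "23/24", "pitam", "pura", "st", "22", "shalimar", "bagh", "delhi")

token_match_score(s1, s2)
```

This sketch does not handle the abbreviation case ("street" vs "st") or the split-word case ("pitampura" vs "pitam" "pura"); both count as unmatched tokens here, so they would need extra handling on top of the plain edit distance.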
Thanks in advance.