To measure the similarity between two words or sentences, you may want to use something like edit distance or Jaccard distance.
Let's try it on your case using edit distance:
firstName = ['Ali', 'Hamada', '3ly', '7amada', 'Sophia', 'Sofiya', 'Matthieu', 'Matthieu', 'Mathew']
# No need to implement the distance function, you can call it from NLTK
import nltk
# Find similar first names using edit distance
for name in firstName:
    # Compare against every other name (this also drops exact duplicates of `name`)
    nameToCompare = [x for x in firstName if x != name]
    for n in nameToCompare:
        print(name, n, nltk.edit_distance(name, n))
    print('***************')
# Ali Hamada 6
# Ali 3ly 2
# Ali 7amada 6
# Ali Sophia 5
# Ali Sofiya 5
# Ali Matthieu 7
# Ali Matthieu 7
# Ali Mathew 6
#***************
# Hamada Ali 6
# Hamada 3ly 6
# Hamada 7amada 1
# Hamada Sophia 5
# Hamada Sofiya 5
# Hamada Matthieu 7
# Hamada Matthieu 7
# Hamada Mathew 5
#***************
# 3ly Ali 2
# 3ly Hamada 6
# 3ly 7amada 6
# 3ly Sophia 6
# 3ly Sofiya 5
# 3ly Matthieu 8
# 3ly Matthieu 8
# 3ly Mathew 6
#***************
# 7amada Ali 6
# 7amada Hamada 1
# 7amada 3ly 6
# 7amada Sophia 5
# 7amada Sofiya 5
# 7amada Matthieu 7
# 7amada Matthieu 7
# 7amada Mathew 5
#***************
# Sophia Ali 5
# Sophia Hamada 5
# Sophia 3ly 6
# Sophia 7amada 5
# Sophia Sofiya 3
# Sophia Matthieu 6
# Sophia Matthieu 6
# Sophia Mathew 5
#***************
# Sofiya Ali 5
# Sofiya Hamada 5
# Sofiya 3ly 5
# Sofiya 7amada 5
# Sofiya Sophia 3
# Sofiya Matthieu 7
# Sofiya Matthieu 7
# Sofiya Mathew 6
#***************
# Matthieu Ali 7
# Matthieu Hamada 7
# Matthieu 3ly 8
# Matthieu 7amada 7
# Matthieu Sophia 6
# Matthieu Sofiya 7
# Matthieu Mathew 3
#***************
# Matthieu Ali 7
# Matthieu Hamada 7
# Matthieu 3ly 8
# Matthieu 7amada 7
# Matthieu Sophia 6
# Matthieu Sofiya 7
# Matthieu Mathew 3
#***************
# Mathew Ali 6
# Mathew Hamada 5
# Mathew 3ly 6
# Mathew 7amada 5
# Mathew Sophia 5
# Mathew Sofiya 6
# Mathew Matthieu 3
# Mathew Matthieu 3
#***************
Smaller numbers mean the names are more similar. Notice that for each name the lowest distance goes to its alternative spelling (for example, Hamada/7amada at 1 and Ali/3ly at 2).
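If you want to see what an edit distance computation does, or pick the best match automatically, here is a minimal pure-Python sketch. It assumes unit costs for insertions, deletions, and substitutions; `levenshtein` and `closest_match` are hypothetical helper names of mine, not NLTK functions, and the duplicate 'Matthieu' is dropped from the list for brevity.

```python
def levenshtein(a, b):
    """Edit distance with unit cost for insert, delete, and substitute."""
    # prev[j] is the distance between the prefix of a seen so far and b[:j]
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def closest_match(name, candidates):
    """Return the candidate, other than name itself, with the smallest distance."""
    others = [c for c in candidates if c != name]
    return min(others, key=lambda c: levenshtein(name, c))


first_names = ['Ali', 'Hamada', '3ly', '7amada', 'Sophia', 'Sofiya', 'Matthieu', 'Mathew']
for name in first_names:
    print(name, '->', closest_match(name, first_names))
```

For this list, each name should be paired with its spelling variant, which agrees with the distances printed above.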
Now let's apply Jaccard distance:
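for name in firstName:
    nameToCompare = [x for x in firstName if x != name]
    for n in nameToCompare:
        # Compare the sets of characters, then turn the distance into a
        # similarity percentage: higher means more similar
        print(name, n, (1-nltk.jaccard_distance(set(name), set(n)))*100)
    print('***************')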
# Ali Hamada 0.0
# Ali 3ly 19.999999999999996
# Ali 7amada 0.0
# Ali Sophia 12.5
# Ali Sofiya 12.5
# Ali Matthieu 11.111111111111116
# Ali Matthieu 11.111111111111116
# Ali Mathew 0.0
#***************
# Hamada Ali 0.0
# Hamada 3ly 0.0
# Hamada 7amada 60.0
# Hamada Sophia 11.111111111111116
# Hamada Sofiya 11.111111111111116
# Hamada Matthieu 9.999999999999998
# Hamada Matthieu 9.999999999999998
# Hamada Mathew 11.111111111111116
#***************
# 3ly Ali 19.999999999999996
# 3ly Hamada 0.0
# 3ly 7amada 0.0
# 3ly Sophia 0.0
# 3ly Sofiya 12.5
# 3ly Matthieu 0.0
# 3ly Matthieu 0.0
# 3ly Mathew 0.0
#***************
# 7amada Ali 0.0
# 7amada Hamada 60.0
# 7amada 3ly 0.0
# 7amada Sophia 11.111111111111116
# 7amada Sofiya 11.111111111111116
# 7amada Matthieu 9.999999999999998
# 7amada Matthieu 9.999999999999998
# 7amada Mathew 11.111111111111116
#***************
# Sophia Ali 12.5
# Sophia Hamada 11.111111111111116
# Sophia 3ly 0.0
# Sophia 7amada 11.111111111111116
# Sophia Sofiya 50.0
# Sophia Matthieu 30.000000000000004
# Sophia Matthieu 30.000000000000004
# Sophia Mathew 19.999999999999996
#***************
# Sofiya Ali 12.5
# Sofiya Hamada 11.111111111111116
# Sofiya 3ly 12.5
# Sofiya 7amada 11.111111111111116
# Sofiya Sophia 50.0
# Sofiya Matthieu 18.181818181818176
# Sofiya Matthieu 18.181818181818176
# Sofiya Mathew 9.090909090909093
#***************
# Matthieu Ali 11.111111111111116
# Matthieu Hamada 9.999999999999998
# Matthieu 3ly 0.0
# Matthieu 7amada 9.999999999999998
# Matthieu Sophia 30.000000000000004
# Matthieu Sofiya 18.181818181818176
# Matthieu Mathew 62.5
#***************
# Matthieu Ali 11.111111111111116
# Matthieu Hamada 9.999999999999998
# Matthieu 3ly 0.0
# Matthieu 7amada 9.999999999999998
# Matthieu Sophia 30.000000000000004
# Matthieu Sofiya 18.181818181818176
# Matthieu Mathew 62.5
#***************
# Mathew Ali 0.0
# Mathew Hamada 11.111111111111116
# Mathew 3ly 0.0
# Mathew 7amada 11.111111111111116
# Mathew Sophia 19.999999999999996
# Mathew Sofiya 9.090909090909093
# Mathew Matthieu 62.5
# Mathew Matthieu 62.5
#***************
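These results are good too. Here the code prints a similarity percentage, (1 - distance) * 100, so higher means more similar, and for every name the highest score again goes to its spelling variant (for example, Hamada/7amada at 60.0 and Matthieu/Mathew at 62.5).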
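To make the Jaccard numbers less opaque, here is a small sketch of the same calculation without NLTK: the size of the intersection of the two character sets divided by the size of their union. `jaccard_similarity` is a hypothetical helper name, and treating two empty strings as identical is my own assumption.

```python
def jaccard_similarity(a, b):
    """Share of distinct characters that the two strings have in common."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0  # assumption: two empty strings count as identical
    return len(set_a & set_b) / len(union)


# 'Hamada' -> {H, a, m, d}, '7amada' -> {7, a, m, d}
# 3 shared characters out of 5 distinct ones
print(round(jaccard_similarity('Hamada', '7amada') * 100, 1))
```

Because only sets of characters are compared, letter order and repeated letters are ignored, so two anagrams would score 100. Keep that in mind if order matters for your names.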
Hope this helps.