I am attempting to validate the R function stringdist from the stringdist library.
Using the example
1 - stringdist('John J Smith', 'John Smith', method = 'jw', p = 0)
it returns 0.9444444.
Here p = 0 means that the Winkler component of Jaro-Winkler is not used, so the result is the plain Jaro similarity.
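For reference, here is a sketch of the formula as it is usually stated. The notation (m, t, \ell, p, s_1, s_2) is assumed from the Wikipedia article and is not taken from the stringdist source:

```latex
% Jaro similarity
% m = number of matching characters
% t = number of transpositions (half the number of matched characters that are out of order)
\[
\mathrm{sim}_j =
\begin{cases}
0 & \text{if } m = 0 \\[4pt]
\dfrac{1}{3}\left(\dfrac{m}{|s_1|} + \dfrac{m}{|s_2|} + \dfrac{m - t}{m}\right) & \text{otherwise}
\end{cases}
\]

% Jaro-Winkler similarity
% \ell = length of the common prefix (at most 4), p = prefix scaling factor
\[
\mathrm{sim}_w = \mathrm{sim}_j + \ell \, p \, (1 - \mathrm{sim}_j)
\]
```

With p = 0 the second expression reduces to sim_w = sim_j, so only m and t affect the value for this example.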
I am attempting to reproduce this result using the formula shown on Wikipedia and in this documentation; however, I can't get my manual calculation to agree.
In my example below, there are 5 half transpositions, so floor(5/2) gives t = 2.
There are 10 matching characters, where I have made sure that the distance between matched characters is no greater than 5 (the match window, floor(max(12, 10)/2) - 1 = 5).
The resulting calculation is: