I have been trying to use the SimMetrics library, pulled in with this Maven dependency:
<dependency>
<groupId>com.github.mpkorstanje</groupId>
<artifactId>simmetrics-core</artifactId>
<version>4.1.0</version>
</dependency>
So far I am computing Jaro-Winkler with:
StringMetric sm = StringMetrics.jaroWinkler();
float res = sm.compare("Harry Potter", "Potter Harry");
System.out.println(res);
0.43055558
and the overlap coefficient with:
sm = StringMetrics.overlapCoefficient();
res = sm.compare("The quick brown fox", "The slow brawn fur");
System.out.println(res);
0.25
However, according to https://asecuritysite.com/forensics/simstring, the Jaro-Winkler score should be 0 for these inputs and the overlap coefficient should be 100.

Is this the correct way to use the library? What are the proper calls if I want to run both metrics to match movies from one list against another list I got from IMDb? I intend to compare the titles from both sets and average the two scores, then do the same for the cast of each movie. Thanks
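
To make the intended comparison concrete, here is a minimal plain-Java sketch of the "average several scores, then pick the best match" idea. It does not use the library: the class `TitleMatchSketch` and the helpers `tokens`, `averageScore` and `bestMatch` are hypothetical names, and `overlapCoefficient` is a hand-written stand-in that assumes whitespace tokenization and case-sensitive tokens. Under that assumption it gives 0.25 for the fox example, the same value printed above. With the library, a metric could presumably be plugged in as `(a, b) -> sm.compare(a, b)`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.ToDoubleBiFunction;

public class TitleMatchSketch {

    // Distinct whitespace-separated tokens of s (case-sensitive); empty tokens are dropped.
    static Set<String> tokens(String s) {
        Set<String> out = new HashSet<>();
        for (String t : s.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // Overlap coefficient: |A intersect B| / min(|A|, |B|).
    // Assumption: defined as 0 when either string has no tokens.
    static double overlapCoefficient(String a, String b) {
        Set<String> ta = tokens(a);
        Set<String> tb = tokens(b);
        int smaller = Math.min(ta.size(), tb.size());
        if (smaller == 0) {
            return 0.0;
        }
        ta.retainAll(tb); // ta now holds only the shared tokens
        return (double) ta.size() / smaller;
    }

    // Arithmetic mean of the scores the given metrics assign to (a, b).
    static double averageScore(String a, String b,
                               List<ToDoubleBiFunction<String, String>> metrics) {
        double sum = 0.0;
        for (ToDoubleBiFunction<String, String> m : metrics) {
            sum += m.applyAsDouble(a, b);
        }
        return sum / metrics.size();
    }

    // Candidate with the highest averaged score; the first one wins ties.
    static String bestMatch(String title, List<String> candidates,
                            List<ToDoubleBiFunction<String, String>> metrics) {
        String best = null;
        double bestScore = -1.0;
        for (String c : candidates) {
            double score = averageScore(title, c, metrics);
            if (score > bestScore) {
                bestScore = score;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<ToDoubleBiFunction<String, String>> metrics = new ArrayList<>();
        metrics.add(TitleMatchSketch::overlapCoefficient);
        // Second metric for illustration only: the same measure, ignoring case.
        metrics.add((a, b) -> overlapCoefficient(a.toLowerCase(), b.toLowerCase()));

        System.out.println(overlapCoefficient("The quick brown fox", "The slow brawn fur")); // 0.25
        System.out.println(overlapCoefficient("Harry Potter", "Potter Harry"));              // 1.0

        List<String> imdbTitles = Arrays.asList("The Hobbit", "Potter Harry");
        System.out.println(bestMatch("Harry Potter", imdbTitles, metrics)); // Potter Harry
    }
}
```

The same `averageScore` call could be repeated on the cast strings of each movie, and the title and cast averages combined however the matching needs.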