
I am having a problem implementing search in my app. I am using the Jaro-Winkler algorithm to compare the search text against every string in a list, using this Dart file:

import 'dart:math';

double jaroWinklerSimilarity(String s1, String s2) {
  // Preprocess strings by removing spaces and converting to lowercase
  final processedS1 = s1.replaceAll(' ', '').toLowerCase();
  final processedS2 = s2.replaceAll(' ', '').toLowerCase();

  final jaroScore = jaroSimilarity(processedS1, processedS2);
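  // Winkler's prefix bonus considers at most the first 4 characters;
  // without the cap, long shared prefixes can push the score above 1.0.
  final commonPrefixLength =
      min(4, _commonPrefixLength(processedS1, processedS2));

  // Adjust for common prefix (scaling factor 0.1)
  final jaroWinklerScore =
      jaroScore + (commonPrefixLength * 0.1 * (1 - jaroScore));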

  return jaroWinklerScore;
}

double jaroSimilarity(String s1, String s2) {
  final m = s1.length;
  final n = s2.length;

  if (m == 0 && n == 0) {
    return 1.0; // Both strings are empty, so they're completely similar
  }

  // Clamp at 0 so single-character strings still get a valid match window
  final matchDistance = max(0, (max(m, n) ~/ 2) - 1);
  final matches1 = List<bool>.filled(m, false);
  final matches2 = List<bool>.filled(n, false);
  var matchesCount = 0;

  for (var i = 0; i < m; i++) {
    final start = max(0, i - matchDistance);
    final end = min(n, i + matchDistance + 1);

    for (var j = start; j < end; j++) {
      if (!matches2[j] && s1[i] == s2[j]) {
        matches1[i] = true;
        matches2[j] = true;
        matchesCount++;
        break;
      }
    }
  }

  if (matchesCount == 0) {
    return 0.0; // No matches, similarity is 0
  }

  var transpositions = 0;
  var k = 0;

  for (var i = 0; i < m; i++) {
    if (matches1[i]) {
      while (!matches2[k]) {
        k++;
      }
      if (s1[i] != s2[k]) {
        transpositions++;
      }
      k++;
    }
  }

  return ((matchesCount / m) +
          (matchesCount / n) +
          ((matchesCount - transpositions / 2) / matchesCount)) /
      3;
}

int _commonPrefixLength(String s1, String s2) {
  final minLength = min(s1.length, s2.length);

  for (var i = 0; i < minLength; i++) {
    if (s1[i] != s2[i]) {
      return i;
    }
  }

  return minLength;
}

For example, if I search for "Yacop" and the list contains "Omar Yacop", it doesn't return a high similarity score. How can I solve this issue? PS: I am setting my threshold to 0.67.

EDIT: The only thing I can think of is to fall back to a simple .contains() check when no similar strings are found, but I need a better and more accurate solution.
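
The fallback described in the EDIT could be sketched roughly as follows. The name `searchWithFallback` and the `similarity` parameter are hypothetical and not part of the file above; any scorer with the same signature, such as `jaroWinklerSimilarity`, could be passed in.

```dart
/// Hypothetical helper: returns the items whose score reaches [threshold];
/// if none do, falls back to a case-insensitive substring check.
List<String> searchWithFallback(
  String query,
  List<String> items,
  double Function(String, String) similarity, {
  double threshold = 0.67,
}) {
  // First pass: fuzzy matching with the supplied similarity function.
  final fuzzy =
      items.where((item) => similarity(query, item) >= threshold).toList();
  if (fuzzy.isNotEmpty) return fuzzy;

  // Fallback: plain substring match, ignoring case.
  final needle = query.toLowerCase();
  return items.where((item) => item.toLowerCase().contains(needle)).toList();
}
```

A call such as `searchWithFallback('Yacop', names, jaroWinklerSimilarity)` would only reach the substring pass when nothing scores above the threshold, which is why it feels like a workaround rather than a real fix.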

Omar Yacop
  • When using the `Flutter` framework, such algorithms are not used. Why did you add the `Flutter` tag? I would be ashamed if I asked a serious question, in anticipation of a serious answer, and in doing so added an irrelevant tag. What did you expect when you added this tag? – mezoni Aug 21 '23 at 18:29
  • Jaro-Winkler specifically weighs inputs for similarity at or near the start of the string. If the word you are checking against is a closer match for the end of the string but not the beginning, a low similarity score is to be expected. For example, running your code against "Dylan" and "Peter Dylan" returns a score of 0 because even though the target string contains "Dylan", none of the letters in "Dylan" appear anywhere in the first half of the string. – Abion47 Aug 21 '23 at 19:08
  • I found a temporary solution by splitting the string @Abion47 – Omar Yacop Aug 22 '23 at 09:46
  • Are you Indian, @mezoni? Why are you so angry? XD – Omar Yacop Aug 22 '23 at 09:48
  • Human stupidity annoys me. The Jaro-Winkler algorithm is related to Flutter framework in the same way that Laravel framework is related to the Levenshtein distance string metric. – mezoni Aug 22 '23 at 12:49
  • And you think you are supreme. Man, be humble and just answer politely @mezoni – Omar Yacop Aug 22 '23 at 14:57
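
The splitting workaround mentioned in the comments could look something like the sketch below: score the query against the whole candidate and against each of its words, then keep the best result. The function name `bestTokenSimilarity` and the `similarity` parameter are assumptions made for illustration; the question's `jaroWinklerSimilarity` would be one possible argument.

```dart
import 'dart:math';

/// Hypothetical helper: scores [query] against the whole [candidate] and
/// against each whitespace-separated word in it, returning the best score.
double bestTokenSimilarity(
  String query,
  String candidate,
  double Function(String, String) similarity,
) {
  // Split on runs of whitespace and drop any empty pieces.
  final tokens =
      candidate.trim().split(RegExp(r'\s+')).where((t) => t.isNotEmpty);

  // Start from the whole-string score so full-name queries still work.
  var best = similarity(query, candidate);
  for (final token in tokens) {
    best = max(best, similarity(query, token));
  }
  return best;
}
```

With the question's function, `bestTokenSimilarity('Yacop', 'Omar Yacop', jaroWinklerSimilarity)` compares "Yacop" with the word "Yacop" directly, so it should reach 1.0 and clear the 0.67 threshold. Multi-word queries are not split here; that would need a separate decision about how to combine per-word scores.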

0 Answers