6

How can I sort a list that contains umlauts? The compareTo function does not handle them properly and places the word at the end of the list.

main() {
  var fruits = ['Äpfel', 'Bananen', 'Orangen'];
  fruits.sort((a, b) => a.compareTo(b));
}
Toby
  • Are you sure it really ignores the umlaut? It may be sorting on the utf-8 byte sequence instead. Would it be an option to replace 'Ä' by either 'A' or 'AE' (depending on how you want to sort) for the comparison? Not terribly efficient, but if you only have a few strings to sort then this may be easiest. – Daniel Junglas May 04 '19 at 21:26
  • @DanielJunglas Replacing the letters may work, but I don't want to do that. There must be some other way. :-) – Toby May 04 '19 at 23:29
  • I don't know Dart, but I guess it just compares the sequences of Unicode code points. And of course `compareTo` must distinguish 'Ä' and 'A', since otherwise "Äpfel" and "Apfel" would be the same. One way or another you will need a custom comparator that treats 'Ä' as 'A'. Replacing the characters is a brute-force approach that works. If you don't want that, you may want to create a comparator that has the required special behavior. – Daniel Junglas May 05 '19 at 06:38
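
The guess in the comments can be checked directly. Dart's `String.compareTo` orders strings by their UTF-16 code units, so 'Ä' (196) lands after every unaccented Latin letter. A minimal sketch, where `sortDefault` is a hypothetical helper name used only for this illustration:

```dart
// Hypothetical helper: sorts a copy of the list with the default
// String.compareTo, which compares UTF-16 code units.
List<String> sortDefault(List<String> words) =>
    List<String>.of(words)..sort((a, b) => a.compareTo(b));

void main() {
  print('Ä'.codeUnitAt(0)); // 196
  print('O'.codeUnitAt(0)); // 79
  print(sortDefault(['Äpfel', 'Bananen', 'Orangen']));
  // [Bananen, Orangen, Äpfel]
}
```

So the umlaut is not ignored; it is compared by its numeric value, which is larger than that of 'Z' or 'z'.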

5 Answers

9

There is a package called diacritic that makes this easy.

https://pub.dev/packages/diacritic

Just add the dependency to pubspec.yaml and use it.

dependencies:
  diacritic: ^0.1.1

Import it:

import 'package:diacritic/diacritic.dart';

And use it:

var fruits = ['Äpfel', 'Bananen', 'Orangen'];
fruits.sort((a, b) => removeDiacritics(a).compareTo(removeDiacritics(b)));
print(fruits);

The output will be:

[Äpfel, Bananen, Orangen]
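
One thing to be aware of: once the diacritics are stripped, 'Apfel' and 'Äpfel' compare as equal, so their relative order is left to the sort implementation. A possible tie-break is sketched below. To keep the example self-contained it uses `stripUmlauts`, a hypothetical stand-in that only covers German umlauts; with the package installed, `removeDiacritics` would take its place. `compareWithTieBreak` is likewise just an illustrative name.

```dart
// Hypothetical stand-in for removeDiacritics, covering only German umlauts.
String stripUmlauts(String s) {
  const from = 'ÄÖÜäöü';
  const to = 'AOUaou';
  final buffer = StringBuffer();
  for (final ch in s.split('')) {
    final index = from.indexOf(ch);
    buffer.write(index == -1 ? ch : to[index]);
  }
  return buffer.toString();
}

// Compare on the stripped form first; fall back to the raw strings on a tie
// so that 'Apfel' and 'Äpfel' always end up in the same relative order.
int compareWithTieBreak(String a, String b) {
  final primary = stripUmlauts(a).compareTo(stripUmlauts(b));
  return primary != 0 ? primary : a.compareTo(b);
}

void main() {
  final fruits = ['Äpfel', 'Bananen', 'Apfel', 'Orangen'];
  fruits.sort(compareWithTieBreak);
  print(fruits); // [Apfel, Äpfel, Bananen, Orangen]
}
```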
Cassio Seffrin
6

I've created a custom comparator based on @DanielJunglas's idea. His code wasn't working properly for Polish, and I don't think it was right for German either. This version is for Polish, but it can be used for any language: you just need to change the letters in the map to match your alphabet. It ignores all characters that are not in the alphabet. If you want, for example, the "+" sign to be compared too and to sort before every letter, add it to the map with the lowest number. If anyone has an easier method, please tell me. I'm a beginner too.

main() {
  var sample = ['Ónda', 'Orangen', 'Ąpfel', 'Okren', 'żarcik', 'Banen', 'Alpa', 'łąka', 'źrebak'];

  final cmp = PolishLettersCompare();
  sample.sort((a, b) => cmp.compare(a, b));
  print(sample);
}

class PolishLettersCompare {
  // Rank of every letter in Polish alphabetical order (ź comes before ż).
  final Map<String, int> map = {
    "A": 1, "a": 2,
    "Ą": 3, "ą": 4,
    "B": 5, "b": 6,
    "C": 7, "c": 8,
    "Ć": 9, "ć": 10,
    "D": 11, "d": 12,
    "E": 13, "e": 14,
    "Ę": 15, "ę": 16,
    "F": 17, "f": 18,
    "G": 19, "g": 20,
    "H": 21, "h": 22,
    "I": 23, "i": 24,
    "J": 25, "j": 26,
    "K": 27, "k": 28,
    "L": 29, "l": 30,
    "Ł": 31, "ł": 32,
    "M": 33, "m": 34,
    "N": 35, "n": 36,
    "Ń": 37, "ń": 38,
    "O": 39, "o": 40,
    "Ó": 41, "ó": 42,
    "P": 43, "p": 44,
    "R": 45, "r": 46,
    "S": 47, "s": 48,
    "Ś": 49, "ś": 50,
    "T": 51, "t": 52,
    "U": 53, "u": 54,
    "V": 55, "v": 56,
    "W": 57, "w": 58,
    "X": 59, "x": 60,
    "Y": 61, "y": 62,
    "Z": 63, "z": 64,
    "Ź": 65, "ź": 66,
    "Ż": 67, "ż": 68,
  };

  int compare(String a, String b) {
    final min = a.length < b.length ? a.length : b.length;
    for (var i = 0; i < min; ++i) {
      // Characters that are not in the map get rank 0, so they all compare
      // as equal to each other and sort before every letter.
      final rankA = map[a[i]] ?? 0;
      final rankB = map[b[i]] ?? 0;
      if (rankA != rankB) return rankA < rankB ? -1 : 1;
    }
    // All shared positions are equal, so the shorter string sorts first.
    return a.length.compareTo(b.length);
  }
}
Hodofca
  • In my opinion this is the best solution, because in some languages (like Czech) the order of special characters in the alphabet is exactly given and it is not possible to just replace them with basic characters for correct ordering (as shown in other solutions). – Kamil Svoboda Oct 07 '21 at 06:12
1

This is the first program in Dart I ever wrote, so there may be better ways to do this. But at least it works:

// Class for comparing strings in an umlaut-agnostic way.
class UmlautCompare {
  // Maps the code unit of each umlaut to that of its base letter.
  final Map<int, int> map = {};
  UmlautCompare() {
    const umlauts = 'ÄÖÜäöü';
    const mapped = 'AOUaou';
    for (var i = 0; i < umlauts.length; ++i) {
      map[umlauts.codeUnitAt(i)] = mapped.codeUnitAt(i);
    }
  }
  // Compare two strings treating umlauts as the respective non-umlaut characters.
  int compare(String a, String b) {
    final min = a.length < b.length ? a.length : b.length;
    for (var i = 0; i < min; ++i) {
      // Fall back to the original code unit when it is not an umlaut.
      final charA = map[a.codeUnitAt(i)] ?? a.codeUnitAt(i);
      final charB = map[b.codeUnitAt(i)] ?? b.codeUnitAt(i);
      if (charA < charB) return -1;
      if (charA > charB) return 1;
    }
    // If we get here then the first min characters are equal.
    // The strings are equal if they have the same length.
    // If they have different length then the shorter string is considered less.
    if (a.length < b.length) return -1;
    if (a.length > b.length) return 1;
    return 0;
  }
}

main() {
  UmlautCompare cmp = new UmlautCompare();
  var fruits = ['Orangen', 'Äpfel', 'Bananen'];
  fruits.sort((a, b) => cmp.compare(a, b));
  print(fruits);
}
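
The comments on the question also mention replacing 'Ä' with 'AE' instead of 'A'. A rough sketch of that variant follows; `expansions` and `expandUmlauts` are hypothetical names, and the table only covers the six German umlauts. Because one character becomes two, the comparison works on a converted copy of each string, not code unit by code unit.

```dart
// Hypothetical table: each umlaut expands to its two-letter spelling.
const expansions = {
  'Ä': 'Ae', 'Ö': 'Oe', 'Ü': 'Ue',
  'ä': 'ae', 'ö': 'oe', 'ü': 'ue',
};

// Returns a copy of s with every umlaut replaced by its expansion.
String expandUmlauts(String s) =>
    s.split('').map((ch) => expansions[ch] ?? ch).join();

void main() {
  final names = ['Muller', 'Müller', 'Mufasa'];
  names.sort((a, b) => expandUmlauts(a).compareTo(expandUmlauts(b)));
  print(names); // [Müller, Mufasa, Muller]
}
```

With the 'A'-style mapping, 'Müller' and 'Muller' would compare as equal; with the expansion, 'Müller' is ordered as 'Mueller' and therefore sorts before 'Mufasa'.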
Daniel Junglas
1

This class supports German, Turkish and French special characters:

It converts all strings to lower case, but you could easily fix that if you need to.

class Angliciser {
  static const Map<String, String> _letterConversion = {
    // German characters
    "ä": "a",
    "ö": "o",
    "ü": "u",

    // Turkish characters (omitting already existing ones)
    "ğ": "g",
    "i̇": "i", // these are not the same
    "ş": "s",

    // French characters (omitting already existing ones)
    "ç": "c",
    "à": "a",
    "â": "a",
    "è": "e",
    "é": "e",
    "ê": "e",
    "ë": "e",
    "î": "i",
    "ï": "i",
    "ô": "o",
    "œ": "o",
    "ù": "u",
    "û": "u",
    "ÿ": "y",
  };

  static String convert(String str) {
    if (str.isEmpty) return str;

    // Lower-case first, then replace each mapped character with its base letter.
    return str
        .toLowerCase()
        .split('')
        .map((element) => _letterConversion[element] ?? element)
        .join();
  }
}

Mamo1234
1

This is my modified version of the previous answers:

import 'dart:math';
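// Written as a top-level function, so it takes no `static` modifier.
int compare(String a, String b) {
// Letters in Polish alphabetical order (ź comes before ż).
const letters = [
  "a", "ą", "b", "c", "ć", "d", "e", "ę", "f", "g", "h", "i",
  "j", "k", "l", "ł", "m", "n", "ń", "o", "ó", "p", "q", "r",
  "s", "ś", "t", "u", "v", "w", "x", "y", "z", "ź", "ż",
];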

a = a.toLowerCase();
b = b.toLowerCase();

for (var i = 0; i < min(a.length, b.length); i++) {
  var aValue = letters.indexOf(a[i]);
  var bValue = letters.indexOf(b[i]);

  var result = (aValue - bValue).sign;
  if (result != 0) {
    return result;
  }
}

return (a.length - b.length).sign;

}

It possibly performs worse but it's easier to read.
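
If the performance does matter, one option is to replace the linear `indexOf` scan with a lookup table that is built once. The sketch below assumes the standard Polish order (with ź before ż) and that every letter in the `alphabet` string is a single precomposed code unit; `alphabet`, `rank` and `compareByRank` are hypothetical names for this illustration.

```dart
// Polish alphabet in collation order, with q, v and x slotted in.
const alphabet = 'aąbcćdeęfghijklłmnńoópqrsśtuvwxyzźż';

// Built once: letter -> position in the alphabet.
final Map<String, int> rank = {
  for (var i = 0; i < alphabet.length; i++) alphabet[i]: i,
};

int compareByRank(String a, String b) {
  a = a.toLowerCase();
  b = b.toLowerCase();
  final shortest = a.length < b.length ? a.length : b.length;
  for (var i = 0; i < shortest; i++) {
    // Characters outside the alphabet get -1, like indexOf, and sort first.
    final diff = (rank[a[i]] ?? -1) - (rank[b[i]] ?? -1);
    if (diff != 0) return diff.sign;
  }
  return (a.length - b.length).sign;
}

void main() {
  final words = ['żarcik', 'łąka', 'Alpa', 'źrebak', 'Okren'];
  words.sort(compareByRank);
  print(words); // [Alpa, łąka, Okren, źrebak, żarcik]
}
```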

coldandtired