I am dealing with anagrams, so I care only about which characters a string contains, not their order. I searched for a suitable Collection class, but in vain.

Can you suggest a class that keeps duplicates but ignores order?

6006604
Sriman
  • [Guava Multiset](https://guava.dev/releases/23.0/api/docs/com/google/common/collect/Multiset.html)? Or simply a `Map<Character, Integer>`, where the `Integer` stores the number of occurrences of the key? – Andy Turner Oct 02 '19 at 12:10
  • Actually, if you want to check for anagrams, sorting the characters is useful too, as long as duplicates are kept. "ccba" is an anagram of "cabc", and both are anagrams of "abcc". Sorting either one therefore yields "abcc", so the sorted results are equal! – Michael Michailidis Oct 02 '19 at 14:31

3 Answers

You can use a Map<Character,Integer> to count the number of occurrences of each character of a String. If the Maps generated for two Strings are equal, you'll know that the corresponding Strings are anagrams.

For example (here I used Map<Integer,Long> instead of Map<Character,Integer> since it was more convenient):

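import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

String one = "animal";
String two = "manila";
// chars() yields each UTF-16 code unit as an int; group equal values and count them.
Map<Integer, Long> mapOne = one.chars().boxed().collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
Map<Integer, Long> mapTwo = two.chars().boxed().collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
System.out.println("Is anagram? " + mapOne.equals(mapTwo));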

Output:

Is anagram? true
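
The `Map<Character, Integer>` variant mentioned at the start could look roughly like this. It is only a minimal sketch, not code from the answer itself: the class name `CharCounts` and the helper `countChars` are hypothetical, and it counts UTF-16 `char` values, so it assumes the input contains no surrogate pairs.

```java
import java.util.HashMap;
import java.util.Map;

class CharCounts {
    // Hypothetical helper: maps each char to the number of times it occurs in s.
    static Map<Character, Integer> countChars(String s) {
        Map<Character, Integer> counts = new HashMap<>();
        for (char c : s.toCharArray()) {
            counts.merge(c, 1, Integer::sum); // insert 1, or add 1 to the existing count
        }
        return counts;
    }

    // Map.equals compares key/value pairs and ignores insertion order.
    static boolean isAnagram(String a, String b) {
        return countChars(a).equals(countChars(b));
    }

    public static void main(String[] args) {
        System.out.println(isAnagram("animal", "manila")); // true
        System.out.println(isAnagram("animal", "mammal")); // false
    }
}
```

Because the comparison relies on `Map.equals`, any `Map` implementation works here; `HashMap` is just the obvious default.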
Eran

You can use Google Guava's `HashMultiset`. Its `equals()` method does exactly that:

Compares the specified object with this multiset for equality. Returns true if the given object is also a multiset and contains equal elements with equal counts, regardless of order. This implementation returns true if object is a multiset of the same size and if, for each element, the two multisets have the same count.

Shloim

Instead of an ordered data structure, one can also dynamically sort the data.

Since Unicode code points also handle symbols outside the Basic Multilingual Plane, which UTF-16 chars split into surrogate pairs, I'll work with code points (ints) instead:

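import java.util.Arrays;

// Canonical form: the string's code points in sorted order.
int[] canonical(String s) {
    return s.codePoints().sorted().toArray();
}

// Two strings are anagrams when their canonical forms are equal.
boolean isAnagram(String s, String t) {
    return Arrays.equals(canonical(s), canonical(t));
}

// Overload for a precomputed canonical form, e.g. when testing many candidates against one word.
boolean isAnagram(int[] s, String t) {
    return Arrays.equals(s, canonical(t));
}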
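
To see why code points matter here, the following sketch wraps the same `canonical`/`isAnagram` idea in a class and contrasts it with a `char`-based check. The class name `SortedAnagram` and the method `isAnagramByChars` are hypothetical, added for illustration only; the four code points are arbitrary supplementary-plane values chosen so that their surrogate halves coincide.

```java
import java.util.Arrays;

class SortedAnagram {
    // Sorted code points serve as the canonical form of a string.
    static int[] canonical(String s) {
        return s.codePoints().sorted().toArray();
    }

    static boolean isAnagram(String s, String t) {
        return Arrays.equals(canonical(s), canonical(t));
    }

    // For contrast only: the same check on UTF-16 chars instead of code points.
    static boolean isAnagramByChars(String s, String t) {
        return Arrays.equals(s.chars().sorted().toArray(), t.chars().sorted().toArray());
    }

    public static void main(String[] args) {
        System.out.println(isAnagram("ccba", "cabc")); // true, both sort to "abcc"

        // Different symbols whose surrogate halves form the same multiset of chars.
        String s = new StringBuilder().appendCodePoint(0x1F300).appendCodePoint(0x1F400).toString();
        String t = new StringBuilder().appendCodePoint(0x1F000).appendCodePoint(0x1F700).toString();
        System.out.println(isAnagram(s, t));        // false
        System.out.println(isAnagramByChars(s, t)); // true, a false positive
    }
}
```

For plain ASCII or other BMP-only input, both checks agree, so the difference only shows up with supplementary characters such as emoji.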
Joop Eggen