6

I’m working through a permutation/anagram problem and wanted input on the most efficient way of checking. I’m doing this in Java land, so there is a library for EVERYTHING, including sorting. The first way to check whether two strings are anagrams of each other is to compare their lengths, sort both in some manner, then compare the sorted strings index by index. Code below:

private boolean validAnagram(String str, String pair) {
if(str.length() != pair.length()){
    return false;
}

char[] strArr = str.toCharArray();
char[] pairArr = pair.toCharArray();


Arrays.sort(strArr);
str = new String(strArr);

Arrays.sort(pairArr);
pair = new String(pairArr);

for(int i = 0; i<str.length(); i++){
    if(str.charAt(i) != pair.charAt(i)){
        return false;
    }
}
return true;
}

Alternatively, I figured it would be easier to compare the sums of the ASCII values and avoid a check on every possible character. Code below:

private boolean validAnagram(String str, String pair) {
if(str.length() != pair.length()){
    return false;
}

char[] strArr = str.toCharArray();
char[] pairArr = pair.toCharArray();



int strValue = 0;
int pairValue = 0;

for(int i =0; i < strArr.length; i++){
    strValue+= (int) strArr[i];
    pairValue+= (int) pairArr[i];
}

if(strValue != pairValue){
    return false;
}
return true;
}

So, which is the better solution? I don’t know much about the sort that Arrays is giving me, but it’s the more common answer when I look around the old internets. It makes me wonder if I’m missing something.

ForguesR
Drew L. Facchiano

13 Answers

4

There are several ways to check whether two strings are anagrams. Your question is which solution is better. Your first solution sorts, and sorting has a worst-case complexity of O(n log n). Your second solution uses only one loop, which has complexity O(n).

So, of the two, your second solution, with only O(n) complexity, is a better solution than the first one.

One possible solution:

private boolean checkAnagram(String stringOne, String stringTwo) {
    char[] first = stringOne.toLowerCase().toCharArray();
    char[] second = stringTwo.toLowerCase().toCharArray();
    // if the lengths differ, the strings cannot be anagrams
    if (first.length != second.length)
        return false;
    // one counter per letter; assumes every character is in a-z after lowercasing
    int[] counts = new int[26];
    for (int i = 0; i < first.length; i++) {
        counts[first[i] - 'a']++;   // count up for the first string
        counts[second[i] - 'a']--;  // count down for the second string
    }
    // anagrams leave every counter at zero
    for (int i = 0; i < 26; i++)
        if (counts[i] != 0)
            return false;
    return true;
}

  • Hey Pratik! That was my initial thought. However, it's been pointed out to me that my ASCII solution has a major problem: certain combinations give the wrong answer. Pointed out by this fine fellow on Reddit: "if you give it the strings AD and BC. The first has ascii values 65 and 68, the second has values 66 and 67. They both sum up to 133 and would be treated as equal by your algorithm." It seems there are workarounds, however. For the sake of the problem it doesn't seem worth the fix for the edge cases. – Drew L. Facchiano Jul 06 '16 at 17:24
  • full post here : https://www.reddit.com/r/learnprogramming/comments/4rjg9x/which_is_the_better_anagram_solution/ – Drew L. Facchiano Jul 06 '16 at 17:24
  • You can use another approach, which uses a HashMap. – Pratik Upacharya Jul 06 '16 at 17:26
  • I can see that. Mapping every character to a boolean and then comparing the two maps. That still seems like you'd get a worse runtime than the sort approach. – Drew L. Facchiano Jul 06 '16 at 17:37
  • I have added a solution, please have a look. The sorting approach has O(n log n) complexity in the worst case, but this approach has O(n) complexity in the worst case. – Pratik Upacharya Jul 06 '16 at 17:40
  • It will crash if there is a space in the string. You should remove them first. Something like `char[] first = stringOne.toLowerCase().replaceAll("\\s+","").toCharArray();` – ForguesR Jul 06 '16 at 17:54
  • It will also crash if there is anything other than letters (a to z or A to Z) in the string. – ForguesR Jul 06 '16 at 20:16
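
To make the collision and the crash raised in these comments concrete, here is a minimal sketch. The class name `AnagramCounts` and its method names are hypothetical, not from the thread. `sumEquals` mirrors the ASCII-sum idea from the question, and `isAnagram` counts occurrences over the whole `char` range so that spaces and punctuation cannot index out of bounds. It compares UTF-16 code units, which is an assumption that ignores surrogate pairs.

```java
final class AnagramCounts {

    // Mirrors the ASCII-sum idea from the question: equal sums do not imply equal letters.
    static boolean sumEquals(String a, String b) {
        if (a.length() != b.length()) {
            return false;
        }
        int sumA = 0;
        int sumB = 0;
        for (int i = 0; i < a.length(); i++) {
            sumA += a.charAt(i);
            sumB += b.charAt(i);
        }
        return sumA == sumB;
    }

    // Keeps one counter per possible char value (0..65535), so no input character is out of range.
    static boolean isAnagram(String a, String b) {
        if (a.length() != b.length()) {
            return false;
        }
        int[] counts = new int[Character.MAX_VALUE + 1];
        for (int i = 0; i < a.length(); i++) {
            counts[a.charAt(i)]++;
            counts[b.charAt(i)]--;
        }
        for (int count : counts) {
            if (count != 0) {
                return false;
            }
        }
        return true;
    }
}
```

With this sketch, `sumEquals("AD", "BC")` returns `true` while `isAnagram("AD", "BC")` returns `false`. The 65,536-entry array is allocated on every call; a `HashMap<Character, Integer>` would trade that memory for boxing overhead.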
4

This is a much simpler, easy-to-read solution I was able to compile...

static boolean isAnagram(String a, String b) {
    if (a.length() != b.length()) {
        return false;
    }
    char[] arr1 = a.toLowerCase().toCharArray();
    char[] arr2 = b.toLowerCase().toCharArray();
    Arrays.sort(arr1);
    Arrays.sort(arr2);
    return Arrays.equals(arr1, arr2);
}

Best, Justin

3

Here is a very simple implementation.

public boolean isAnagram(String strA, String strB) {
  // Cleaning the strings (remove white spaces and convert to lowercase)
  strA = strA.replaceAll("\\s+","").toLowerCase();
  strB = strB.replaceAll("\\s+","").toLowerCase();

  // Check every char of strA and remove its first occurrence from strB
  for (int i = 0; i < strA.length(); i++ ) {
    if (strB.equals("")) return false;  // strB is already empty : not an anagram
    strB = strB.replaceFirst(Pattern.quote("" + strA.charAt(i)), "");
  }

  // if strB is empty we have an anagram
  return strB.equals("");
}

And finally :

System.out.println(isAnagram("William Shakespeare", "I am a weakish speller")); // true
ForguesR
1

The best solution depends on your objective: code size, memory footprint, or least computation.

A very compact solution in Java 8, with as little code as possible, though not the fastest at O(n log n) and fairly memory-inefficient:

import java.util.stream.Collectors;

public class Anagram {
  public static void main(String[] argc) {
    String str1 = "gody";
    String str2 = "dogy";

    boolean isAnagram =
    str1.chars().mapToObj(c -> (char) c).sorted().collect(Collectors.toList())
    .equals(str2.chars().mapToObj(c -> (char) c).sorted().collect(Collectors.toList()));

    System.out.println(isAnagram);
  }
}
  • This solution has some faults. It sorts the chars of the two Strings, but it doesn't ignore blank spaces or uppercase, so for example `isAnagram("William Shakespeare", "I am a weakish speller")` mentioned above returns false instead of true. – K.Rzepecka Apr 30 '17 at 16:39
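
Following up on that comment, a possible variant normalizes both inputs before sorting. This is only a sketch: the class name `NormalizedAnagram` and the helper `sortedLetters` are made up for illustration, and dropping whitespace and lowercasing are assumptions about what should count as an anagram here.

```java
import java.util.Arrays;

final class NormalizedAnagram {

    // Drops whitespace, lowercases, and sorts the remaining characters.
    static int[] sortedLetters(String s) {
        return s.chars()
                .filter(c -> !Character.isWhitespace(c))
                .map(Character::toLowerCase)
                .sorted()
                .toArray();
    }

    // Two strings match when their normalized, sorted characters are identical.
    static boolean isAnagram(String a, String b) {
        return Arrays.equals(sortedLetters(a), sortedLetters(b));
    }
}
```

Working on the primitive `IntStream` also avoids boxing each character into a `Character`, which is part of what makes the list-based one-liner memory-hungry.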
1

I tried a few solutions using Sets, and ran each one 10 million times to test, using your example array of:

private static String[] input = {"tea", "ate", "eat", "apple", "java", "vaja", "cut", "utc"};

First, the method I used to call these algorithms:

public static void main(String[] args) {
    long startTime = System.currentTimeMillis();
    for (int x = 0; x < 10000000; x++) {
        Set<String> confirmedAnagrams = new HashSet<>();
        for (int i = 0; i < (input.length / 2) + 1; i++) {
            if (!confirmedAnagrams.contains(input[i])) {
                for (int j = i + 1; j < input.length; j++) {
                        if (isAnagrams1(input[i], input[j])) {
                            confirmedAnagrams.add(input[i]);
                            confirmedAnagrams.add(input[j]);
                        }
                }
            }
        }
        output = confirmedAnagrams.toArray(new String[confirmedAnagrams.size()]);
    }
    long endTime = System.currentTimeMillis();
    System.out.println("Total time: " + (endTime - startTime));
    System.out.println("Average time: " + ((endTime - startTime) / 10000000D));
}

I then used algorithms based on a HashSet of characters. I add each character of each word to the HashSet, and should the size of the HashSet not equal the length of the initial words, it would mean they are not anagrams.
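
To see what a set-size comparison reports on concrete inputs, the sketch below mirrors the check described above next to a sort-and-compare check like the one in the question. The names `SetSizeProbe`, `setSizeCheck`, and `sortedCheck` are hypothetical and chosen only for this illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class SetSizeProbe {

    // Same shape as the set-based check: add every character of both words, then compare sizes.
    static boolean setSizeCheck(String x, String y) {
        if (x.length() != y.length()) {
            return false;
        } else if (x.equals(y)) {
            return true;
        }
        Set<Character> seen = new HashSet<>();
        for (int i = 0; i < x.length(); i++) {
            seen.add(x.charAt(i));
            seen.add(y.charAt(i));
        }
        return seen.size() != x.length();
    }

    // Reference check: sort both words and compare the results.
    static boolean sortedCheck(String x, String y) {
        char[] a = x.toCharArray();
        char[] b = y.toCharArray();
        Arrays.sort(a);
        Arrays.sort(b);
        return Arrays.equals(a, b);
    }
}
```

For `"tea"` and `"ate"` the set holds three distinct characters, so `setSizeCheck` returns `false` while `sortedCheck` returns `true`; for `"tea"` and `"cut"` the results are reversed. A set records which characters occur but not how often, so it may be worth validating the results before comparing timings.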

My algorithms and their runtimes:

Algorithm 1:

    private static boolean isAnagrams1(String x, String y) {
    if (x.length() != y.length()) {
        return false;
    } else if (x.equals(y)) {
        return true;
    }

    Set<Character> anagramSet = new HashSet<>();
    for (int i = 0; i < x.length(); i++) {
        anagramSet.add(x.charAt(i));
        anagramSet.add(y.charAt(i));
    }

    return anagramSet.size() != x.length();
}

This has the runtime of:

Total time: 6914
Average time: 6.914E-4

Algorithm 2

private static boolean isAnagrams2(String x, String y) {
    if (x.length() != y.length()) {
        return false;
    } else if (x.equals(y)) {
        return true;
    }

    Set<Character> anagramSet = new HashSet<>();
    char[] xAr = x.toCharArray();
    char[] yAr = y.toCharArray();
    for (int i = 0; i < xAr.length; i++) {
        anagramSet.add(xAr[i]);
        anagramSet.add(yAr[i]);
    }

    return anagramSet.size() != x.length();
}

Has the runtime of:

Total time: 8752
Average time: 8.752E-4

Algorithm 3

For this algorithm, I decided to send the Set through, therefore I only create it once for every cycle, and clear it after each test.

    private static boolean isAnagrams3(Set<Character> anagramSet, String x, String y) {
    if (x.length() != y.length()) {
        return false;
    } else if (x.equals(y)) {
        return true;
    }

    for (int i = 0; i < x.length(); i++) {
        anagramSet.add(x.charAt(i));
        anagramSet.add(y.charAt(i));
    }

    return anagramSet.size() != x.length();
}

Has the runtime of:

Total time: 8251
Average time: 8.251E-4

Algorithm 4

This algorithm is not mine; it belongs to Pratik Upacharya, who answered the question as well. I include it here for comparison:

    private static boolean isAnagrams4(String stringOne, String stringTwo) {
    char[] first = stringOne.toLowerCase().toCharArray();
    char[] second = stringTwo.toLowerCase().toCharArray();
    // if length of strings is not same 
    if (first.length != second.length) {
        return false;
    }
    int[] counts = new int[26];
    for (int i = 0; i < first.length; i++) {
        counts[first[i] - 97]++;
        counts[second[i] - 97]--;
    }
    for (int i = 0; i < 26; i++) {
        if (counts[i] != 0) {
            return false;
        }
    }
    return true;
}

Has the runtime of:

Total time: 5707
Average time: 5.707E-4

Of course, these runtimes differ on every test run. Proper testing would need a larger example set, and maybe more iterations.

*Edited: I made a mistake in my initial method. Pratik Upacharya's algorithm does seem to be the faster one.

Propagandian
1

My solution : Time Complexity = O(n)

public static boolean isAnagram(String str1, String str2) {
    if (str1.length() != str2.length()) {
        return false;
    }

    for (int i = 0; i < str1.length(); i++) {
        char ch = str1.charAt(i);

        if (str2.indexOf(ch) == -1) 
            return false;
        else
            str2 = str2.replaceFirst(String.valueOf(ch), " ");
    }

    return true;
}

Test case :

@Test
public void testIsPermutationTrue() {
    assertTrue(Anagram.isAnagram("abc", "cba"));
    assertTrue(Anagram.isAnagram("geeksforgeeks", "forgeeksgeeks"));
    assertTrue(Anagram.isAnagram("anagram", "margana"));
}

@Test
public void testIsPermutationFalse() {
    assertFalse(Anagram.isAnagram("abc", "caa"));
    assertFalse(Anagram.isAnagram("anagramm", "marganaa"));
}
  • 3
    That is O(n^2), because `str2.indexOf` needs to run over the whole string every time. – Thilo Oct 29 '19 at 09:46
1

Solution using primitive data type.

boolean isAnagram(char input1[], char input2[]) {
    int bitFlip = 32;

    if(input2.length != input1.length){return false;}

    boolean found = false;
    for (int x = 0; x < input1.length; x++) {
        found = false;
        for (int y = 0; y < input2.length; y++) {
             if (!found && ((input1[x] | bitFlip)) ==
             ( (input2[y] | bitFlip))) {
                found = true;
                input2[y] = 0;
            }
        }
        if (!found) {
            break;
        }
    }
    return found ;
}

This approach doesn't rely on any sorting utility. It finds each character by iteration and, once found, sets the matched entry to zero so it cannot be matched again. That handles inputs with duplicate characters, like "pool" and "loop", which each contain two "o"s.

It also ignores case without relying on toLowerCase(), by setting a bit: if the 6th bit (32 in decimal) is one, the letter is lowercase, and if it's zero, the letter is uppercase.
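
As a small illustration of the bit trick (the class and method names below are hypothetical), OR-ing with 32 sets the one bit that separates `'A'` (65) from `'a'` (97). The sketch also surfaces an assumption worth stating: the trick is only meaningful for ASCII letters, because it changes some other characters too.

```java
final class CaseBit {

    private static final int CASE_BIT = 32; // binary 0010_0000

    // Sets the case bit; for ASCII letters this yields the lowercase form.
    static char foldCase(char c) {
        return (char) (c | CASE_BIT);
    }
}
```

Here `foldCase('A')` and `foldCase('a')` both give `'a'`, but `foldCase('[')` gives `'{'`, so input containing non-letters can produce false matches.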

It's direct bit manipulation on primitives, like what's used in image manipulation, so each comparison should be cheap. The downside is the O(n^2) running time.

This solution was tested on HackerRank.

d12ei
0
// here best solution for an anagram
import java.util.*;

class Anagram {
    public static void main(String arg[]) {
        Scanner sc = new Scanner(System.in);
        String str1 = sc.nextLine();
        String str2 = sc.nextLine();
        int i, j;

        boolean Flag = true;
        i = str1.length();
        j = str2.length();

        if (i == j) {
            for (int m = 0; m < i; m++) {
                for (int n = 0; n < i; n++) {
                    if (str1.charAt(m) == str2.charAt(n)) {
                        Flag = true;
                        break;
                    } else
                        Flag = false;
                }
            }
        } else {
            Flag = false;
        }

        if (Flag)
            System.out.println("String is Anagram");
        else
            System.out.println("String is not Anagram");
    }
}
  • 2
    I wouldn't call an algorithm "best" when it accepts two Strings which neither are anagrams, nor permutations of each other as "String is Anagram". – Tom Dec 30 '17 at 06:22
0

A recruiter asked me to solve this problem recently. In studying the problem I came up with a solution that solves two types of anagram issues.

issue 1: Determine if an anagram exists within a body of text.

issue 2: Determine if a formal anagram exists within a body of text. In this case the anagram must be the same size as the text you are comparing it against. In the former case, the two texts need not be the same size; one just needs to contain the other.

My approach was as follows:

setup phase: First create an Anagram class. This just converts the text to a Map whose key is the character in question and whose value is the number of occurrences of that character. I assume this requires at most O(n) time. Since at most two maps are needed, the worst case would be O(2n), which is still O(n). At least my naive understanding of asymptotic notation says that.

processing phase: All you need to do is loop through the smaller of the two Maps and look each entry up in the larger Map. If an entry does not exist, or exists with a different occurrence count, the text fails the anagram test.

Here is the loop that determines if we have an anagram or not:

    boolean looking = true;
    for (Anagram ele : smaller.values()) {
        Anagram you = larger.get(ele);
        if (you == null || you.getCount() != ele.getCount()) {
            looking = false;
            break;
        }
    }
    return looking;

Note that I create an ADT to contain the strings being processed. They are converted to a Map first.

Here is a snippet of the code to create the Anagram Object:

    private void init(String teststring2) {
        StringBuilder sb = new StringBuilder(teststring2);
        for (int i = 0; i < sb.length(); i++) {
            Anagram a = new AnagramImpl(sb.charAt(i));
            Anagram tmp = map.putIfAbsent(a, a);
            if (tmp != null) {
                tmp.updateCount();
            }
        }
    }
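
The two phases described in this answer could be sketched end to end with a plain `Map<Character, Integer>` instead of a custom ADT. The names `CharCounts`, `countChars`, and `matchesCounts` are hypothetical stand-ins, not the answer's actual classes.

```java
import java.util.HashMap;
import java.util.Map;

final class CharCounts {

    // Setup phase: map each character to its number of occurrences.
    static Map<Character, Integer> countChars(String text) {
        Map<Character, Integer> counts = new HashMap<>();
        for (int i = 0; i < text.length(); i++) {
            counts.merge(text.charAt(i), 1, Integer::sum);
        }
        return counts;
    }

    // Processing phase: every entry of the smaller map must exist in the larger map with the same count.
    static boolean matchesCounts(String first, String second) {
        Map<Character, Integer> a = countChars(first);
        Map<Character, Integer> b = countChars(second);
        Map<Character, Integer> smaller = a.size() <= b.size() ? a : b;
        Map<Character, Integer> larger = (smaller == a) ? b : a;
        for (Map.Entry<Character, Integer> entry : smaller.entrySet()) {
            // larger.get returns null for a missing key, which equals() treats as a mismatch
            if (!entry.getValue().equals(larger.get(entry.getKey()))) {
                return false;
            }
        }
        return true;
    }
}
```

For inputs of the same length this reduces to an ordinary anagram test, for example `matchesCounts("listen", "silent")` is `true`. When the inputs differ in length, it checks only the characters of the map with fewer distinct keys, as the answer describes.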
Eric Aya
0

I came up with a solution and I am not even using any 26 char array... Check this out:

StringBuffer a = new StringBuffer();
        a.append(sc.next().toLowerCase());

        StringBuffer b = new StringBuffer();
        b.append(sc.next().toLowerCase());
        if(a.length() !=b.length())
        {
            System.out.println("NO");
            continue;
        }
        int o =0;
        for(int i =0;i<a.length();i++)
        {
            if(a.indexOf(String.valueOf(b.charAt(i)))<0)
            {
               System.out.println("NO");
               o=1;break; 

            }
        }
        if(o==0)
         System.out.println("Yes");
Ritveak
0

Consider using HashMap and Arrays.sort

    private static Map<String, String> getAnagrams(String[] data) {

    Map<String, String> anagrams = new HashMap<>();
    Map<String, String> results = new HashMap<>();

    for (int i = 0; i < data.length; i++) {

        char[] chars = data[i].toLowerCase().toCharArray();
        Arrays.sort(chars);

        String sorted = String.copyValueOf(chars);

        String item = anagrams.get(sorted);
        if (item != null) {
            anagrams.put(sorted, item + ", " + i);
            results.put(sorted, anagrams.get(sorted));
        } else {
            anagrams.put(sorted, String.valueOf(i));
        }
    }

    return results;
}

I like it because you traverse the array only once.

mherBaghinyan
0

Simple Kotlin solution

fun IsAnagram(s1: String, s2: String): Boolean {
    return s1.groupBy { it } == s2.groupBy { it }
}

The asymptotic time complexity of groupBy is O(n), so the time complexity of the above is O(n).
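
For readers staying in Java, a rough counterpart to the `groupBy` comparison is sketched below. It is not a line-for-line translation: Kotlin's `groupBy { it }` builds lists of equal characters, whereas this version (with the hypothetical names `GroupedAnagram` and `histogram`) stores counts via `Collectors.groupingBy` and `Collectors.counting`.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

final class GroupedAnagram {

    // Builds a map from character value to occurrence count in one pass.
    static Map<Integer, Long> histogram(String s) {
        return s.chars()
                .boxed()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Two strings are anagrams when their histograms are equal.
    static boolean isAnagram(String s1, String s2) {
        return histogram(s1).equals(histogram(s2));
    }
}
```

Because the maps are compared with `equals`, strings of different lengths are rejected without a separate length check.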

murali kurapati
0

Possible solution with Java 8 syntax:

public static boolean isAnagram() {
    String s1 = "no";
    String s2 = "on";

    char[] first = s1.toCharArray();
    char[] second = s2.toCharArray();      

    if (first.length == second.length) {
        Map<Character, Integer> charMap = new HashMap<>();
        for (int i = 0; i < first.length; i++) {
            charMap.put(first[i], charMap.getOrDefault(first[i], 0) + 1);
            charMap.put(second[i], charMap.getOrDefault(second[i], 0) - 1);
        }
        return !charMap.values().stream().anyMatch(e -> e != 0);
    }
    return false;
}
Jimesh Shah