Check how many characters need to be deleted to make an anagram in Python

Question

I wrote Python code to check how many characters need to be deleted from two strings for them to become anagrams of each other.

This is the problem statement: "Given two strings, a and b, that may or may not be of the same length, determine the minimum number of character deletions required to make a and b anagrams. Any characters can be deleted from either of the strings."

def makeAnagram(a, b):
    # Write your code here
    ac=0 # to count the number of occurrences of the character in a
    bc=0    # to count the number of occurrences of the character in b
    p=False     # used to store the result of whether an element is in that string
    c=0        # count of characters to be deleted to make these two strings anagrams
    t=[]        # list of previously checked characters
    
    for x in a:
        if x in t == True:
            continue
        ac=a.count(x)
        t.insert(0,x)
        for y in b:
            p = x in b
            if p==True:
                bc=b.count(x)
                if bc!=ac:
                    d=ac-bc
                    c=c+abs(d)

            elif p==False:
                c=c+1 
                               
    return(c)

Will Da Silva · Accepted Answer · 2021-06-19T16:37:43.590

You can use collections.Counter for this:

from collections import Counter

def makeAnagram(a, b):
    return sum((Counter(a) - Counter(b) | Counter(b) - Counter(a)).values())

Counter(x) (where x is a string) returns a dictionary that maps characters to how many times they appear in the string.

Counter(a) - Counter(b) gives you a dictionary that maps characters which are overabundant in a to how many more times they appear in a than in b. Characters whose count would be zero or negative are dropped.

Counter(b) - Counter(a) is like above, but for characters which are overabundant in b.

The | operator merges the two resulting counters. It keeps the larger count for each key, and here no character can appear in both counters, so nothing is lost. We then sum the values to get the total number of characters which are overrepresented in either string. This is equivalent to the minimum number of characters that need to be deleted to make the two strings anagrams.
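To make the steps above concrete, here is a small sketch that runs each one separately. The inputs "cde" and "abc" and the variable names are chosen for illustration only.

```python
from collections import Counter

a, b = "cde", "abc"  # illustrative inputs, not from the question

surplus_a = Counter(a) - Counter(b)  # characters a has more of than b
surplus_b = Counter(b) - Counter(a)  # characters b has more of than a
merged = surplus_a | surplus_b       # union: larger count per character

print(dict(surplus_a))       # d and e are only in a
print(dict(surplus_b))       # a and b are only in b
print(sum(merged.values()))  # total number of deletions: 4
```

Since - binds more tightly than |, the one-line version needs no extra parentheses around the two subtractions.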

As for why your code doesn't work, I can't pin down any one problem with it. To obtain the code below, all I did was some simplification (e.g. removing unnecessary variables, looping over a and b together, removing == True and == False, replacing t with a set, giving variables descriptive names, etc.), and the code began working. Here is that simplified working code:

def makeAnagram(a, b):
    c = 0 # count of characters to be deleted to make these two strings anagrams
    seen = set() # set of previously checked characters
    for character in a + b:
        if character not in seen:
            seen.add(character)
            c += abs(a.count(character) - b.count(character))
    return c
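One thing that does stand out when comparing the two versions is the condition x in t == True. Python treats this as a chained comparison, (x in t) and (t == True), and a list never equals True, so the check is always false and repeated characters are never skipped. The original inner loop over b also appears to add the same difference once per character of b, and characters that occur only in b are never visited. A minimal sketch of the chained-comparison behaviour, with made-up values:

```python
t = ["a"]  # pretend "a" was already checked
x = "a"

chained = x in t == True     # parsed as (x in t) and (t == True)
explicit = (x in t) == True  # what was probably intended
plain = x in t               # the idiomatic form

print(chained)   # False
print(explicit)  # True
print(plain)     # True
```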

I recommend you make it a point to learn how to write simple/short code. It may not seem important compared to actually tackling the algorithms and getting results. It may seem like cleanup or styling work. But it pays off enormously. Bugs are harder to introduce in simple code, and easier to spot. Oftentimes simple code will be more performant than equivalent complex code too, either because the programmer was able to more easily see ways to improve it, or because the more performant approach just arose naturally from the cleaner code.

Thank you for the answer! can u please explain what i did wrong there? — user9262680, Jun 19 '21 at 08:16
@user9262680 I've updated the answer with a simplified version of your code that works. Please compare it to your code to try and understand the difference. See if you can obtain the simplified code through a step-by-step process of simplifying your code with small changes - that would be good practice. Hope this helps. — Will Da Silva, Jun 19 '21 at 16:39

score 0 · Answer 2 · answered Jun 19 '21 at 06:11

Assuming there are only lowercase letters

The idea is to build a character count array for each string that stores the frequency of each character. Then iterate over the two count arrays: for each letter, the absolute difference between its two frequencies, abs(count1[i] - count2[i]), is the number of occurrences of that letter that must be removed from one string or the other. The sum of these differences is the answer.

CHARS = 26  # size of the lowercase alphabet

# Calculate the minimum number of characters to remove
# to make two strings anagrams of each other.
def remAnagram(str1, str2):
    count1 = [0] * CHARS
    count2 = [0] * CHARS

    # count how often each letter occurs in each string
    for ch in str1:
        count1[ord(ch) - ord('a')] += 1
    for ch in str2:
        count2[ord(ch) - ord('a')] += 1

    # traverse the count arrays and sum the
    # per-letter differences in frequency
    result = 0
    for i in range(CHARS):
        result += abs(count1[i] - count2[i])
    return result

Here the time complexity is O(n + m), where n and m are the lengths of the two strings. The space complexity is O(1), as we only use arrays of size 26.

This can be further optimised by just using a single array for taking the count.

In this case, we increment the counter for each character of the first string (a) and decrement it for each character of the second string (b).

def makeAnagram(a, b):
    buffer = [0] * 26
    for char in a:
        buffer[ord(char) - ord('a')] += 1
    for char in b:
        buffer[ord(char) - ord('a')] -= 1
    return sum(map(abs, buffer))

if __name__ == "__main__" :
 
    str1 = "bcadeh"
    str2 = "hea"
    print(makeAnagram(str1, str2))

Output : 3
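The fixed 26-slot array relies on the lowercase-only assumption stated at the top; any other character would index outside the intended range. If that assumption does not hold, the same increment/decrement idea can be sketched with a dictionary instead. The function name make_anagram_any_chars is hypothetical and used only for this illustration.

```python
from collections import defaultdict

def make_anagram_any_chars(a, b):  # hypothetical name, for illustration
    # balance[ch] > 0: a has a surplus of ch; < 0: b has a surplus
    balance = defaultdict(int)
    for ch in a:
        balance[ch] += 1
    for ch in b:
        balance[ch] -= 1
    # every surplus occurrence has to be deleted from one of the strings
    return sum(abs(v) for v in balance.values())

print(make_anagram_any_chars("bcadeh", "hea"))  # 3
print(make_anagram_any_chars("Abc", "abc"))     # 2, since "A" and "a" differ
```

The space used now grows with the number of distinct characters rather than staying at 26 slots.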
