Finding number of anagrams

Question

The following question was asked of me in an interview. We know the anagrams of eat are tea and ate. The question is: we have a program, and we feed it a list of 10 thousand alphabets. At run time, we provide a word to this program, e.g. "eat". The program should then return the number of anagrams of that word that exist in the list of 10 thousand alphabets. Hence, for an input of "eat", it should return 2.

What would be a good strategy to store those 10 thousand alphabets so that finding the number of anagrams becomes easy?

Do you mean 10000 words or letters? There is only 1 alphabet in English, which runs from the letter a to z. – Abhishek Bansal, Oct 17 '13 at 07:59
What did you try? Please elaborate on one or two data structures you thought of, and why you think they are not good enough. – amit, Oct 17 '13 at 07:59
The easiest and fastest approach may be to count the letters of each word and store the pairs (array of letter counts, number of occurrences) in a sorted structure or in a hash map. – Traklon, Oct 17 '13 at 08:01
You could get quite a fast algorithm by mapping each of the 26 English letters to a unique prime number. After that, you calculate the product of the primes for the letters of the string. By the fundamental theorem of arithmetic, 2 strings are anagrams if and only if their products are the same. – spydon, Oct 17 '13 at 08:03
Check this post... http://stackoverflow.com/questions/8971688/find-anagram-of-input-on-set-of-strings – Kshitij Banerjee, Oct 17 '13 at 08:25
Do those anagrams have to form dictionary words or sentences? Otherwise it is a simple variation without repetition. – user902383, Oct 17 '13 at 08:55
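
The prime-product idea from spydon's comment above could be sketched roughly as follows. This is only an illustration: the names PRIMES and prime_product are made up here, and it assumes every word consists solely of the letters a to z.

```python
# The first 26 primes, one per letter a..z (an arbitrary but fixed assignment).
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]


def prime_product(word):
    """Multiply the primes assigned to each letter of the word.

    Assumption: the word contains only the letters a-z (any case).
    """
    product = 1
    for ch in word.lower():
        product *= PRIMES[ord(ch) - ord("a")]
    return product


# Anagrams share the same product; non-anagrams do not.
print(prime_product("eat") == prime_product("tea"))  # True
print(prime_product("eat") == prime_product("bat"))  # False
```

Python integers do not overflow, so long words are safe here; in a language with fixed-width integers the product can overflow for longer words, which would have to be handled separately.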

Bernhard Barker · Answer 1 · 2013-10-17T08:31:06.857

Sort the letters of each word so that every anagram ends up as the same string (its lexicographically smallest ordering), e.g. tea becomes aet.

Then simply put these in a (hash) map from sorted word to count (both tea and ate map to aet, so we'll have (aet, 2) in the map).

Then, when you get a word, reorder the letters as above and do a lookup for the count.
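
A minimal sketch of the steps above, in Python. The function names (canonical_key, build_index, count_anagrams) and the sample word list are made up for illustration.

```python
from collections import Counter


def canonical_key(word):
    """Return the letters of the word in sorted order, e.g. 'tea' -> 'aet'."""
    return "".join(sorted(word))


def build_index(words):
    """Preprocessing: map each sorted-letter key to how many words share it."""
    return Counter(canonical_key(w) for w in words)


def count_anagrams(index, word):
    """Query: look up the count for the word's key (0 if the key is absent)."""
    return index[canonical_key(word)]


index = build_index(["tea", "ate", "tan", "bat"])
print(count_anagrams(index, "eat"))  # 2
```

Note that if the query word itself appears in the list, this lookup counts it too; whether it should be excluded depends on how the question is interpreted.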

Running time:

Assuming n words in the list, with an average word length of m...

Expected O(nm log m) preprocessing, expected O(m log m) per query.

It's m log m on the assumption we just do a simple sort of the letters of a word.

The time taken per query is expected to be unaffected by the number of words in the list (i.e. hash maps give expected O(1) lookup time).

edited Oct 17 '13 at 08:31

answered Oct 17 '13 at 08:12

The length of the words should be taken into account in the complexity, shouldn't it? Given that there are only 26 possible letters, it can be done in O(k), where k is the length of the word, with counting sort, but still. – Traklon Oct 17 '13 at 08:20
I have tried to use the brute-force method, which is O(n^3). – user1001254 Oct 17 '13 at 14:35
Somehow I am not able to add any other response to this post. Why so? I want to put the resolution that I tried myself in this post. But the post seems to be on hold. Please un-hold it so that I can post my part of the solution. – user1001254 Oct 17 '13 at 14:37
I want to put the resolution that I tried myself in this post. But the post seems to be on hold. Please un-hold it so that I can post my part of the solution. – user1001254 Oct 17 '13 at 14:41
@user1001254 You can edit your question and add the attempted solution to there. – Bernhard Barker Oct 17 '13 at 14:41
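
Traklon's counting remark above could be sketched like this: rather than sorting, count the letters and use the counts as the map key, which takes O(k) time for a word of length k. The name count_key is made up for illustration, and the sketch assumes the words contain only the letters a to z.

```python
from collections import Counter


def count_key(word):
    """Return a 26-entry tuple of letter counts, usable as a hash-map key.

    Assumption: the word contains only the letters a-z (any case).
    """
    counts = [0] * 26
    for ch in word.lower():
        counts[ord(ch) - ord("a")] += 1
    return tuple(counts)


# Same lookup structure as before, only the key function differs.
index = Counter(count_key(w) for w in ["tea", "ate", "tan", "bat"])
print(index[count_key("eat")])  # 2
```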
