4

I've been developing an algorithm to calculate anagrams for a given (set of) word(s). I just got it to work, with ONE incredibly frustrating exception (no pun intended; there is no actual exception being thrown). Despite my attempts to use effective "pruning" to decrease the number of repetitions, my algorithm is adding duplicates to the master list, in this case a final static ArrayList<StringBuilder>. I can't seem to figure out why this is happening. Below is my code; I decided to post the entire method for convenience.

This is an assignment for school, so rather than a straight answer/solution, I'm looking for guidance on conceptual mistakes on my end.

EDIT: (code edited out to avoid possible plagiarism prior to the assignment's due date.)

Here is an example:

**input:**
pnxish
bauelqbs
coxiuqit
elbarcbs
ptos

**output:**
Now printing anagrams: 
Anagram #0: sphinx
Anagram #1: squabble
Anagram #2: squabble
Anagram #3: quixotic
Anagram #4: quixotic
Anagram #5: scrabble
Anagram #6: scrabble
Anagram #7: pots
Anagram #8: post
Anagram #9: tops
Anagram #10: opts
Anagram #11: spot
Anagram #12: stop

Thanks for the help! :)

insomniac
  • 2
    Can you give us an example with input, output and expected output? – Smit May 24 '13 at 22:49
  • it is customary to write i < n, instead of i <= n-1 – tucuxi May 24 '13 at 22:50
  • 1
    Why do you add StringBuilders to your anagramList instead of Strings? – tucuxi May 24 '13 at 22:52
  • Yes, I just edited in an example of i/o values. Thank you and sorry to exclude that important detail! – insomniac May 24 '13 at 22:52
  • I am using StringBuilders due to them being mutable; another reason is that the professor requires us to use them. – insomniac May 24 '13 at 22:53
  • This being an Algorithms course, the Professor explained that fewer operations are performed if we use StringBuilders, because new objects don't have to be created each time the string is modified. This increases runtime efficiency (or so we were told). – insomniac May 24 '13 at 23:00
  • @insomniac: I'm thinking you misunderstood him. No decent teacher, particularly not one as overly worried about efficiency as that, will recommend using them *everywhere*. For one because in a number of ways, they don't act like Strings...but also because where you don't *need* that mutability, you don't *want* it. Immutability provides a bunch of advantages of its own, like not needing three objects to represent the same value three times. – cHao May 24 '13 at 23:41

5 Answers

4

The obvious algorithm (just swapping letters) is a bit naive: it doesn't treat identical letters as instances of the same letter. For example, if you have a word like "eve", the two "e"s are distinct; if we bold the first "e" for purposes of illustration, you get combinations like "**e**ve" and "ev**e**" at various points in the process.
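To make the effect concrete, here is a minimal sketch of a swap-based permutation generator. It is not the asker's code (which was edited out); the class and method names are hypothetical. It treats every position as distinct, so a word with a repeated letter yields each arrangement more than once, which is consistent with "squabble", "quixotic" and "scrabble" each appearing twice in the output above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

// Hypothetical illustration: a naive generator that swaps positions
// without noticing that two positions may hold the same letter.
class NaivePermutations {
    static List<String> permute(String word) {
        List<String> out = new ArrayList<>();
        permute(word.toCharArray(), 0, out);
        return out;
    }

    private static void permute(char[] chars, int start, List<String> out) {
        if (start == chars.length) {
            out.add(new String(chars)); // one complete arrangement
            return;
        }
        for (int i = start; i < chars.length; i++) {
            swap(chars, start, i);
            permute(chars, start + 1, out);
            swap(chars, start, i); // undo the swap before trying the next letter
        }
    }

    private static void swap(char[] chars, int i, int j) {
        char tmp = chars[i];
        chars[i] = chars[j];
        chars[j] = tmp;
    }

    public static void main(String[] args) {
        List<String> all = permute("eve");
        System.out.println(all.size());                // 6 arrangements generated
        System.out.println(new HashSet<>(all).size()); // 3 distinct strings
    }
}
```

For "eve" the generator produces 3! = 6 arrangements, but only "eve", "eev" and "vee" are distinct, so each one shows up twice.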

You need to somehow eliminate the duplicates. The easiest way to do so is to stuff the combos into a Set of some type (like a HashSet). It can only contain one of each item, so the duplicates will effectively be discarded.

Oh, and use Strings, not StringBuilders. I just noticed you were doing that. StringBuilder doesn't override equals, so you're left with the version inherited from Object. End result: for two StringBuilders a and b, a.equals(b) only if a == b.
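A small sketch of that difference, with hypothetical class and method names chosen only for illustration: two StringBuilders holding the same characters are not equal, while the Strings they produce are.

```java
// Hypothetical illustration of equals() on StringBuilder versus String.
class EqualityDemo {
    // Two separate builders with identical content: equals() is the
    // identity comparison inherited from Object, so this is false.
    static boolean buildersEqual(String text) {
        StringBuilder a = new StringBuilder(text);
        StringBuilder b = new StringBuilder(text);
        return a.equals(b);
    }

    // Converting to String first compares the characters instead.
    static boolean stringsEqual(String text) {
        String a = new StringBuilder(text).toString();
        String b = new StringBuilder(text).toString();
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(buildersEqual("squabble")); // false
        System.out.println(stringsEqual("squabble"));  // true
    }
}
```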

cHao
3

One easy solution is to use a Set for storing your anagrams. This will take care of the duplicate values.

My guess is you're using a list, since your variable is named anagramList. You can find the JavaDoc for Set here: http://docs.oracle.com/javase/6/docs/api/java/util/Set.html

ktm5124
  • Hello and thank you for your response! As you can see in my code, prior to calling my ArrayList's add() method, I do a check to verify that the value has in fact NOT been added to the list. So I am still unsure as to how exactly duplicates are making it in to the list if I am performing a check. Thanks! – insomniac May 24 '13 at 22:57
2

I would look to use a Set to store the anagrams, but use a String rather than StringBuilder, i.e.

Set<String> anagrams = new HashSet<String>();

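The reason for not using StringBuilder is that its hashCode does not depend on its content: StringBuilder inherits the identity-based hashCode from Object, so the value stays the same even when the content changes, as in this example: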

StringBuilder sb = new StringBuilder();
System.out.println(sb.hashCode());
sb.append('c');
System.out.println(sb.hashCode());

This outputs the same hash code both times, which means the hash code of a StringBuilder is not a reliable indicator of its content.
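The practical consequence for a Set can be sketched as follows. The helper names are hypothetical and the word list is borrowed from the question's output; the point is only that a HashSet of StringBuilders keeps every builder, while a HashSet of Strings keeps one entry per distinct word.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration: the same words stored as builders and as strings.
class SetDedupDemo {
    // Each word is wrapped in its own StringBuilder; since equals() is
    // identity-based, no two builders are ever considered duplicates.
    static int countAsBuilders(String... words) {
        Set<StringBuilder> set = new HashSet<>();
        for (String w : words) {
            set.add(new StringBuilder(w));
        }
        return set.size();
    }

    // Strings compare by content, so repeated words collapse to one entry.
    static int countAsStrings(String... words) {
        Set<String> set = new HashSet<>();
        for (String w : words) {
            set.add(w);
        }
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(countAsBuilders("squabble", "squabble", "pots")); // 3
        System.out.println(countAsStrings("squabble", "squabble", "pots"));  // 2
    }
}
```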

ashatch
1

This is the problem you're running into in your code. If there is a StringBuilder object in your list for "squabble", the contains method will return false when you check if your list contains a different StringBuilder object for "squabble" after you build "squabble" again (which happens because of the letter b occurring twice).

The contains method checks whether that same object is in the list (it relies on equals, which StringBuilder inherits from Object), not whether there is an object representing the same string.
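If the list has to stay a list of StringBuilders (as the comments on the question suggest), one possible approach is to compare by content instead of calling contains. The sketch below is only an assumption about how such a check might look; `containsContent` and `addIfAbsent` are hypothetical helper names, not part of the asker's code or of the Java library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helpers for a content-based duplicate check.
class AnagramListHelper {
    // True if some builder in the list holds the same characters as candidate.
    static boolean containsContent(List<StringBuilder> list, CharSequence candidate) {
        String target = candidate.toString();
        for (StringBuilder sb : list) {
            if (target.contentEquals(sb)) {
                return true;
            }
        }
        return false;
    }

    // Adds a copy of candidate only if its content is not already present.
    static boolean addIfAbsent(List<StringBuilder> list, StringBuilder candidate) {
        if (containsContent(list, candidate)) {
            return false;
        }
        // Store a copy so later changes to candidate do not alter the list entry.
        list.add(new StringBuilder(candidate));
        return true;
    }

    public static void main(String[] args) {
        List<StringBuilder> anagrams = new ArrayList<>();
        System.out.println(addIfAbsent(anagrams, new StringBuilder("squabble"))); // true
        System.out.println(addIfAbsent(anagrams, new StringBuilder("squabble"))); // false
        System.out.println(anagrams.size());                                      // 1
    }
}
```

This scan is linear in the size of the list for every insertion, so a Set of Strings remains the cheaper option when the assignment allows it.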

0

You can't use the contains() method to check whether the string content itself is in there:

List<StringBuilder> list = new ArrayList<StringBuilder>();      
StringBuilder sb = new StringBuilder("hello");
list.add(sb);
StringBuilder sb2 = new StringBuilder("hello");
System.out.println(list.contains(sb2)); 
// prints false
ashatch