2

My current solution uses a multi-dimensional array. Does a simpler solution exist? I want to access the hashed objects in O(1) time and make the best use of memory, hence the need for a perfect hash.

public final class PerfectHash {

private Object[][][] hashtable = new Object[26][26][26];

public void storeObjectAgainst3letterStringKey(Object o, String s){

    int[] coord = stringToCoord(s);
    hashtable[coord[0]][coord[1]][coord[2]] = o;

}

public Object get(String s){
    int[] coord = stringToCoord(s);
    return hashtable[coord[0]][coord[1]][coord[2]];
}

private int[] stringToCoord(String s){
    if (!s.matches("[a-z][a-z][a-z]")){
        throw new IllegalStateException("invalid input, expecting 3 alphabet letters");
    }
    // each of the three letters maps to a coordinate in 0-25
    String lowercase = s.toLowerCase();

    // 97-122 integers for lower case ascii
    int[] coord = new int[3];
    for (int i=0;i<lowercase.length();++i){
        int ascii = (int)lowercase.charAt(i);
        int alpha = ascii - 97; // 0-25     
        coord[i] = alpha;
    }
    return coord;
}
}
newlogic

3 Answers

3

You don't even need to convert the String first. If your three characters are lower case, you can do this.

public static int hashFor(String s) {
    assert s.length() == 3 && isLower(s.charAt(0)) && isLower(s.charAt(1)) && isLower(s.charAt(2));

    return ((s.charAt(0) - 'a') * 26 + s.charAt(1) - 'a') * 26 + s.charAt(2) - 'a';
}

// Checks for ASCII a-z only, not every Unicode lowercase letter.
public static boolean isLower(char ch) {
    return ch >= 'a' && ch <= 'z';
}

A slightly more optimised version is:

public static int hashFor(String s) {
    return s.charAt(0) * (26 * 26) + s.charAt(1) * 26 + s.charAt(2) - ('a' * (26*26+26+1));
}

The subexpressions that involve only constants will be evaluated once by the compiler, not at run time.

BTW, using matches() is likely to be around 100x slower than everything else, since String.matches() compiles the regular expression on every call. ;)

You don't need to convert to lower case if you have already determined it has to be in lowercase.
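
As a sanity check, the two versions above can be compared over every possible key. The following is only a minimal sketch: the class name HashEquivalenceCheck and the method names hashSimple, hashFolded and versionsAgree are hypothetical, chosen here so that both versions can coexist in one file.

```java
final class HashEquivalenceCheck {
    // Straightforward version: subtract 'a' from each character, then combine in base 26.
    static int hashSimple(String s) {
        return ((s.charAt(0) - 'a') * 26 + s.charAt(1) - 'a') * 26 + s.charAt(2) - 'a';
    }

    // Folded version: the three 'a' offsets are merged into a single constant.
    static int hashFolded(String s) {
        return s.charAt(0) * (26 * 26) + s.charAt(1) * 26 + s.charAt(2) - ('a' * (26 * 26 + 26 + 1));
    }

    // Returns true if both versions agree on all 26^3 keys and no index is produced twice.
    static boolean versionsAgree() {
        boolean[] seen = new boolean[26 * 26 * 26];
        for (char a = 'a'; a <= 'z'; a++) {
            for (char b = 'a'; b <= 'z'; b++) {
                for (char c = 'a'; c <= 'z'; c++) {
                    String key = "" + a + b + c;
                    int h = hashSimple(key);
                    if (h != hashFolded(key) || h < 0 || h >= seen.length || seen[h]) {
                        return false;
                    }
                    seen[h] = true;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(versionsAgree());
    }
}
```

Both formulas expand to the same base-26 value, so the check should report that they agree. Because there are exactly as many keys as array slots and no index repeats, every slot is used.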

Peter Lawrey
2

You could just use a one-dimensional array instead of a three-dimensional one.

Then compute the index in the lookup function like this:

public Object get(String s){
    int[] coord = stringToCoord(s);
    int hashindex = (coord[0]*26 + coord[1])*26 + coord[2];
    return hashtable[hashindex];
}

Also, look into trie data structures, they are useful for efficient string look-up.
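
To illustrate that the base-26 index leaves no gaps, here is a minimal sketch of the single-array version together with the inverse mapping. The class name FlatPerfectHash and the methods indexOf, keyAt and put are hypothetical names used for illustration, and the sketch assumes keys have already been validated as three characters in a-z.

```java
final class FlatPerfectHash {
    private final Object[] hashtable = new Object[26 * 26 * 26];

    // Reads the key as a 3-digit base-26 number, giving an index in 0..17575.
    static int indexOf(String s) {
        return ((s.charAt(0) - 'a') * 26 + (s.charAt(1) - 'a')) * 26 + (s.charAt(2) - 'a');
    }

    // Inverse mapping: recover the key from an index by peeling off base-26 digits.
    static String keyAt(int index) {
        char c2 = (char) ('a' + index % 26);
        char c1 = (char) ('a' + (index / 26) % 26);
        char c0 = (char) ('a' + index / (26 * 26));
        return "" + c0 + c1 + c2;
    }

    void put(String s, Object o) { hashtable[indexOf(s)] = o; }

    Object get(String s) { return hashtable[indexOf(s)]; }

    public static void main(String[] args) {
        FlatPerfectHash table = new FlatPerfectHash();
        table.put("cat", 42);
        System.out.println(table.get("cat"));      // 42
        System.out.println(indexOf("aab"));        // 1
        System.out.println(keyAt(indexOf("zzz"))); // zzz
    }
}
```

Since every index from 0 to 17575 maps back to exactly one key, the mapping is one-to-one and the array has no unused slots.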

CaptainCodeman
  • I'm not sure how this works. How does (coord[0]*26 + coord[1])*26 + coord[2] map the coord values to a perfect hash value in a single array? It looks like there will be gaps between the hash indices. – newlogic Jun 26 '14 at 10:39
  • If you have a 3-digit number 857 it is equal to (8*10 + 5)*10 + 7. In the same manner, we are mapping 3-letter strings to numbers, but in base 26 instead of base 10. – CaptainCodeman Jun 26 '14 at 10:44
  • I see, that's pretty neat! – newlogic Jun 26 '14 at 11:08
1

The only thing that might be more efficient is mapping your strings directly to a single hash value and doing the lookup in a one-dimensional array:

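public final class PerfectHash {
  private Object[] hashtable = new Object[26*26*26];
  private int getHash(String s) {
      // Check the length first so that charAt cannot throw on short input.
      if(s.length() != 3)
        throw new IllegalStateException("invalid input, expecting 3 alphabet letters");
      // char is unsigned, so a character below 'a' wraps to a large value and fails the >= 26 test.
      char a = (char)(s.charAt(0) - 'a'), b = (char)(s.charAt(1) - 'a'), c = (char)(s.charAt(2) - 'a');
      if(a >= 26 || b >= 26 || c >= 26)
        throw new IllegalStateException("invalid input, expecting 3 alphabet letters");
      return (a*26+b)*26+c;
  }
  public Object get(String s) {return hashtable[getHash(s)];}
  public void set(String s, Object o) {hashtable[getHash(s)] = o;}
}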
Deduplicator