
I'm trying to develop a reduction function for use within a rainbow table generator.

The basic principle behind a reduction function is that it takes in a hash, performs some calculations, and returns a string of a certain length.

At the moment I'm using SHA1 hashes, and I need to return a string with a length of three. The string should be made up of any three characters from:

abcdefghijklmnopqrstuvwxyz0123456789

The major problem I'm facing is that every reduction function I write keeps returning strings that have already been generated, whereas a good reduction function should return duplicate strings only rarely.

Could anyone suggest a way of accomplishing this? Any suggestions at all on hash-to-string manipulation would be great.

Thanks in advance

Josh

Joshua Craven

2 Answers


So it sounds like you've got 20 digits of base 256 (the 20 bytes of a SHA1 hash) that you need to map into three digits of base 36. I would simply make a BigInteger from the hash bytes, reduce it modulo 36^3, and return the result as a string in base 36.

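// 36^3 = 46656, the number of distinct three-character base-36 strings
public static final BigInteger N36POW3 = new BigInteger("" + 36 * 36 * 36);
public static String threeDigitBase36(byte[] bs) {
  // mod() always returns a non-negative value, even if the bytes parse as a negative number
  return new BigInteger(bs).mod(N36POW3).toString(36);
}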
// ...
threeDigitBase36(sha1("foo")); // => "96b"
threeDigitBase36(sha1("bar")); // => "y4t"
threeDigitBase36(sha1("bas")); // => "p55"
threeDigitBase36(sha1("zip")); // => "ej8"

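Of course there will be collisions, as there always are when you map a space into a smaller one, but because the result depends on every byte of the hash, the outputs should be spread more evenly than with something sillier, such as just taking the first few characters of the hash.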
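As a rough, self-contained sketch of the same idea, the class below adds two things that are not part of the snippet above: a `sha1` helper built on `MessageDigest` (an assumption about what the `sha1(...)` call stands for) and left-padding with `0` so the result always has exactly three characters. The names `PaddedReduction` and `reduce` are made up for illustration.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class PaddedReduction {
    // 36^3 = 46656 possible three-character strings over [0-9a-z]
    static final BigInteger SPACE = BigInteger.valueOf(36L * 36 * 36);

    // Hypothetical stand-in for the sha1(...) helper used in the answer.
    static byte[] sha1(String input) {
        try {
            return MessageDigest.getInstance("SHA-1")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // Reduce the hash bytes modulo 36^3 and render as exactly three base-36 digits.
    static String reduce(byte[] hash) {
        String s = new BigInteger(hash).mod(SPACE).toString(36);
        StringBuilder sb = new StringBuilder(3);
        for (int i = s.length(); i < 3; i++) {
            sb.append('0'); // left-pad short results with zeros
        }
        return sb.append(s).toString();
    }

    public static void main(String[] args) {
        System.out.println(reduce(sha1("foo")));
    }
}
```

With the padding in place, small values such as 0 or 36 come out as "000" and "010" rather than as one- or two-character strings.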

maerics
  • My bad. I'll delete that answer. That's a property related to the existence of the inverse in a congruence relationship (something you don't need in this case). – Gabriel Belingueres Feb 20 '12 at 08:57
  • Thank you very much, they are both super answers, but I've accepted the one from Bohemian as it's slightly shorter. Thanks for all the comments though; once again the people on StackOverflow prove themselves to not be getting paid enough, whatever it is they are doing!!! – Joshua Craven Feb 20 '12 at 18:52
  • @JoshuaCraven: very welcome! Note also that both my answer and Bohemian's might return a string of length 1, 2 or 3, so you'll want to pad the returned string with zeros if the length is less than 3. – maerics Feb 20 '12 at 20:17

Applying the KISS principle:

  • An SHA is just a String
  • The JDK hashcode for String is "random enough"
  • Integer can render in any base

This single line of code does it:

public static String shortHash(String sha) {
    return Integer.toString(sha.hashCode() & 0x7FFFFFFF, 36).substring(0, 3);
}

Note: The & 0x7FFFFFFF is to zero the sign bit (hash codes can be negative numbers, which would otherwise render with a leading minus sign).
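To make the effect of the mask concrete, here is a tiny sketch; the class and method names (`SignBitDemo`, `raw`, `masked`) are invented for the demonstration, and -1 simply stands in for any negative hash code.

```java
class SignBitDemo {
    // Render a hash code in base 36 without touching the sign bit.
    static String raw(int hash) {
        return Integer.toString(hash, 36);
    }

    // Clear the sign bit first, so the value is always non-negative.
    static String masked(int hash) {
        return Integer.toString(hash & 0x7FFFFFFF, 36);
    }

    public static void main(String[] args) {
        System.out.println(raw(-1));    // prints "-1" (note the minus sign)
        System.out.println(masked(-1)); // prints "zik0zj" (Integer.MAX_VALUE in base 36)
    }
}
```

Non-negative hash codes pass through the mask unchanged, so only the negative half of the int range is affected.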

Edit - Guaranteeing hash length

My original solution was naive - it didn't deal with the case when the int hash is less than 100 (base 36) - meaning the rendered value would have fewer than 3 chars, and substring(0, 3) would then throw a StringIndexOutOfBoundsException. This code fixes that, while still keeping the value "random". It also avoids the substring() call, so performance should be better.

static int min = Integer.parseInt("100", 36);
// +1 so that the modulo result can reach "zzz" itself
static int range = Integer.parseInt("zzz", 36) - min + 1;

public static String shortHash(String sha) {
    return Integer.toString(min + (sha.hashCode() & 0x7FFFFFFF) % range, 36);
}

This code guarantees the final hash has 3 characters by forcing it to be between 100 and zzz - the lowest and highest 3-char hash in base 36, while still making it "random".
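One consequence of starting at 100 is that strings with a leading 0 (000 through 0zz) are never produced. If the whole 000 to zzz space is wanted, or a length other than three, a possible variation is to take the hash code modulo 36^length and left-pad with zeros, as suggested in the comments on the other answer. The sketch below is only an illustration under those assumptions; `FixedLengthHash` and the `length` parameter are hypothetical, and the length is capped at 5 because 36^6 no longer fits in a non-negative int.

```java
class FixedLengthHash {
    // Map a hash string to exactly `length` base-36 characters (1 to 5).
    static String shortHash(String sha, int length) {
        if (length < 1 || length > 5) {
            throw new IllegalArgumentException("length must be between 1 and 5");
        }
        int space = 1;
        for (int i = 0; i < length; i++) {
            space *= 36; // 36^length possible outputs
        }
        // Clear the sign bit, then fold the value into [0, 36^length)
        String s = Integer.toString((sha.hashCode() & 0x7FFFFFFF) % space, 36);
        StringBuilder sb = new StringBuilder(length);
        for (int i = s.length(); i < length; i++) {
            sb.append('0'); // left-pad so the length is always exact
        }
        return sb.append(s).toString();
    }

    public static void main(String[] args) {
        System.out.println(shortHash("a", 3)); // prints "02p" ("a".hashCode() is 97)
        System.out.println(shortHash("a", 4)); // prints "002p"
    }
}
```

The trade-off is the same as before: the output is only as well distributed as String.hashCode(), which is convenient rather than cryptographically strong.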

Bohemian
  • Sorry to resurrect an otherwise answered question. However, if I change the substring in your answer from (0, 3) to (0, 4), the number of strings generated with fewer than 4 characters becomes quite an issue. Also, the function seems to run extremely slowly. Could you offer an explanation as to why that is? – Joshua Craven Feb 22 '12 at 15:08
  • @JoshuaCraven OK - I've added more code that addresses your concerns. You raised good points too btw. – Bohemian Feb 23 '12 at 01:48
  • Top stuff. Thanks very much for your help!! – Tony Feb 25 '12 at 14:31