So I know generating a password in the following way is a bad idea. I'd say it has only a few (like maybe 5 or so) bits of entropy, but I'm unable to calculate it properly.

Can someone show me how to calculate the average number of tries needed to guess a password of length n generated in the following way using Oracle's JDK 7?

I assume the relevant factors are:

  • alphabet size (62 - 5 for restricting confusing-looking characters),
  • two-step process to select the character class and then the character,
  • rounding to integer,
  • try-until-succeed way of sampling the characters,
  • intrinsic properties of Math.random().

But I can't get the exact numbers.

char[] generate(int n) {
    char[] pw = new char[n];
    for (int i = 0; i < n; i++) {
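        int c = 0; // initialized so the code compiles; randomCharacter always overwrites it
        while (true) {
            c = randomCharacter(c);
            // reject confusing-looking characters and sample again
            if (c == '0' || c == 'O' || c == 'I' || c == '1' || c == 'l')
                continue;
            else
                break;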
        }
        pw[i] = (char) c;
    }
    return pw;
}

int randomCharacter(int c) { 
    switch ((int) (Math.random() * 3)) {
    case 0:
        c = '0' + (int) (Math.random() * 10);
        break;
    case 1:
        c = 'a' + (int) (Math.random() * 26);
        break;
    case 2:
        c = 'A' + (int) (Math.random() * 26);
        break;
    }
    return c;
}
Jakub Bochenski

1 Answer

Assuming that Math.random() is unpredictable, randomCharacter returns a specific digit with probability 1/3 * 1/10 = 1/30 and a specific letter with probability 1/3 * 1/26 = 1/78, because the lowercase and uppercase branches are each chosen with probability 1/3 and each contains 26 letters.

For entries in the pw array, five characters are rejected, which leaves 8 digits and 49 letters, so the probability of each remaining character goes up. You need to rescale the probabilities so that they sum to 1 again, i.e., divide by the sum of the remaining probabilities, 8 * 1/30 + 49 * 1/78 ≈ 0.895. The result is that a digit has the probability 1/30 / (8 * 1/30 + 49 * 1/78) ≈ 0.0372, and a letter, 1/78 / (8 * 1/30 + 49 * 1/78) ≈ 0.0143.

Plugging these values into the formula for the Shannon entropy, H = -sum(p_i * log2(p_i)), gives about 5.7 bits of entropy per character.
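As a rough check of that figure, here is a minimal sketch of the calculation. It assumes an ideal Math.random() and reads the probabilities straight off the question's code: each of the three branches is taken with probability 1/3, the letter branches hold 26 letters each, and 8 digits and 49 letters survive the rejection step. The class and method names are made up for illustration.

```java
final class EntropyPerCharacter {
    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    /** Shannon entropy per character of the two-step generator with rejection. */
    static double twoStepEntropyBits() {
        double rawDigit = 1.0 / 3 / 10;  // one specific digit from randomCharacter
        double rawLetter = 1.0 / 3 / 26; // one specific letter, either case
        int digits = 8;                  // '0' and '1' are rejected
        int letters = 49;                // 'O', 'I' and 'l' are rejected

        // Probability that a single draw is accepted; used to rescale.
        double accepted = digits * rawDigit + letters * rawLetter;
        double pDigit = rawDigit / accepted;
        double pLetter = rawLetter / accepted;

        return -(digits * pDigit * log2(pDigit)
                + letters * pLetter * log2(pLetter));
    }

    /** Entropy per character when every symbol is equally likely. */
    static double uniformEntropyBits(int alphabetSize) {
        return log2(alphabetSize);
    }

    public static void main(String[] args) {
        System.out.printf("two-step with rejection: %.3f bits%n", twoStepEntropyBits());
        System.out.printf("uniform over 57:         %.3f bits%n", uniformEntropyBits(57));
    }
}
```

Under the same assumption of independent characters, a password of length n would carry n times the per-character value.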

If you used a single array of the 57 valid characters and a single random number to index it, you would get an entropy of log2(57), about 5.8 bits per character.
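A minimal sketch of that single-array approach might look like the following. Note one substitution that goes beyond the suggestion above: it draws the index from java.security.SecureRandom instead of Math.random(), on the assumption that the password should not be predictable. The class name and the ALPHABET constant are hypothetical.

```java
import java.security.SecureRandom;

final class UniformPasswordGenerator {
    // The 57 characters left after dropping 0, 1, O, I and l from [0-9a-zA-Z].
    static final String ALPHABET =
            "23456789"
            + "abcdefghijkmnopqrstuvwxyz"
            + "ABCDEFGHJKLMNPQRSTUVWXYZ";

    // Assumption: a cryptographic source is wanted; this replaces Math.random().
    private static final SecureRandom RANDOM = new SecureRandom();

    static char[] generate(int n) {
        char[] pw = new char[n];
        for (int i = 0; i < n; i++) {
            // One draw per character, each symbol equally likely.
            pw[i] = ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length()));
        }
        return pw;
    }

    public static void main(String[] args) {
        System.out.println(new String(generate(12)));
    }
}
```

Because every character comes from one uniform draw, no rejection loop is needed and there is no skew between digits and letters.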

CL.
  • Does it still hold if Math.random() is only a PRNG, and not ideally unpredictable? I thought that, for a PRNG, invoking it many times to get a single character would generally reduce the entropy – Jakub Bochenski Mar 07 '14 at 15:00
  • If the PRNG is predictable, an attacker will be able to predict your password, so in this case the entropy is zero. – CL. Mar 07 '14 at 16:29
  • I mean that with a PRNG, the output is not fully independent of previous values. In effect, e.g., the distribution of the second letter depends on the first letter (that's why you would usually want to use a crypto PRNG). This obviously reduces entropy, but how to quantify it? – Jakub Bochenski Mar 07 '14 at 19:29
  • That depends on 1) how you quantify the entropy of the PRNG, and 2) how much of this affects the letter choice (because you take a random number between 0 and 1, but end up using only the information whether it is, e.g., in the intervals 0..1/3, 1/3..2/3, 2/3..1). – CL. Mar 07 '14 at 20:57
  • Yes, but that's exactly the part that I have trouble with, any pointers? – Jakub Bochenski Mar 08 '14 at 00:48