14

To prevent timing attacks, a constant-time equals is sometimes needed. There's MessageDigest.isEqual (not documented to be a constant-time method), Guava's HashCode.equals, and others. All of them do something like

boolean areEqual = true;
for (int i = 0; i < this.bytes.length; i++) {
    areEqual &= (this.bytes[i] == that.getBytesInternal()[i]);
}
return areEqual;

or

    int result = 0;
    for (int i = 0; i < digesta.length; i++) {
        result |= digesta[i] ^ digestb[i];
    }
    return result == 0;

but who says that the JIT can't introduce a short circuit when optimizing?

It's not that hard for it to find out, e.g., that areEqual can never become true again, and to break out of the loop.
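To make the concern concrete, here is a hypothetical sketch of the source-level equivalent of such an optimisation. The class and method names are made up for illustration, and nothing here claims that any JVM actually performs this transformation; it only shows that both methods return the same value for every input, which is all the language semantics constrain.

```java
// Hypothetical illustration; class and method names are assumptions.
final class ShortCircuitSketch {

    // What the programmer wrote: every byte is visited.
    static boolean written(byte[] a, byte[] b) {
        if (a.length != b.length) return false;
        boolean areEqual = true;
        for (int i = 0; i < a.length; i++) {
            areEqual &= (a[i] == b[i]);
        }
        return areEqual;
    }

    // What an optimiser would be allowed to produce: once areEqual is
    // false it can never become true again, so the loop may stop early.
    static boolean afterHypotheticalOptimisation(byte[] a, byte[] b) {
        if (a.length != b.length) return false;
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) return false; // data-dependent exit
        }
        return true;
    }
}
```

The second method leaks the position of the first mismatch through its running time, yet no test on return values can tell the two apart.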


I gave it a try on CR by computing a value depending on all input bits and feeding it to a home-made Blackhole.

maaartinus

5 Answers

11

You cannot know the future

You fundamentally cannot predict what future optimisers might or might not do in any language.

Looking to the future, the best odds are for the OS itself to provide constant-time comparison functions; that way they can be properly tested and used in all environments.

This has been ongoing for quite some time already. For example, the timingsafe_bcmp() function in libc first appeared in OpenBSD 4.9 (released in May 2011).

Obviously programming environments need to pick these up and/or provide their own functions that they guarantee will not be optimised away.

Inspect assembly code

There is some discussion of optimisers here. It's geared towards C (and C++), but the point is language-independent: you can only look at what current optimisers do, not at what future optimisers might do. In any case, they rightly recommend checking the assembly code to learn what your optimiser does.

For Java that's not necessarily as "easy" as for C or C++, given its nature, but for specific security functions it should not be impossible to make that effort for current environments.

Avoidance might be possible

You could try to avoid the timing attack.

E.g.:

Although adding a random delay might intuitively seem the thing to do, it won't work: the attacker is already using statistical analysis in timing attacks, so you merely add some more noise.

https://security.stackexchange.com/questions/96489/can-i-prevent-timing-attacks-with-random-delays

Still, that does not mean you cannot build a constant-time implementation if your application can afford to be slow enough, i.e. to wait long enough. For example, you could wait for a timer to go off and only then continue processing the result of the comparison, which avoids the timing attack anyway.
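A minimal sketch of that timer idea, under stated assumptions: the class name, method name and the budgetNanos parameter are hypothetical, and the budget is assumed to be comfortably larger than the worst-case comparison time. If the comparison ever takes longer than the budget, the timing leaks again, and scheduler jitter is not addressed here.

```java
import java.util.Arrays;
import java.util.concurrent.locks.LockSupport;

// Hypothetical helper; names and the budget parameter are assumptions.
final class FixedDeadlineCompare {

    static boolean equalsWithDeadline(byte[] expected, byte[] actual, long budgetNanos) {
        long deadline = System.nanoTime() + budgetNanos;

        // An ordinary comparison that may well short-circuit.
        boolean result = Arrays.equals(expected, actual);

        // Do not release the result before the deadline has passed.
        // parkNanos may return early, so re-check in a loop.
        long remaining;
        while ((remaining = deadline - System.nanoTime()) > 0) {
            LockSupport.parkNanos(remaining);
        }
        return result;
    }
}
```

The caller sees the same latency for equal and unequal inputs, at the price of always paying the full budget.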

Detection

It should be possible to write detection of timing attack vulnerability into applications using an implementation of timing constant comparisons.

Either:

  • some test that is run during initialisation
  • the same test regularly as a part of normal operations.

Again, the optimiser will be tricky to deal with, as it can (and sometimes will) even change the order of execution. One approach: take inputs that do not appear in the program's code (e.g. from an external file) and time four runs: a normal compare with identical strings, a normal compare with completely different strings (e.g. one XORed), and then the same two pairs of inputs with the constant-time compare. The two normal-compare timings should differ; the two constant-time timings should be slower and equal to each other. If that check fails, warn the user or maintainer of the application that the constant-time code is likely broken in production usage.

  • A theoretical option is to collect actual timings yourself (recording success/failure as well) and analyse them statistically. That would be tricky in practice: the measurements would need to be extremely accurate, and because you cannot loop a few million times but are measuring a single comparison, you won't have the resolution to measure it accurately enough.
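The start-up self-test described above could be sketched roughly as follows. Everything here is an assumption made for illustration: the class and method names, the crude "total nanoseconds over a number of rounds" measurement, and the idea of expressing the result as a relative gap. A real check would need a proper benchmark harness and a carefully chosen threshold.

```java
import java.util.function.BiPredicate;

// Hypothetical self-test; names and measurement strategy are assumptions.
final class TimingSelfTest {

    // Volatile sink so the comparison results stay observable.
    static volatile boolean sink;

    // Total nanoseconds spent on `rounds` calls of the comparison.
    static long time(BiPredicate<byte[], byte[]> cmp, byte[] a, byte[] b, int rounds) {
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            sink = cmp.test(a, b);
        }
        return System.nanoTime() - start;
    }

    // Relative gap (0.0 to 1.0) between timing identical and fully different inputs.
    static double relativeGap(BiPredicate<byte[], byte[]> cmp, byte[] secret, int rounds) {
        byte[] same = secret.clone();
        byte[] different = secret.clone();
        for (int i = 0; i < different.length; i++) {
            different[i] ^= (byte) 0xFF; // flip every bit, so every byte differs
        }
        time(cmp, secret, same, rounds);      // warm-up passes so the JIT
        time(cmp, secret, different, rounds); // has a chance to compile first
        long tSame = time(cmp, secret, same, rounds);
        long tDiff = time(cmp, secret, different, rounds);
        return Math.abs(tSame - tDiff) / (double) Math.max(1L, Math.max(tSame, tDiff));
    }
}
```

One would call `relativeGap(Arrays::equals, secret, rounds)` and `relativeGap(MessageDigest::isEqual, secret, rounds)` with a secret read from an external file, expecting a clearly larger gap for the first than for the second; what counts as "clearly larger" is left open here.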
5

JIT is not only allowed to do such optimizations, but it actually does so sometimes.

Here is a sample bug I've found in JMH, where a short circuit optimization resulted in unstable benchmark scores. JIT has optimized the evaluation of (bool == bool1 & bool == bool2), despite that & was used rather than &&, even when bool1 and bool2 were declared volatile.

JIT gives no guarantees on what it does optimize and what it does not. Even if you verify that it works as you want, future JVM versions may break these assumptions. Ideally there should be intrinsified methods in core JDK libraries for such important security primitives.

You may try to avoid undesired optimizations by certain techniques, e.g.

  • involve volatile fields;
  • apply incremental accumulation;
  • produce side effects, for example, write to shared memory etc.

But they are also not 100% bullet-proof, so you have to verify the generated assembly code and review it after each major Java update.
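A minimal sketch combining the three techniques listed above (a volatile field, incremental accumulation, and a side effect). The class, method and field names are hypothetical, and, as stated, this gives no guarantee: the JIT must store the final accumulator value, but it is not obliged to compute it in data-independent time.

```java
// Hypothetical sketch; names are assumptions, and this is not a guarantee.
final class GuardedCompare {

    // Volatile sink: the write below is a side effect the JIT must perform.
    static volatile int sink;

    static boolean equalsGuarded(byte[] a, byte[] b) {
        if (a.length != b.length) return false; // the length is assumed to be public
        int acc = 0;
        for (int i = 0; i < a.length; i++) {
            acc |= a[i] ^ b[i]; // incremental accumulation, no branch on the data
        }
        sink = acc;             // publish the exact accumulator value
        return acc == 0;
    }
}
```

Because the exact value of `acc` is published, stopping at the first mismatch would no longer be a valid transformation; the generated assembly still needs to be checked.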

apangin
1

Indeed you cannot predict what an optimizer will do. Nevertheless, in this case you could reasonably do the following:

  1. Compute the exclusive-OR of the values being compared. The time taken depends on the length of the values only.
  2. Compute a hash of the resulting bytes. Use a hash function which returns a single integer.
  3. Exclusive-OR this hash with the precomputed hash of an equal-length sequence of zeroes.

I think it is a pretty safe bet that hash functions are not something that is going to be optimized away. And an exclusive-OR between integers is the same speed irrespective of the result.
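The three steps might look like this in Java. This is only a sketch: the class and method names are hypothetical, and `Arrays.hashCode` is used purely as a stand-in for "a hash function which returns a single integer", since the answer does not name one.

```java
import java.util.Arrays;

// Hypothetical sketch of the three steps; the hash choice is an assumption.
final class XorHashCompare {

    static boolean equalsViaHash(byte[] a, byte[] b) {
        if (a.length != b.length) return false;

        // Step 1: XOR the inputs; the time depends on the length only.
        byte[] diff = new byte[a.length];
        for (int i = 0; i < a.length; i++) {
            diff[i] = (byte) (a[i] ^ b[i]);
        }

        // Step 2: hash the XOR result down to a single int.
        int h = Arrays.hashCode(diff);

        // Step 3: XOR with the hash of an all-zero array of equal length
        // (this could be precomputed per length).
        int zeroHash = Arrays.hashCode(new byte[a.length]);
        return (h ^ zeroHash) == 0;
    }
}
```

With this particular stand-in hash, collisions are trivial: the inputs `{1, -31}` and `{0, 0}` compare as equal, because both XOR results hash to 961. Any hash that returns a single integer has the same weakness in principle, only less blatantly.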

nugae
  • Returning an `int` would lead to collisions with a probability of `2**-32`, i.e. `2.3e-10`, which is simply too much for anything security-sensitive. You'd reduce guessing the secret (chance like `2**-256`, i.e. impossible) to guessing a collision. +++ Using `long` would be better (and also constant time on a 64-bit JVM), but still not good enough. – maaartinus Jun 14 '16 at 00:55
1

You can and should use java.security.MessageDigest.isEqual(byte[], byte[]); it is (at least in later Java versions) documented for this purpose:

* @implNote
* All bytes in {@code digesta} are examined to determine equality.
* The calculation time depends only on the length of {@code digesta}.
* It does not depend on the length of {@code digestb} or the contents
* of {@code digesta} and {@code digestb}.
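A typical use is checking a received MAC tag. The sketch below is illustrative: the class and method names and the choice of HmacSHA256 are assumptions, not part of the quoted documentation. Since the note says the time depends on the length of `digesta`, the locally computed value is passed as the first argument.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical usage example; names and the HMAC algorithm are assumptions.
final class MacCheck {

    static byte[] hmac(byte[] key, String message) throws GeneralSecurityException {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        return mac.doFinal(message.getBytes(StandardCharsets.UTF_8));
    }

    static boolean verify(byte[] key, String message, byte[] receivedTag)
            throws GeneralSecurityException {
        byte[] expected = hmac(key, message);
        // The locally computed tag goes first (digesta), the untrusted one second.
        return MessageDigest.isEqual(expected, receivedTag);
    }
}
```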
eckes
0

One could prevent the optimization as follows:

int res = 0;
for (int i = 0; i < this.bytes.length; i++) {
    res |= this.bytes[i] ^ that.getBytesInternal()[i];
}
Logger.getLogger(...).log(FINEST, "{0}", res);
return res == 0;

As for the original code: one should perhaps disassemble it with javap to check that no optimization was made. For another Java compiler (like Java 9) one would need to repeat that.

JIT kicks in late, but then you are right, optimizing could happen: it would need an extra test in the loop (which in itself slows every cycle down).

So you are right. One can only hope that the effect is negligible in the overall measurement. Some other safeguard helps too, if only a random delay on failure (inequality), which is always a nice stumbling block.

Joop Eggen
  • How would a log statement outside the loop prevent the short circuit? – Sleiman Jneidi Jun 04 '16 at 19:32
  • @SleimanJneidi As the result `res` is logged, it must be calculated to the end, so the loop may not be short-circuited. – Joop Eggen Jun 04 '16 at 21:02
  • ... unless you get `res == -1` at some point and then you can exit the loop. But this would be a different test and I can't imagine a compiler doing this. – maaartinus Jun 05 '16 at 13:19
  • I don't think you need the logging. Either the JIT can short-circuit the loop, in which it can just use the short-circuited value in the log line as well as in the return; or it can't, in which case the return is good enough. Either way, this boils down to trying to out-smart the JIT by presenting it with logic that we know is equivalent to an &, but which we think the JIT won't recognize. (Which isn't a bad approach!) – yshavit Jun 06 '16 at 05:00
  • @yshavit a short-circuit would be a faulty optimisation, a wrong value passed to the logger. I doubt that such a thing would be done. But this is becoming thin ice. – Joop Eggen Jun 06 '16 at 17:28
  • If the result is unequal, `res` will converge in just over log2 of the number of bits (about 4). So there is some kind of potential optimisation available there maybe. / For 32-bit ARM I believe a sufficiently advanced optimiser could invert `res`, use `BICS` (bitwise and of inverted rhs with setting flags) and make use of conditional instructions to do an optimisation without increasing the inner loop instruction count. – Tom Hawtin - tackline Jun 09 '16 at 08:36
  • @TomHawtin-tackline a very good counter argument, `res` becoming 0b111...111. And of course the good ol' assembly. – Joop Eggen Jun 09 '16 at 09:10