
While going over a presentation, I came across the following claim: when the JVM loads a class, it can analyze the bytecode and make sure the operand stack can never overflow or underflow. I've found many sources that make the same claim, but none that explain how it's done.

It is unclear to me how such verification can be done using static analysis. Say I have a (malicious) method that takes some value as an argument and uses it as the iteration count for a loop of pops. At load time the number of iterations is not known, since it depends on the argument supplied by the method's caller. It therefore seems to me that an underflow could only be detected at runtime. What am I missing here?

Eran
  • The validator may reject any attempt to pop in a loop. – Marko Topolnik May 10 '12 at 20:29
  • @MarkoTopolnik, I used `pop` as the clearest example. Other popping commands might be used as well, such as the various `store`s. – Eran May 10 '12 at 20:32
  • OK, so do you see any legitimate case where code would have an excess of pops (by whatever instruction) relative to pushes in a loop step? – Marko Topolnik May 10 '12 at 20:35
  • @MarkoTopolnik, definitely not, and I assume this can't be produced by compiling Java. But the verifier is there to protect against buggy, malicious or corrupted classes. As I understand it, the designers do assume offending classes might be loaded, hence they've added the verification. – Eran May 10 '12 at 20:56
  • My point was to indicate a pattern that is both feasible for the validator to detect and covers all cases that you mentioned in the question, while not resulting in any false positives. – Marko Topolnik May 10 '12 at 21:04

2 Answers


You can find a basic description of the bytecode verifier in the Java Virtual Machine specification.

Put simply, the stack depth is known at every branch point, and two execution paths that merge at the same point must arrive with the same stack depth. A loop body with a net pop would reach the loop head with a different depth on the second pass than on the first, so the verifier won't let you perform a series of pops without the corresponding pushes.
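
To make the merge rule concrete, here is a minimal sketch of that kind of check in Java. It is not the JVM's actual verifier: `StackDepthSketch`, `Insn` and the toy instruction model are hypothetical names invented for illustration, and only stack depth is tracked (the real verifier also tracks the type of every slot). Each instruction gets exactly one recorded depth, and reaching it again with a different depth is an error.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

final class StackDepthSketch {
    /** Hypothetical toy instruction: only its stack effect and control flow are modelled. */
    static final class Insn {
        final int pops, pushes;      // slots consumed / produced
        final int target;            // branch target index, or -1 if none
        final boolean fallsThrough;  // can execution continue at pc + 1?

        Insn(int pops, int pushes, int target, boolean fallsThrough) {
            this.pops = pops;
            this.pushes = pushes;
            this.target = target;
            this.fallsThrough = fallsThrough;
        }
    }

    /** Returns null if every reachable instruction has one consistent depth, else a message. */
    static String verify(Insn[] code, int maxStack) {
        int[] depthAt = new int[code.length];
        Arrays.fill(depthAt, -1);                    // -1 means "not reached yet"
        Deque<Integer> worklist = new ArrayDeque<>();
        depthAt[0] = 0;                              // the stack is empty on method entry
        worklist.push(0);
        while (!worklist.isEmpty()) {
            int pc = worklist.pop();
            Insn insn = code[pc];
            if (depthAt[pc] < insn.pops) return "underflow at " + pc;
            int after = depthAt[pc] - insn.pops + insn.pushes;
            if (after > maxStack) return "overflow at " + pc;
            int[] successors = { insn.fallsThrough ? pc + 1 : -1, insn.target };
            for (int next : successors) {
                if (next < 0) continue;
                if (next >= code.length) return "falls off the end after " + pc;
                if (depthAt[next] == -1) {           // first visit: record depth, explore later
                    depthAt[next] = after;
                    worklist.push(next);
                } else if (depthAt[next] != after) { // merge point reached with another depth
                    return "depth mismatch at " + next;
                }
            }
        }
        return null;
    }

    /** The question's scenario: push twice, then a loop whose body has a net pop. */
    static Insn[] popLoop() {
        return new Insn[] {
            new Insn(0, 1, -1, true),   // 0: push
            new Insn(0, 1, -1, true),   // 1: push
            new Insn(1, 0, -1, true),   // 2: pop                  <- loop head
            new Insn(0, 1, -1, true),   // 3: load the loop counter
            new Insn(1, 0, 2, true),    // 4: if non-zero, goto 2
            new Insn(0, 0, -1, false),  // 5: return
        };
    }

    /** A loop whose body pushes exactly as much as it pops. */
    static Insn[] balancedLoop() {
        return new Insn[] {
            new Insn(0, 1, -1, true),   // 0: push                 <- loop head
            new Insn(1, 0, -1, true),   // 1: pop
            new Insn(0, 1, -1, true),   // 2: load the loop counter
            new Insn(1, 0, 0, true),    // 3: if non-zero, goto 0
            new Insn(0, 0, -1, false),  // 4: return
        };
    }

    public static void main(String[] args) {
        System.out.println("pop loop:      " + verify(popLoop(), 4));
        System.out.println("balanced loop: " + verify(balancedLoop(), 4));
    }
}
```

Note that the check never needs the loop counter's value. In `popLoop()` the back edge arrives at instruction 2 with depth 1 while the first pass recorded depth 2, so the method is rejected no matter how many iterations would actually run.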

Eugene Kuleshov
  • Thanks for the link (and the simple explanation). I wonder what's the effect of this limitation on the efficiency of the bytecode at runtime (e.g. can't have a loop that pushes stuff and then a loop that uses that stuff while popping it). But that's way beyond the scope of this question. – Eran May 11 '12 at 07:37
  • The only issue I know of is tail-recursion optimization. But that is solvable at the method level, so the same verification rules apply. If I am not mistaken, work is already being done in the JVM to support it. It does not affect the bytecode, only the JIT compiler that translates it to native code. – Eugene Kuleshov May 11 '12 at 13:52
  • How do they solve it when the execution jumps to a catch block when throwing an exception? When an exception is caught the operand stack will be cleared and the Instance of the Exception will be pushed to it. – neoexpert May 11 '20 at 19:39

The code of the method is split into execution blocks. A "block" is a sequence of instructions that is executed from start to finish, with no jumps out of it or into its middle. The blocks form a directed graph of all possible execution paths.

A block always expects a certain stack size at its beginning and has a fixed stack size at its end (beginning + all the pushes - all the pops). The verifier checks that for every block 'a' that can be reached from a given block 'b', the end stack size of b matches the beginning stack size of a.
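
The edge check described above can be sketched in a few lines of Java. This is only an illustration under assumptions: `BlockGraphSketch` and `Block` are hypothetical names, each block's expected beginning stack size is taken as already known, and only the per-block totals are modelled.

```java
final class BlockGraphSketch {
    /** Hypothetical basic block, reduced to the numbers the stack-size check needs. */
    static final class Block {
        final int entryDepth;     // stack size the block expects at its beginning
        final int pushes, pops;   // totals over all instructions in the block
        final int[] successors;   // indices of the blocks that can execute next

        Block(int entryDepth, int pushes, int pops, int... successors) {
            this.entryDepth = entryDepth;
            this.pushes = pushes;
            this.pops = pops;
            this.successors = successors;
        }

        /** Stack size at the end: beginning + all the pushes - all the pops. */
        int exitDepth() {
            return entryDepth + pushes - pops;
        }
    }

    /** True if the method starts with an empty stack and every edge b -> a agrees. */
    static boolean edgesAgree(Block[] blocks) {
        if (blocks[0].entryDepth != 0) return false;
        for (Block b : blocks) {
            for (int a : b.successors) {
                if (b.exitDepth() != blocks[a].entryDepth) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Block 1 is a loop body with a net pop: it ends one slot lower than it begins.
        Block[] popLoop = {
            new Block(0, 2, 0, 1),      // 0: two pushes, then enter the loop
            new Block(2, 1, 2, 1, 2),   // 1: pop, load counter, branch back or exit
            new Block(1, 0, 0),         // 2: return
        };
        // Block 0 is a loop body that ends at the same size it begins with.
        Block[] balancedLoop = {
            new Block(0, 2, 2, 0, 1),   // 0: push, pop, load counter, branch back or exit
            new Block(0, 0, 0),         // 1: return
        };
        System.out.println("pop loop consistent:      " + edgesAgree(popLoop));
        System.out.println("balanced loop consistent: " + edgesAgree(balancedLoop));
    }
}
```

The totals alone would not catch a block that dips below zero in the middle and recovers by its end, so a fuller check would also have to walk the instructions inside each block.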

Cephalopod