I want to record the event that a Throwable is thrown out of a method. I wrote the following simple code and deliberately did not use COMPUTE_FRAMES and COMPUTE_MAXS, to familiarize myself with the concepts of stack map frames, the operand stack, and local variables. My instrumentation inserts only three stack map frames: after the tryEnd label, after the catchStart label, and after the catchEnd label (the code is in my class MyMethodVisitor, shown below).

When I tried my javaagent in the testing process of joda-time, it crashed with the following message:

[ERROR] There was an error in the forked process
[ERROR] Stack map does not match the one at exception handler 9
[ERROR] Exception Details:
[ERROR]   Location:
[ERROR]     org/joda/time/TestAllPackages.<init>(Ljava/lang/String;)V @9: ldc
[ERROR]   Reason:
[ERROR]     Type 'org/joda/time/TestAllPackages' (current frame, locals[0]) is not assignable to uninitializedThis (stack map, locals[0])
[ERROR]   Current Frame:
[ERROR]     bci: @2
[ERROR]     flags: { flagThisUninit }
[ERROR]     locals: { 'org/joda/time/TestAllPackages', 'java/lang/String' }
[ERROR]     stack: { 'java/lang/Throwable' }
[ERROR]   Stackmap Frame:
[ERROR]     bci: @9
[ERROR]     flags: { flagThisUninit }
[ERROR]     locals: { uninitializedThis, 'java/lang/String' }
[ERROR]     stack: { 'java/lang/Throwable' }
[ERROR]   Bytecode:
[ERROR]     0x0000000: 2a2b b700 01b1 a700 0912 57b8 005c bfb1
[ERROR]     0x0000010:                                        
[ERROR]   Exception Handler Table:
[ERROR]     bci [0, 6] => handler: 9
[ERROR]   Stackmap Table:
[ERROR]     same_frame(@6)
[ERROR]     same_locals_1_stack_item_frame(@9,Object[#85])
[ERROR]     same_frame(@15)

Obviously, the problem must lie in how I insert the stack map frames. But I am confused:

  1. What exactly do Current Frame and Stackmap Frame mean, and what is the difference between them?
  2. Why is there an uninitializedThis in the stack map frame at @9? To my understanding, an object is always uninitializedThis until the constructor call has finished. Am I right?
  3. I think my instrumentation is correct because org/joda/time/TestAllPackages is the type of this. How can I avoid the inconsistency between org/joda/time/TestAllPackages and uninitializedThis?

The instrumented bytecode looks like this:

public org.joda.time.TestAllPackages(java.lang.String);
  descriptor: (Ljava/lang/String;)V
  flags: ACC_PUBLIC
  Code:
    stack=7, locals=2, args_size=2
       0: aload_0
       1: aload_1
       2: invokespecial #1                  // Method junit/framework/TestCase."<init>":(Ljava/lang/String;)V
       5: return
       6: goto          15
       9: ldc           #87                 // String org/joda/time/TestAllPackages#<init>#(Ljava/lang/String;)V
      11: invokestatic  #92                 // Method MyRecorder.exception_caught:(Ljava/lang/String;)V
      14: athrow
      15: return
    Exception table:
       from    to  target type
           0     6     9   Class java/lang/Throwable
    StackMapTable: number_of_entries = 3
      frame_type = 6 /* same */
      frame_type = 66 /* same_locals_1_stack_item */
        stack = [ class java/lang/Throwable ]
      frame_type = 5 /* same */
    LineNumberTable:
      line 31: 0
      line 32: 5

My simplified instrumentation code looks like this:

public class PreMain {
    public static void premain(String args, Instrumentation inst){
        inst.addTransformer(new MyTransformer());
    }
}
public class MyTransformer implements ClassFileTransformer {
    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain, byte[] classfileBuffer) throws IllegalClassFormatException {
        byte[] result = classfileBuffer;

        try{
            if (className == null || shouldExcludeClass(className)) return result;
            ClassReader cr = new ClassReader(classfileBuffer);
            // I deliberately don't use COMPUTE_FRAMES and COMPUTE_MAXS
            ClassWriter cw = new ClassWriter(cr, 0);
            ClassVisitor cv = new MyClassVistor(cw, className, loader);
            cr.accept(cv, 0);
            result = cw.toByteArray();
        } catch (Throwable t){
            t.printStackTrace();
        }

        return result;
    }
}
public class MyClassVistor extends ClassVisitor {
    ...
    @Override
    public MethodVisitor visitMethod(int access, String name, String desc, String signature, String[] exceptions) {
        MethodVisitor mv = cv.visitMethod(access, name, desc, signature, exceptions);
        if (!isNative && !isEnum && !isAbstract && !"<clinit>".equals(name)){
            mv = new MyMethodVisitor(mv, name, access, desc, slashClassName, isStatic, isPublic);
        }
        return mv;
    }
}
public class MyMethodVisitor extends MethodVisitor {

    private Label tryStart = new Label();
    private Label tryEnd = new Label();
    private Label catchStart = new Label();
    private Label catchEnd = new Label();

    public void visitCode() {
        mv.visitCode();
        mv.visitTryCatchBlock(tryStart, tryEnd, catchStart, "java/lang/Throwable");
        mv.visitLabel(tryStart);
    }

    @Override
    public void visitEnd() {
        mv.visitLabel(tryEnd);
        mv.visitFrame(F_SAME, 0, null, 0, null); /* This line takes me more than 6 hours to figure out. Why this line can't be omitted? */
        mv.visitJumpInsn(GOTO, catchEnd);

        mv.visitLabel(catchStart);
        // exception caught
        mv.visitFrame(F_SAME1, 0, null, 1, new Object[] {"java/lang/Throwable"}); /* add stackmap frame after jump target */
        mv.visitLdcInsn(this.selfMethodId);
        mv.visitMethodInsn(INVOKESTATIC, MyRecorder.SLASH_CLASS_NAME, MyRecorder.EXCEPTION_CAUGHT,
                "(Ljava/lang/String;)V", false);
        mv.visitInsn(ATHROW);

        mv.visitLabel(catchEnd);
        mv.visitFrame(F_SAME, 0, null, 0, null); /* add stackmap frame after jump target */
        // Make up a return statement
        switch (Type.getReturnType(selfDesc).getSort()){
            case BYTE:
            case CHAR:
            case SHORT:
            case BOOLEAN:
            case INT:
                mv.visitLdcInsn(0);
                mv.visitInsn(IRETURN);
                break;
            case LONG:
                mv.visitLdcInsn(0L);
                mv.visitInsn(LRETURN);
                break;
            case FLOAT:
                mv.visitLdcInsn(0f);
                mv.visitInsn(FRETURN);
                break;
            case DOUBLE:
                mv.visitLdcInsn(0.0);
                mv.visitInsn(DRETURN);
                break;
            case OBJECT:
                mv.visitInsn(ACONST_NULL);
                mv.visitInsn(ARETURN);
                break;
            case VOID:
                mv.visitInsn(RETURN);
                break;
        }
        super.visitEnd();
    }

    @Override
    public void visitMaxs(int maxStack, int maxLocals) {
        // +5 because other logic need more space on operand stack
        super.visitMaxs(maxStack + 5, maxLocals);
    }
}
Instein

2 Answers

Let’s clean up first

mv.visitFrame(F_SAME, 0, null, 0, null); /* This line takes me more than 6 hours to figure out. Why this line can't be omitted? */
mv.visitJumpInsn(GOTO, catchEnd);

You are creating an exception handler for the entire method, appending the handler after the original code. Assuming that the original code is valid, it must end with a …return, athrow, or goto instruction, as the code is not allowed to “fall off” the end of code.

Therefore, the code you’re appending here, the goto over the handler to a newly generated return instruction is unreachable. Unreachable code always requires a new stack map frame to describe its initial state, as the verifier can’t guess one.

But, of course, instead of providing a frame for the unreachable code, you can just omit this unnecessary code.

So the simplified code looks like

public class MyMethodVisitor extends MethodVisitor {
    private final Label tryStart = new Label();
    private final Label tryEndCatchStart = new Label();

    …

    @Override
    public void visitCode() {
        mv.visitCode();
        mv.visitLabel(tryStart);
    }

    @Override
    public void visitMaxs(int maxStack, int maxLocals) {
        mv.visitTryCatchBlock(
            tryStart, tryEndCatchStart, tryEndCatchStart, "java/lang/Throwable");

        mv.visitLabel(tryEndCatchStart);
        mv.visitFrame(F_FULL, 0, null, 1, new Object[] {"java/lang/Throwable"});
        mv.visitLdcInsn(this.selfMethodId);
        mv.visitMethodInsn(INVOKESTATIC, MyRecorder.SLASH_CLASS_NAME,
            MyRecorder.EXCEPTION_CAUGHT, "(Ljava/lang/String;)V", false);
        mv.visitInsn(ATHROW);
        // the exception handler needs two stack entries, the throwable and a string
        super.visitMaxs(Math.max(2, maxStack), maxLocals);
    }
}

Note: since we don’t know what kind of frames the instrumented code contains (e.g. it might introduce new variables), we should not use a frame type defining the stack state based on the previous frame. The example above simply drops all variables, as the exception handler doesn’t need them anyway, which is compatible with every possible stack state, at least for ordinary methods.
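To illustrate why a frame that declares no locals is compatible with any incoming state, here is a minimal sketch in plain Java. It is only a model, not the verifier: the class name `AssignabilitySketch`, the string type names, and the simplified rule (a type is assignable only to itself or to `top`, ignoring the class hierarchy) are assumptions made for this example.

```java
import java.util.List;

final class AssignabilitySketch {
    static final String TOP = "top";

    /** Simplified rule: a type matches itself, and anything is assignable to top. */
    static boolean assignable(String from, String to) {
        return to.equals(TOP) || from.equals(to);
    }

    /** Compares locals slot by slot; slots missing on either side count as top. */
    static boolean localsAssignable(List<String> current, List<String> declared) {
        int n = Math.max(current.size(), declared.size());
        for (int i = 0; i < n; i++) {
            String from = i < current.size() ? current.get(i) : TOP;
            String to = i < declared.size() ? declared.get(i) : TOP;
            if (!assignable(from, to)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> current = List.of("Foo", "int", "java/lang/String");
        // A declared frame without locals accepts every incoming state.
        System.out.println(localsAssignable(current, List.of()));              // true
        // A declared frame naming a different type in slot 1 is rejected.
        System.out.println(localsAssignable(current, List.of("Foo", "long"))); // false
    }
}
```

The declared frame only has to promise what the following code will use, so promising nothing about the locals is the most permissive choice.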

The code above is sufficient to instrument every ordinary method but not constructors. With stack maps, it is impossible to create an exception handler covering the entire constructor, including the super(…) call. Older class files without stack maps may install such an exception handler, as long as it doesn’t try to return or to use this. But with stack maps, it’s impossible to express the initial state of the handler:

From JVMS §4.10.1.9:

But if the invocation of an <init> method throws an exception, the uninitialized object might be left in a partially initialized state, and needs to be made permanently unusable. This is represented by an exception frame containing the broken object (the new value of the local) and the flagThisUninit flag (the old flag). There is no way to get from an apparently-initialized object bearing the flagThisUninit flag to a properly initialized object, so the object is permanently unusable.

The problem is that we can’t express flags in stack maps. The stack map’s frame only contains types and if UninitializedThis is present, the flag flagThisUninit is assumed to be present, which is suitable to describe the situation before the super constructor invocation. When UninitializedThis is not present, the flag flagThisUninit is assumed to be absent too, which is suitable to describe the situation after the super constructor invocation.

But when the super constructor invocation fails with an exception, the stack state is as described above, with the UninitializedThis already replaced by the new value of the local but the flag flagThisUninit still present. We can’t describe such a frame using stack maps, hence, we can’t describe the initial frame of the exception handler.
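The gap can be made concrete with a small model in plain Java. This is a sketch under stated assumptions, not verifier code: the class `FrameFlagSketch`, the record `Frame`, and the helper names are invented for illustration, and a frame is reduced to its locals plus the flagThisUninit flag.

```java
import java.util.List;

final class FrameFlagSketch {
    static final String UNINIT_THIS = "uninitializedThis";

    /** A verifier frame reduced to its locals and the flagThisUninit flag. */
    record Frame(List<String> locals, boolean flagThisUninit) {}

    /** A stack map entry stores only types; the flag is implied by their content. */
    static Frame fromStackMap(List<String> locals) {
        return new Frame(locals, locals.contains(UNINIT_THIS));
    }

    /** True if a stack map entry with the same locals reproduces this exact frame. */
    static boolean expressible(Frame frame) {
        return fromStackMap(frame.locals()).equals(frame);
    }

    public static void main(String[] args) {
        String self = "org/joda/time/TestAllPackages";
        Frame beforeSuper = new Frame(List.of(UNINIT_THIS, "java/lang/String"), true);
        Frame afterSuper = new Frame(List.of(self, "java/lang/String"), false);
        // State after a failed super(...) call: type already replaced, flag still set.
        Frame failedSuper = new Frame(List.of(self, "java/lang/String"), true);

        System.out.println(expressible(beforeSuper)); // true
        System.out.println(expressible(afterSuper));  // true
        System.out.println(expressible(failedSuper)); // false
    }
}
```

Only the third combination, which is exactly the one the handler would have to declare, has no representation.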


So, you can’t cover the super constructor call with your exception handler. You can only install exception handlers for the code before and after the call and you need two distinct handlers, due to the incompatible flag state.

Holger
  • Thanks for your detailed explanation! I appreciate it! One thing I don't understand is `mv.visitFrame(F_FULL, 0, null, 1, new Object[] {"java/lang/Throwable"});`. What do you mean by "simply drops all variables"? From the [API](https://asm.ow2.io/javadoc/org/objectweb/asm/Opcodes.html#F_FULL), it says F_FULL frame needs to contain complete frame data. But here you seem to delete all the variables in the locals? – Instein Oct 13 '21 at 17:11
  • Exactly. I specify a complete frame containing no variables. Since the exception handler doesn’t use any, there is no need to guess which variables are there. Generally, frames do not need to describe what has been there in the previous code, but what will be used in the subsequent code. Which, of course, must be compatible, which is what stack map based verification is all about. This differs from the `COMPUTE_FRAMES` approach, as ASM will try to calculate a state based on the previous state. [This answer](https://stackoverflow.com/a/49262105/2711488) shows how different the results can be. – Holger Oct 13 '21 at 17:19

Perhaps the analyze method of the org.objectweb.asm.tree.analysis.Analyzer class will provide us a little insight:

if (newControlFlowExceptionEdge(insnIndex, tryCatchBlock)) {
    Frame<V> handler = newFrame(oldFrame);
    handler.clearStack(); // clear the stack
    handler.push(interpreter.newExceptionValue(tryCatchBlock, handler, catchType)); // push the exception
    merge(insnList.indexOf(tryCatchBlock.handler), handler, subroutine); // merge two frames
}

For each instruction in the try block, the analyzer does the following two things:

  • First, clear the stack and push the expected exception on the stack
  • Then, try to merge the current frame with the frame at the start of the catch block

Then, let's simulate the execution of instructions:

<init>:(Ljava/lang/String;)V
                               // {uninitialized_this, String} | {}
0000: aload_0                  // {uninitialized_this, String} | {uninitialized_this}   ──────── compatible ────────┐
0001: aload_1                  // {uninitialized_this, String} | {uninitialized_this, String} ─── compatible ──┐    │
0002: invokespecial   #8       // {this, String} | {} ──────── incompatible ────────┐                          │    │
0005: return                   // {} | {}                                           │                          │    │
                               // {uninitialized_this, String} | {Throwable} ───────┴──────────────────────────┴────┘
0006: ldc             #11      // {uninitialized_this, String} | {Throwable, String}
0008: invokestatic    #16      // {uninitialized_this, String} | {Throwable}
0011: athrow                   // {} | {}
                               // {uninitialized_this, String} | {}
0012: return                   // {} | {}

In the above snippet, the locals[0] at 0002 is this; however, the locals[0] at 0006 is uninitialized_this. These two values are incompatible. In the error message, the Current Frame is the frame the verifier computed at the instruction that may throw (bci @2), and the Stackmap Frame is the frame declared in the StackMapTable at the exception handler (bci @9), which the current frame must be assignable to.
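The comparison can be replayed with a tiny model in plain Java. It mirrors the clearStack/push step of the Analyzer snippet above, but it is only a sketch: the class `ExceptionEdgeSketch` and its record are made up for this example, and plain equality stands in for the verifier's assignability check.

```java
import java.util.List;

final class ExceptionEdgeSketch {
    /** A frame reduced to its locals and operand stack, both as type names. */
    record Frame(List<String> locals, List<String> stack) {}

    /** Frame seen by the handler: same locals, stack replaced by the caught exception. */
    static Frame exceptionEdge(Frame atThrow, String catchType) {
        return new Frame(atThrow.locals(), List.of(catchType));
    }

    public static void main(String[] args) {
        String self = "org/joda/time/TestAllPackages";
        String str = "java/lang/String";
        String throwable = "java/lang/Throwable";

        // Frame declared for the handler by F_SAME1 relative to the method's initial frame.
        Frame declared = new Frame(List.of("uninitializedThis", str), List.of(throwable));

        // State before super(...) has completed, and state once `this` has been replaced.
        Frame beforeSuper = new Frame(List.of("uninitializedThis", str),
                                      List.of("uninitializedThis", str));
        Frame afterSuper = new Frame(List.of(self, str), List.of());

        System.out.println(exceptionEdge(beforeSuper, throwable).equals(declared)); // true
        System.out.println(exceptionEdge(afterSuper, throwable).equals(declared));  // false
    }
}
```

The second comparison corresponds to the mismatch reported for locals[0] in the question's error message.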

IMHO, we should not cover the super() call with the exception handler.

A few little things:

  • The code in MyMethodVisitor.visitEnd() should be placed in the visitMaxs() method. That's because visitCode() marks the beginning of the method body, visitMaxs() marks the end of the method body, and visitEnd() marks the end of the whole method.
  • The mv.visitTryCatchBlock() call should also be placed in the visitMaxs() method. If we put it in visitCode(), it is visited before all other try-catch blocks and therefore takes precedence over them, effectively invalidating them.
  • There is already a return before the goto instruction. The following two lines may be redundant:
mv.visitFrame(F_SAME, 0, null, 0, null); /* This line takes me more than 6 hours to figure out. Why this line can't be omitted? */
mv.visitJumpInsn(GOTO, catchEnd);
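The precedence issue in the second point can be sketched in plain Java. This is a simplified model of searching an exception table in order, not JVM or ASM code: the class `HandlerOrderSketch`, the record `Entry`, and the bytecode offsets are all hypothetical.

```java
import java.util.List;
import java.util.Optional;

final class HandlerOrderSketch {
    /** One exception table row: range [from, to), handler target, and caught type. */
    record Entry(int from, int to, int target, Class<? extends Throwable> type) {}

    /** Returns the first row, in table order, that covers pc and catches the thrown type. */
    static Optional<Entry> findHandler(List<Entry> table, int pc, Throwable thrown) {
        for (Entry e : table) {
            if (pc >= e.from() && pc < e.to() && e.type().isInstance(thrown)) {
                return Optional.of(e);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Entry catchAll = new Entry(0, 20, 30, Throwable.class);
        Entry original = new Entry(4, 10, 12, IllegalStateException.class);
        Throwable thrown = new IllegalStateException();

        // Catch-all registered first (as in visitCode): the original handler never runs.
        System.out.println(findHandler(List.of(catchAll, original), 5, thrown).get().target()); // 30
        // Catch-all registered last (as in visitMaxs): the original handler still wins.
        System.out.println(findHandler(List.of(original, catchAll), 5, thrown).get().target()); // 12
    }
}
```

Registering the catch-all last keeps the method's own handlers intact and only catches what they let through.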

Finally, to avoid such inconsistencies, it is recommended to use the COMPUTE_FRAMES option.

lsieun
  • If we look at the types alone, merging `this` and `uninitialized_this` to `top` would be possible. Unfortunately, the frames also have incompatible *flags*, which even specifying `COMPUTE_FRAMES` can’t fix. Generally, I don’t agree with that last sentence. It’s a good thing if a developer tries to understand frames. And ASM’s `COMPUTE_FRAMES` option is expensive and has [unavoidable limitations](https://stackoverflow.com/a/49262105/2711488). – Holger Oct 13 '21 at 11:50
  • @Holger I have seen quite a few java-bytecode-asm related questions and I really, really, really like your answers. – lsieun Oct 13 '21 at 12:22
  • @Holger I agree that `COMPUTE_FRAME` is painful. Sometimes it just crashed in the `getCommonSuperClass`. Here I have a quick question: what does `top` mean? Is it representing any variable? – Instein Oct 13 '21 at 17:19
  • @Instein `top` is the root of [the Verification Type System](https://docs.oracle.com/javase/specs/jvms/se17/html/jvms-4.html#jvms-4.10.1.2). So everything is assignable to `top`, which is basically an unusable entry. – Holger Oct 13 '21 at 17:24
  • @Isieun Thanks for pointing out issues in my code! I still have some questions: 1) what is the specific difference between "the end of the method body" and "the end of the whole method"? 2) Why "it will invalid all other try-catch clauses" if I put `mv.visitTryCatchBlock()` in the `visitCode()`? – Instein Oct 13 '21 at 17:33
  • @Instein this has to do with the order in which the visit methods are called. When you place `visitTryCatchBlock` in `visitCode`, it’s the first one, before any other `visitTryCatchBlock` call for the original code has been made. Since it is then the first entry, matches the entire code, and catches all throwables, it will always take precedence when an exception occurs. Likewise, `visitEnd()` is called *after* `visitMaxs` but all instructions (and exception handlers) should have been visited before the `super.visitMaxs(…)` call. – Holger Oct 13 '21 at 17:39
  • @Holger Thanks! I didn't know that `visitTryCatchBlock` has priorities. – Instein Oct 13 '21 at 18:01
  • @Instein well, after all, it has to have some rule to decide. It’s specified [here, within §2.10](https://docs.oracle.com/javase/specs/jvms/se17/html/jvms-2.html#jvms-2.10-420): “*The order in which the exception handlers of a method are searched for a match is important. […] At run time, when an exception is thrown, the Java Virtual Machine searches the exception handlers of the current method in the order that they appear in the corresponding exception handler table in the class file, starting from the beginning of that table.*” – Holger Oct 13 '21 at 18:08
  • I noticed that in the line before `0006` in this answer, the stackmap frame is `{uninitialized_this, String} | {Throwable}`. I am curious how `mv.visitFrame(F_SAME1, 0, null, 1, new Object[] {"java/lang/Throwable"});` produces such a frame. The [API](https://asm.ow2.io/javadoc/org/objectweb/asm/Opcodes.html#F_SAME1) says "Opcodes.F_SAME1 representing frame with exactly the same locals as the previous frame...". However, what is the **previous frame** here? I thought the verifier can't know which instruction was executed before entering the catch block. – Instein Oct 13 '21 at 21:08