I want to obtain the index of a bytecode instruction within a method while visiting that instruction. For example, in the bytecode sequence below, the index of the INVOKEVIRTUAL is 7 (the method body is visited with SKIP_DEBUG).

public calculate(IILjava/lang/String;J)J
   L0
    LINENUMBER 17 L0
    ICONST_3  //0 
    ISTORE 6  //1
   L1
    LINENUMBER 18 L1
    LDC 10.0  //2
    DSTORE 7  //3
   L2
    LINENUMBER 19 L2
    ALOAD 0  //4
    GETFIELD code/sxu/asm/Callee._call2 : Lcode/sxu/asm/Callee2; //5
    LDC "xushijie"  //6
    INVOKEVIRTUAL code/sxu/asm/Callee2.sayHello (Ljava/lang/String;)I  //7
    ISTORE 9

}

My code looks like this:

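ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_FRAMES | ClassWriter.COMPUTE_MAXS);

// ClassReader.accept takes a ClassVisitor, so the method visitor is created in visitMethod.
cr.accept(new ClassVisitor(api, cw) {
    @Override
    public MethodVisitor visitMethod(int access, String name, String desc,
            String signature, String[] exceptions) {
        return new SomeMethodVisitor(api, super.visitMethod(access, name, desc, signature, exceptions));
    }
}, ClassReader.SKIP_DEBUG);

class SomeMethodVisitor extends MethodVisitor {
    SomeMethodVisitor(int api, MethodVisitor mv) {
        super(api, mv);
    }

    @Override
    public void visitMethodInsn(final int opcode, final String owner,
            final String name, final String desc, final boolean itf) {
        int index = ???; // Get the current bytecode index here.
        super.visitMethodInsn(opcode, owner, name, desc, itf);
    }
}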

This is relatively easy with the tree-based ASM API, where MethodNode exposes the instruction list:

class MethodNode {
    public InsnList instructions;
}

But I do not have a good solution in the event-based API. Overriding every visitXxx method of MethodVisitor to count the instructions that have already passed does not seem like a good solution either.

shijie xu
I don't think you can find the index automatically in the core API. Implementing a counter in all `visitXxxInsn` methods seems to be a good solution. (It's not hard, just a bit tedious. You can copy most of the code from `CodeSizeEvaluator` as Charles has pointed out.) – dejvuth Aug 06 '15 at 14:23

I agree with @dejvuth that this is simply not possible. Perhaps more importantly, what problem are you trying to solve by knowing the instruction index? I could imagine the bytecode offset being useful (though it's also unavailable) for diagnosing VerifyError, but not the instruction index. – Brett Kail Aug 10 '15 at 16:19

1 Answer

This is actually so tricky with ASM that it's almost worth using Javassist instead for this task.

There is a subclass of MethodVisitor called CodeSizeEvaluator. If you subclass it, you can sort of get a running total of the bytecode size (and, hence, the offset into the bytecode). Why "sort of"?

Some bytecode operations can vary in size. For example, if you want to push an integer constant onto the stack, you can do so with one, two, or three bytes of instruction code depending on the size of the integer. ASM holds an operational abstraction of the bytecode. In other words, for that instruction, it will maintain a node that says, "Push an integer of value X onto the stack." The ClassWriter will decide how to perform this in actual bytecode. For this reason, the CodeSizeEvaluator doesn't really know whether that instruction will be 1, 2, or 3 bytes. This is why it maintains both a "min" and a "max" size.
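To make the variable-size point concrete, here is a minimal sketch of how the encoded size of an int push follows from its value under the JVM instruction encoding. `IntPushSize` and `sizeOf` are hypothetical names, not part of ASM, and the `wideConstantPoolIndex` flag is an assumption standing in for something only the ClassWriter knows: whether the constant's pool index fits in one byte (LDC) or needs two (LDC_W). Note that in the visitor API the caller often picks the opcode explicitly (ICONST_x, BIPUSH, SIPUSH), so the real uncertainty is mostly in the LDC case.

```java
/** Hypothetical helper (not part of ASM): encoded size of an int-constant push. */
final class IntPushSize {
    private IntPushSize() {}

    /**
     * Returns the size in bytes of the shortest instruction that pushes {@code value}.
     *
     * @param value the int constant to push
     * @param wideConstantPoolIndex assumption: true if the constant-pool index
     *        is above 255, which forces LDC_W instead of LDC
     */
    static int sizeOf(int value, boolean wideConstantPoolIndex) {
        if (value >= -1 && value <= 5) {
            return 1; // ICONST_M1 .. ICONST_5: opcode only
        }
        if (value >= Byte.MIN_VALUE && value <= Byte.MAX_VALUE) {
            return 2; // BIPUSH: opcode + 1-byte operand
        }
        if (value >= Short.MIN_VALUE && value <= Short.MAX_VALUE) {
            return 3; // SIPUSH: opcode + 2-byte operand
        }
        // Larger values live in the constant pool:
        // LDC is opcode + 1-byte index, LDC_W is opcode + 2-byte index.
        return wideConstantPoolIndex ? 3 : 2;
    }

    public static void main(String[] args) {
        System.out.println(sizeOf(3, false));      // ICONST_3
        System.out.println(sizeOf(100, false));    // BIPUSH
        System.out.println(sizeOf(1000, false));   // SIPUSH
        System.out.println(sizeOf(100000, true));  // LDC_W
    }
}
```

A counter built on CodeSizeEvaluator could use a rule like this to settle on one size whenever the evaluator's min and max disagree.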

You can augment the logic by looking at the value of X, determining which actual instruction will be used, and picking an "actual" size. There are a couple of cases where this is tricky. The first is table jumps (i.e. switch/case statements). Their operands are padded so that they align to a 4-byte boundary. Knowing the alignment requires knowing the bytecode offset where the instruction starts (you should have that from the running total). The harder case is jumps and gotos. The ClassWriter handles cases where the jump offset is more than +/-32K by inserting code with a long goto. This pretty much never happens unless you are doing something insane. Nevertheless, the CodeSizeEvaluator allows for that worst case in its maximum (as I recall, up to 8 bytes for a conditional branch and 5 for a goto). You will probably have to assume the typical case and count 3 bytes.
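The alignment rule for table jumps can be sketched as follows. This is a hypothetical helper, not ASM code; `SwitchSize`, `padding`, `tableSwitchSize` and `lookupSwitchSize` are names made up for illustration, and `pc` is assumed to be the running bytecode offset of the switch opcode, counted from the start of the method's code.

```java
/** Hypothetical helper (not part of ASM): encoded sizes of the switch instructions. */
final class SwitchSize {
    private SwitchSize() {}

    /** Number of padding bytes (0..3) after a switch opcode at offset pc (pc >= 0). */
    static int padding(int pc) {
        // The operands must begin at an offset that is a multiple of 4,
        // and the opcode itself occupies the byte at pc.
        return 3 - (pc & 3);
    }

    /** Size of a tableswitch at offset pc covering keys low..high inclusive. */
    static int tableSwitchSize(int pc, int low, int high) {
        int entries = high - low + 1;
        // opcode + padding + default, low, high (4 bytes each) + one 4-byte offset per key
        return 1 + padding(pc) + 12 + 4 * entries;
    }

    /** Size of a lookupswitch at offset pc with the given number of match/offset pairs. */
    static int lookupSwitchSize(int pc, int pairs) {
        // opcode + padding + default and npairs (4 bytes each) + 8 bytes per pair
        return 1 + padding(pc) + 8 + 8 * pairs;
    }

    public static void main(String[] args) {
        System.out.println(tableSwitchSize(7, 0, 2));
        System.out.println(lookupSwitchSize(4, 2));
    }
}
```

Because the padding depends on `pc`, any earlier instruction whose size was guessed wrong shifts the result, which is why the running total has to be exact up to the switch.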

I hope this helps.

P.S. This is from memory, so I may have forgotten a couple of other edge cases.

Charles Forsythe