I have a method that contains instructions like ILOAD, and I want to get the value on the stack after such an instruction has executed: not just its type, but the exact value. I know I need to emulate the method's execution to do that, but I don't know how to do it properly.
Here is my test method, called main:

sipush          15649
istore_0        /* c */
getstatic       java/lang/System.out:Ljava/io/PrintStream;
bipush          45
bipush          11
iload_0         /* c */
...

I want to get the value loaded by iload_0. I tried running an Analyzer and inspecting the Frame values, but they only contain the types of the values, not the exact values I want.

ClassReader cr = new ClassReader(new FileInputStream(new File("input.class")));
ClassNode cn = new ClassNode(Opcodes.ASM5);
cr.accept(cn, 0);

Iterator<MethodNode> methods = cn.methods.iterator();
while (methods.hasNext()) {
    MethodNode mn = methods.next();
    if (!mn.name.equals("main")) continue;
    AbstractInsnNode[] nodes = mn.instructions.toArray();
    Analyzer analyzer = new Analyzer(new BasicInterpreter());
    analyzer.analyze(cn.name, mn);
    int i = -1;
    for (Frame frame : analyzer.getFrames()) {
        i++;
        if (frame == null) continue;
        if (nodes[i].getOpcode() != Opcodes.ILOAD) continue;
        System.out.print(frame.getStack(0) + "|" + frame.getStack(1));
        System.out.print(" - " + nodes[i].getOpcode() + "\n");
    }
}

It shows me the result: R|I - 21. How do I get the value 15649? I googled this for hours and couldn't find anything useful. Thanks in advance.

Peter Cordes
i0xHeX

1 Answer

Your code ignores almost all the benefits of Java 5, such as generics and the for-each loop. When you update it, you’ll get

for(MethodNode mn: cn.methods) {
    if(!mn.name.equals("main")) continue;
    Analyzer<BasicValue> analyzer = new Analyzer<>(new BasicInterpreter());
    analyzer.analyze(cn.name, mn);
    int i = -1;
    for (Frame<BasicValue> frame: analyzer.getFrames()) {
        i++;
        if(frame == null) continue;
        int opcode = mn.instructions.get(i).getOpcode();
        if(opcode != Opcodes.ILOAD) continue;
        BasicValue stackValue = frame.getStack(0);
        System.out.print(stackValue + "|" + frame.getStack(1));
        System.out.print(" - " + opcode + "\n");
    }
}

and you can see immediately that what you get is a BasicValue, which is suitable for verifying code or calculating stackmap frames, but not to get the actual values.

It’s a property of the interpreter, here BasicInterpreter, to maintain only BasicValues (hence the name). An alternative is the SourceInterpreter which allows you to track from which instructions a value may originate, which would be the istore_0 in your case, but this still doesn’t give you the actual value.

So if you want to get the actual value (if predictable), you need your own interpreter. A rather simple one, only tracking values which truly originate from pushing a constant would be:

import static org.objectweb.asm.Opcodes.*;
import java.util.List;
import java.util.Objects;
import org.objectweb.asm.Type;
import org.objectweb.asm.tree.*;
import org.objectweb.asm.tree.analysis.*;

public class ConstantTracker extends Interpreter<ConstantTracker.ConstantValue> {
    static final ConstantValue NULL = new ConstantValue(BasicValue.REFERENCE_VALUE, null);
    public static final class ConstantValue implements Value {
        final Object value; // null if unknown or NULL
        final BasicValue type;
        ConstantValue(BasicValue type, Object value) {
            this.value = value;
            this.type = Objects.requireNonNull(type);
        }
        @Override public int getSize() { return type.getSize(); }
        @Override public String toString() {
            Type t = type.getType();
            if(t == null) return "uninitialized";
            String typeName = type==BasicValue.REFERENCE_VALUE? "a reference type": t.getClassName();
            return this == NULL? "null":
                value == null? "unknown value of "+typeName: value+" ("+typeName+")";
        }
        @Override
        public boolean equals(Object obj) {
            if(this == obj) return true;
            if(this == NULL || obj == NULL || !(obj instanceof ConstantValue))
                return false;
            ConstantValue that = (ConstantValue)obj;
            return Objects.equals(this.value, that.value)
                && Objects.equals(this.type, that.type);
        }
        @Override
        public int hashCode() {
            if(this == NULL) return ~0;
            return (value==null? 7: value.hashCode())+type.hashCode()*31;
        }
    }

    BasicInterpreter basic = new BasicInterpreter(ASM5) {
        @Override public BasicValue newValue(Type type) {
            return type!=null && (type.getSort()==Type.OBJECT || type.getSort()==Type.ARRAY)?
                   new BasicValue(type): super.newValue(type);
        }
        @Override public BasicValue merge(BasicValue a, BasicValue b) {
            if(a.equals(b)) return a;
            if(a.isReference() && b.isReference())
                // this is the place to consider the actual type hierarchy if you want
                return BasicValue.REFERENCE_VALUE;
            return BasicValue.UNINITIALIZED_VALUE;
        }
    };

    public ConstantTracker() {
        super(ASM5);
    }

    @Override
    public ConstantValue newOperation(AbstractInsnNode insn) throws AnalyzerException {
        switch(insn.getOpcode()) {
            case ACONST_NULL: return NULL;
            case ICONST_M1: case ICONST_0: case ICONST_1: case ICONST_2:
            case ICONST_3: case ICONST_4: case ICONST_5:
                return new ConstantValue(BasicValue.INT_VALUE, insn.getOpcode()-ICONST_0);
            case LCONST_0: case LCONST_1:
                return new ConstantValue(BasicValue.LONG_VALUE, (long)(insn.getOpcode()-LCONST_0));
            case FCONST_0: case FCONST_1: case FCONST_2:
                return new ConstantValue(BasicValue.FLOAT_VALUE, (float)(insn.getOpcode()-FCONST_0));
            case DCONST_0: case DCONST_1:
                return new ConstantValue(BasicValue.DOUBLE_VALUE, (double)(insn.getOpcode()-DCONST_0));
            case BIPUSH: case SIPUSH:
                return new ConstantValue(BasicValue.INT_VALUE, ((IntInsnNode)insn).operand);
            case LDC:
                return new ConstantValue(basic.newOperation(insn), ((LdcInsnNode)insn).cst);
            default:
                BasicValue v = basic.newOperation(insn);
                return v == null? null: new ConstantValue(v, null);
        }
    }

    @Override
    public ConstantValue copyOperation(AbstractInsnNode insn, ConstantValue value) {
        return value;
    }

    @Override
    public ConstantValue newValue(Type type) {
        BasicValue v = basic.newValue(type);
        return v == null? null: new ConstantValue(v, null);
    }

    @Override
    public ConstantValue unaryOperation(AbstractInsnNode insn, ConstantValue value) throws AnalyzerException {
        BasicValue v = basic.unaryOperation(insn, value.type);
        return v == null? null: new ConstantValue(v, insn.getOpcode()==CHECKCAST? value.value: null);
    }

    @Override
    public ConstantValue binaryOperation(AbstractInsnNode insn, ConstantValue a, ConstantValue b) throws AnalyzerException {
        BasicValue v = basic.binaryOperation(insn, a.type, b.type);
        return v == null? null: new ConstantValue(v, null);
    }

    @Override
    public ConstantValue ternaryOperation(AbstractInsnNode insn, ConstantValue a, ConstantValue b, ConstantValue c) {
        return null;
    }

    @Override
    public ConstantValue naryOperation(AbstractInsnNode insn, List<? extends ConstantValue> values) throws AnalyzerException {
        List<BasicValue> unusedByBasicInterpreter = null;
        BasicValue v = basic.naryOperation(insn, unusedByBasicInterpreter);
        return v == null? null: new ConstantValue(v, null);
    }

    @Override
    public void returnOperation(AbstractInsnNode insn, ConstantValue value, ConstantValue expected) {}

    @Override
    public ConstantValue merge(ConstantValue a, ConstantValue b) {
        if(a == b) return a;
        BasicValue t = basic.merge(a.type, b.type);
        // exclude NULL before touching a.value, which is null for the NULL constant
        return t.equals(a.type) && a!=NULL && (a.value==null || a.value.equals(b.value))? a:
               t.equals(b.type) &&  b.value==null&&b!=NULL? b: new ConstantValue(t, null);
    }
}

Then you can use it like this:

private static void analyze() throws IOException, AnalyzerException {
    ClassReader cr = new ClassReader(new FileInputStream(new File("input.class")));
    ClassNode cn = new ClassNode(Opcodes.ASM5);
    cr.accept(cn, 0);

    for(MethodNode mn: cn.methods) {
        if(!mn.name.equals("main")) continue;
        Analyzer<ConstantTracker.ConstantValue> analyzer
                = new Analyzer<>(new ConstantTracker());
        analyzer.analyze(cn.name, mn);
        int i = -1;
        for(Frame<ConstantTracker.ConstantValue> frame: analyzer.getFrames()) {
            i++;
            if(frame == null) continue;
            AbstractInsnNode n = mn.instructions.get(i);
            if(n.getOpcode() != Opcodes.ILOAD) continue;
            VarInsnNode vn = (VarInsnNode)n;
            System.out.println("accessing variable # "+vn.var);
            ConstantTracker.ConstantValue var = frame.getLocal(vn.var);
            System.out.println("\tcontains "+var);
        }
    }
}

This works with all load instructions, not only ILOAD, i.e. also with ALOAD, LLOAD, FLOAD, and DLOAD.
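To widen the filter in the loop above to the whole load family, the opcode test could look roughly like the following. This is a minimal sketch: the opcode numbers are the ones from the JVM specification (the `Opcodes` constants of the same names carry the same values), they are redeclared locally only so the snippet compiles without ASM on the classpath, and `LoadOpcodes` and `isLoad` are hypothetical names.

```java
final class LoadOpcodes {
    // Values from the JVM specification; redeclared here so the sketch needs no ASM dependency.
    static final int ILOAD = 21, LLOAD = 22, FLOAD = 23, DLOAD = 24, ALOAD = 25;

    /** Returns true for ILOAD, LLOAD, FLOAD, DLOAD and ALOAD. */
    static boolean isLoad(int opcode) {
        // the five load opcodes are numbered consecutively
        return opcode >= ILOAD && opcode <= ALOAD;
    }

    public static void main(String[] args) {
        System.out.println(isLoad(ILOAD)); // true
        System.out.println(isLoad(ALOAD)); // true
        System.out.println(isLoad(54));    // false, 54 is ISTORE
    }
}
```

In the analysis loop, `n.getOpcode() != Opcodes.ILOAD` would then become `!isLoad(n.getOpcode())`. The question's output (opcode 21 for `iload_0`) shows that the tree API reports the short forms under the generic opcode, so no separate check for `iload_0` and friends should be needed.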

Of course, the interpreter has much room for improvement, e.g. tracking trivial transformations like casts of int constants to short or byte, or doing simple math. But I think the picture is clearer now, and how much you want to track or interpret depends on your actual use case.
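As a rough illustration of what such an improvement could look like, here is a minimal sketch of folding a few int operations once the operand values are known. The opcode numbers are taken from the JVM specification; `IntFolding`, `foldUnary` and `foldBinary` are hypothetical helpers, not part of ASM, and only a handful of opcodes are covered.

```java
final class IntFolding {
    // Values from the JVM specification; redeclared here so the sketch needs no ASM dependency.
    static final int IADD = 96, ISUB = 100, IMUL = 104, INEG = 116,
                     I2B = 145, I2C = 146, I2S = 147;

    /** Returns the folded result, or null if the operand is unknown or the opcode is not handled. */
    static Integer foldUnary(int opcode, Integer value) {
        if (value == null) return null;
        switch (opcode) {
            case INEG: return -value;
            case I2B:  return (int) (byte) value.intValue();  // keep the low 8 bits, sign-extended
            case I2C:  return (int) (char) value.intValue();  // keep the low 16 bits, zero-extended
            case I2S:  return (int) (short) value.intValue(); // keep the low 16 bits, sign-extended
            default:   return null;
        }
    }

    /** Returns the folded result, or null if an operand is unknown or the opcode is not handled. */
    static Integer foldBinary(int opcode, Integer a, Integer b) {
        if (a == null || b == null) return null;
        switch (opcode) {
            case IADD: return a + b; // Java int arithmetic wraps like the JVM instruction
            case ISUB: return a - b;
            case IMUL: return a * b;
            default:   return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(foldUnary(I2B, 15649));    // 33
        System.out.println(foldBinary(IADD, 45, 11)); // 56
        System.out.println(foldUnary(I2S, null));     // null
    }
}
```

In `ConstantTracker`, `unaryOperation` and `binaryOperation` could call such helpers when the tracked `value` is an `Integer` and pass the result instead of `null`. Division and remainder are left out on purpose, since a zero divisor throws at run time rather than producing a value.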

Holger
  • It did the trick! However idk why, but I can't use generics, if I try to make something like `Frame = ...`, eclipse says, that type Frame is not generic, but in asm library `public class Frame`. Temporary changed all generics to simple classes with `Value` type and with casting to `ConstantValue` and this works. Some magic for me. You helped a lot! Now I can do more research in this class and understand it more clearly. Thanks :) – i0xHeX Feb 15 '18 at 12:39
  • And last thing, why we need to do subtract opcodes? Like `insn.getOpcode()-ICONST_0`. What that gives to us? – i0xHeX Feb 15 '18 at 12:40
  • Perhaps there’s something wrong with your classpath and you have another contradicting version of the ASM library in your path? The opcode arithmetic allows us to handle multiple opcodes at once, i.e. `ICONST_M1`, `ICONST_0`, `ICONST_1`, `ICONST_2`, `ICONST_3`, `ICONST_4`, `ICONST_5` are seven different instructions dedicated to loading `int` constants from `-1` to `5` onto the stack. We can handle all at once when we subtract `ICONST_0` from the opcode, as that gives us the actual constant value. – Holger Feb 15 '18 at 12:48
  • Now I understand that, thanks. Also fixed generics issue. There was something wrong with asm-all-5.2 lib, imported splitted asm libs. – i0xHeX Feb 15 '18 at 12:59
  • Hello Holger. The ConstantTracker works fine, until a jump occurs with a known output. The known local variable values get lost due to the merge invocation, even though the jump target label itself is predicted (because we know the top stack value). Is there a way to fix this problem without rewriting the Analyzer class? – Aura Lee Apr 15 '20 at 17:19
  • I get IllegalStateException at basic initialize – Rans Oct 20 '20 at 21:11
  • @GraxCode You can expand the `merge` function to keep track of alternative values, but I’m afraid, selecting one of the alternatives based on predicted conditions has not been foreseen, so that would require substantial changes to the `Analyzer` class. – Holger Oct 21 '20 at 08:24
  • @Rans that sounds like being worth opening a new question, including the code to analyze and a full stack trace. – Holger Oct 21 '20 at 08:27
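
The suggestion in the comments to keep track of alternative values in `merge` can be sketched independently of ASM. This is only an illustration under stated assumptions: `Alternatives` is a hypothetical class, an empty set stands for "unknown", and the cap `MAX` is an arbitrary choice made so that repeated merges in loops settle on a fixed result.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

final class Alternatives {
    static final int MAX = 4; // arbitrary cap, an assumption of this sketch
    static final Alternatives UNKNOWN = new Alternatives(Collections.emptySet());

    final Set<Object> values; // empty means "unknown"

    private Alternatives(Set<Object> values) {
        this.values = Collections.unmodifiableSet(values);
    }

    /** Creates a value with exactly one known constant. */
    static Alternatives of(Object constant) {
        Set<Object> s = new LinkedHashSet<>();
        s.add(constant);
        return new Alternatives(s);
    }

    boolean isUnknown() { return values.isEmpty(); }

    /** Union of both sets; degrades to UNKNOWN if either side is unknown or the cap is exceeded. */
    Alternatives merge(Alternatives other) {
        if (isUnknown() || other.isUnknown()) return UNKNOWN;
        if (values.containsAll(other.values)) return this; // nothing new, so report "no change"
        Set<Object> union = new LinkedHashSet<>(values);
        union.addAll(other.values);
        return union.size() > MAX ? UNKNOWN : new Alternatives(union);
    }

    @Override public String toString() {
        return isUnknown() ? "unknown" : values.toString();
    }

    public static void main(String[] args) {
        Alternatives a = of(1), b = of(2);
        System.out.println(a.merge(b));          // [1, 2]
        System.out.println(a.merge(b).merge(a)); // [1, 2]
        System.out.println(a.merge(UNKNOWN));    // unknown
    }
}
```

A real `Value` implementation built on this idea would also need `equals` and `hashCode` based on the set, since, as far as I know, the analyzer decides whether a frame changed by comparing the merged value with the old one. As noted in the comment, this only records the alternatives; choosing one of them based on a predicted branch is not covered.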