
I am continuing to research method emulation and how to get the actual value when an ILOAD instruction is processed. With Holger's help on the Interpreter, and after adding more operations on local variables to the main() method, I got stuck on the merge(V a, V b) method, which must be overridden when extending Interpreter.

@Override
public LocalValue merge(LocalValue valueA, LocalValue valueB) {
    if (Objects.equals(valueA, valueB)) return valueA;
    else return new LocalValue(basicInterpreter.merge(valueA.type, valueB.type), null);
}

But this does not seem to be written correctly. I could try different variants of what to return, but without understanding in which cases values get merged, I can't find the right one. I found nothing useful in the javadocs or the asm-4 tutorial. So, what should I return when:
- One value is null, and other is not
- Both values are not null, same type, but different objects (such as 0 and 5)
- Both values are not null, different types

basicInterpreter:

private BasicInterpreter basicInterpreter = new BasicInterpreter();

LocalValue:

public static class LocalValue implements Value {
    Object value;
    BasicValue type;

    public LocalValue(BasicValue type, Object value) {
        this.value = value;
        this.type = type;
    }
    @Override public int getSize() {return type.getSize();}
    @Override public String toString() {return value == null ? "null" : value.toString();}
    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof LocalValue)) return false;
        LocalValue otherV = (LocalValue) obj;
        return Objects.equals(otherV.type, type) && Objects.equals(otherV.value, value);
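    }
    // Keep hashCode consistent with equals.
    @Override
    public int hashCode() {
        return Objects.hash(type, value);
    }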
}
i0xHeX

1 Answer


Values need to be merged when an instruction can be reached through different code paths, e.g. when you have conditionals, loops, or exception handlers.

So when the value is the same regardless of which code path has been taken, you can keep it; otherwise, the value is not a predictable constant anymore. In my code, where null has been used to denote unknown values, merge always returns null when the values differ.
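A minimal sketch of that rule, covering the three cases from the question. It deliberately avoids the ASM dependency: `TrackedValue` is a hypothetical stand-in for the question's `LocalValue`, the `type` string stands in for `BasicValue`, and `"?"` is an assumed placeholder for "no common type".

```java
import java.util.Objects;

/** Simplified model of the merge rule; not the ASM API itself. */
final class MergeSketch {

    /** A tracked value: a type descriptor plus a constant, where null means "unknown". */
    static final class TrackedValue {
        final String type;     // hypothetical stand-in for BasicValue
        final Object constant; // null = not a predictable constant

        TrackedValue(String type, Object constant) {
            this.type = type;
            this.constant = constant;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof TrackedValue)) return false;
            TrackedValue other = (TrackedValue) o;
            return type.equals(other.type) && Objects.equals(constant, other.constant);
        }

        @Override
        public int hashCode() {
            return Objects.hash(type, constant);
        }
    }

    /** Merges the values that two code paths deliver to the same instruction. */
    static TrackedValue merge(TrackedValue a, TrackedValue b) {
        // Same type and same constant on both paths: the value stays predictable.
        if (a.equals(b)) return a;
        // Any other combination (one side unknown, different constants,
        // or different types) yields an unknown constant.
        String type = a.type.equals(b.type) ? a.type : "?";
        return new TrackedValue(type, null);
    }
}
```

With this model, merging 1 with 1 keeps the constant, while merging 1 with 5, 1 with "unknown", or an int with a long all produce a value whose constant is null.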

So when you have code like

void foo(int arg) {
    int i = 1;
    int j = arg%2==0? i: arg;
}

The values for arg, i, and the value on the operand stack get merged right before the assignment to j. arg already has an unpredictable value, and i has the value 1 in each code path, but the value on the operand stack, which is about to be assigned to j, is either 1 or “unknown”, depending on which code path has been taken.

You may decide to maintain a set of possible values, if you like, but when one of the possible values is “unknown”, the result of the merge could be any value and hence is “unknown”.
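As a rough illustration of the set-based variant, the sketch below tracks int constants only. The class and method names are hypothetical, and null is assumed to mean “unknown”, as in the rest of this answer.

```java
import java.util.HashSet;
import java.util.Set;

/** Tracks the set of possible int constants instead of a single constant. */
final class PossibleValuesSketch {

    /** Merges two candidate sets; null ("unknown") absorbs everything. */
    static Set<Integer> merge(Set<Integer> a, Set<Integer> b) {
        if (a == null || b == null) return null;
        Set<Integer> union = new HashSet<>(a);
        union.addAll(b);
        return union;
    }

    /** Applies "value % divisor" to every candidate; assumes a non-zero divisor. */
    static Set<Integer> remainder(Set<Integer> values, int divisor) {
        if (values == null) return null;
        Set<Integer> result = new HashSet<>();
        for (int v : values) {
            result.add(v % divisor);
        }
        return result;
    }
}
```

Merging {40} and {100} gives {40, 100}, and taking the remainder modulo 2 of that set collapses it to the single value 0, so the result is predictable again even though the branch taken is not. In code with loops the sets can keep growing, so an analyzer using this idea would probably need to cap the set size and fall back to “unknown”.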

Holger
  • So extending the interpreter to support all instructions and track actual values is useless? My main goal now is string deobfuscation, but not all arguments are constants; some are computed right before being pushed onto the stack and passed to the decryption method. Is there a way to fully handle this? – i0xHeX Feb 16 '18 at 11:37
  • @i0xHeX the interpreter is called for every operation (see `unaryOperation`, `binaryOperation`, `naryOperation`, etc) with the actual instruction and the input values. Of course, if all input values are predictable and you understand the instruction, you can predict the result, which may turn the input of another operation into a predictable one. In case of conditionals, you have to either predict which branch will be taken or find out why the actual path doesn’t matter for the result (think of `(random.nextBoolean()? 40: 100)%2` which is predictable). The complexity can grow arbitrarily… – Holger Feb 16 '18 at 11:46
  • Even if I implement all these operations, how do I implement merge correctly? The second problem is method arguments: I do not provide them to the analyzer. For now I'm working with a no-args method, so it does not matter yet. When I look at the `binaryOperation` method, for example, I see that it can be called for `iaload`, `caload`, etc. instructions, so I can understand what I need to do, but `merge(..)` gives me only 2 objects and nothing more. – i0xHeX Feb 16 '18 at 12:10
  • You may track the source instruction node along with the actual value, to make a decision, or you speculatively use either of the values and use multiple passes to find out whether there is an actual impact on the value you’re interested in. For ordinary Java code you can get very far with this, for code designed to be hard to analyze, this may end up with an exploding complexity way beyond the scope of stackoverflow.com… – Holger Feb 16 '18 at 12:26