I'm trying to remove conditional jumps that are useless, for example by deleting this:

if(1 > 10)  {
   return;
}

So I decided to use a BasicInterpreter and check whether the frame at the jump is null, and if so remove the jump. But the frame is not null, so the jump isn't detected as useless.

Code which doesn't work:

Analyzer<BasicValue> an = new Analyzer<BasicValue>(new BasicInterpreter());
Frame<BasicValue>[] frames = an.analyze(cn.name, mn);

for (int i = 0; i < frames.length; i++) {
    if ((mn.instructions.get(i) instanceof JumpInsnNode)) {
        if (mn.instructions.get(i).getOpcode() >= IFEQ && mn.instructions.get(i).getOpcode() <= IF_ACMPNE) {
            if(frames[i] == null)  {
                System.out.println("This jump is useless");
            }
        }
    }
}

Then I tried to read the values on the stack to evaluate the condition manually, but without any success. (I found the question "ASM Get exact value from stack frame", but I couldn't adapt its code to jumps.)

Analyzer<BasicValue> an = new Analyzer<BasicValue>(new BasicInterpreter());
Frame<BasicValue>[] frames = an.analyze(cn.name, mn);

for (int i = 0; i < frames.length; i++) {
    if ((mn.instructions.get(i) instanceof JumpInsnNode)) {
        if (mn.instructions.get(i).getOpcode() >= IFEQ && mn.instructions.get(i).getOpcode() <= IF_ACMPNE) {
            // getStackSize() returns 2, so the slots are 0 and 1
            frames[i].getStack(0); // this is probably 1
            frames[i].getStack(1); // this is probably 10
            // but each one is a BasicValue, so I can't read the actual values to check whether the jump is useless
        }
    }
}

The last thing I tried was to work out how many instructions feed the jump so that I could delete them (of course this doesn't detect whether the code is useless, but at least I can delete it).

To test this, I created a method that returns a constant int, so I can detect whether getValue is called in the instructions feeding the jump (if I detect the invoke, I delete those instructions and the jump itself):

Example:

if(1 > getValue())  { //getValue() returns 10
   return;
}

Code:

try {
    Analyzer<BasicValue> an = new Analyzer<BasicValue>(new BasicInterpreter());
    Frame<BasicValue>[] frames = an.analyze(cn.name, mn);
    ArrayList<AbstractInsnNode> nodesR = new ArrayList<>();

    for (int i = 0; i < frames.length; i++) {
        AbstractInsnNode insn = mn.instructions.get(i);
        if (insn instanceof JumpInsnNode && insn.getOpcode() >= IFEQ && insn.getOpcode() <= IF_ACMPNE) {
            ArrayList<AbstractInsnNode> toRemove = new ArrayList<>();

            // I started from 1 and added 2 to getMaxStackSize() because I wasn't getting all the instructions
            for (int ia = 1; ia < frames[i].getMaxStackSize() + 2; ia++) {
                toRemove.add(mn.instructions.get(i - ia));
            }

            toRemove.add(insn); // I add the jump to the list

            for (AbstractInsnNode candidate : toRemove) {
                if (candidate.getOpcode() == INVOKESTATIC) { // the invokestatic is getValue
                    nodesR.addAll(toRemove);
                    break;
                }
            }
        }
    }

    for (AbstractInsnNode node : nodesR) {
        mn.instructions.remove(node);
    }
} catch (AnalyzerException e) {
    e.printStackTrace();
}

This code is probably horrible and unoptimized, but I tried a LOT of things without any success. getMaxStackSize() doesn't return a number that is 100% correct for this (sometimes it doesn't account for additions etc., so the code deletes other instructions such as labels).

What I'm trying to do:

Parse a method, check whether a conditional jump will always be false (so the code inside never gets executed), and then remove it. I tried two different ways:

  • Use a BasicInterpreter to check whether the jump is evaluated with constant values, then evaluate it to see whether it will always be false
  • Check whether the jump's condition calls a certain method (for example getValue(), which returns 10, compared against 1), then remove it

What I don't understand:

  • I think there is a frame for each instruction, and that it contains the local variable table and the values the instruction uses (the StackMap?). For example, if the instruction compares whether one int is less than another, the frame would contain [II], right?
  • I don't know whether I can use a BasicInterpreter to test whether comparing two constant ints always gives the same result
  • Is the StackMap the same thing as the stack? Or is it different, e.g. a part of the stack that contains the values the instruction needs?
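
On the BasicInterpreter point: a BasicValue only records the type of each stack slot, so an analysis that wants to decide a comparison has to track the constants itself. A minimal sketch of that idea follows, written as a stdlib-only toy model rather than ASM code. The instruction strings ("push", "call" and so on), the `Slot` class and the `evaluate` method are hypothetical names made up for illustration; only `iadd`, `imul`, `if_icmpeq` and `if_icmple` correspond to real JVM instructions, and a real implementation would put the same logic into a custom ASM interpreter.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of constant tracking on an operand stack; not ASM code. */
final class ConstantStackSketch {
    enum Outcome { ALWAYS_TAKEN, NEVER_TAKEN, UNKNOWN }

    /** One stack slot: either a known int constant or an unknown value. */
    static final class Slot {
        static final Slot UNKNOWN = new Slot(false, 0);
        final boolean known;
        final int value;
        private Slot(boolean known, int value) { this.known = known; this.value = value; }
        static Slot of(int value) { return new Slot(true, value); }
    }

    /** Simulates straight-line code that ends in a two-operand int comparison. */
    static Outcome evaluate(String... code) {
        Deque<Slot> stack = new ArrayDeque<>();
        for (String insn : code) {
            String[] parts = insn.split(" ");
            switch (parts[0]) {
                case "push": // hypothetical stand-in for iconst_x, bipush, sipush
                    stack.push(Slot.of(Integer.parseInt(parts[1])));
                    break;
                case "call": // hypothetical: an invocation whose result was not analyzed
                    stack.push(Slot.UNKNOWN);
                    break;
                case "iadd":
                case "imul": {
                    Slot right = stack.pop();
                    Slot left = stack.pop();
                    if (!left.known || !right.known) {
                        stack.push(Slot.UNKNOWN); // one unknown operand makes the result unknown
                    } else if (parts[0].equals("iadd")) {
                        stack.push(Slot.of(left.value + right.value));
                    } else {
                        stack.push(Slot.of(left.value * right.value));
                    }
                    break;
                }
                case "if_icmpeq":
                case "if_icmple": {
                    Slot right = stack.pop();
                    Slot left = stack.pop();
                    if (!left.known || !right.known) {
                        return Outcome.UNKNOWN;
                    }
                    boolean taken = parts[0].equals("if_icmpeq")
                            ? left.value == right.value
                            : left.value <= right.value;
                    return taken ? Outcome.ALWAYS_TAKEN : Outcome.NEVER_TAKEN;
                }
                default:
                    throw new IllegalArgumentException("unsupported: " + insn);
            }
        }
        return Outcome.UNKNOWN; // no comparison reached
    }

    public static void main(String[] args) {
        System.out.println(evaluate("push 1", "push 10", "if_icmple"));                   // ALWAYS_TAKEN
        System.out.println(evaluate("push 4", "push 4", "imul", "push 16", "if_icmpeq")); // ALWAYS_TAKEN
        System.out.println(evaluate("push 1", "call", "if_icmple"));                      // UNKNOWN
    }
}
```

A guard such as `if (a > b) { ... }` is typically compiled to an `if_icmple` that jumps over the body, so ALWAYS_TAKEN in the first case means the guarded body is never executed. The third case shows why a method call stops the analysis unless the called method is analyzed as well.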
  • I think you are confused about what stackmaps do and how bytecode works. Could you please describe what you wish to accomplish at a high level? – Antimony Mar 14 '19 at 01:51
  • I added to my post the answer for what you said because it was too long for a comment (and you're right). – outrage Mar 14 '19 at 07:26
  • You’ve already found and linked the answer by yourself. Now you have to read the answer carefully. E.g. the part where it says “*…a `BasicValue`, which is suitable for verifying code or calculating stackmap frames, but not to get the actual values*”. Without integrating the interpreter of that answer (or writing something similar yourself), you won’t get any values. It’s not clear why you think, a frame would be `null` at any point. – Holger Mar 14 '19 at 08:10
  • I think it's worth pointing out that runtime optimization in systems like Hotspot will do this for you, albeit at runtime with some cost. The value of optimizing the code in the class file depends on the runtime target. – Charles Forsythe Mar 14 '19 at 12:19
  • @Holger Ok so I did what the answer said and I can get the values if it's constant (bipush, iconst, etc) but if it's a method which always return the same result I still need to execute it to get the result (I don't know how but I will try to figure it out) or ASM can predict it ? Edit : The main question is : Can ASM provide the calculated used values (the **result** of 4*4 or the **result** of a method) or only give the used values (just 4*4 and the invoke instruction) ? Or maybe ASM can test the jump ? – outrage Mar 14 '19 at 21:46
  • As said in that answer, you can extend the solution, e.g. to do math instruction if their arguments are predictable, to make their result predictable. The same applies to method invocations as well, but in case of `invokevirtual` and `invokeinterface`, you first have to determine whether the actual target method is predictable, i.e. the invocation can not end up at an overriding method. If predictable, you can use the same algorithm to analyze the target method and check whether its return value is predictable, to substitute the invocation with the value when possible. – Holger Mar 15 '19 at 07:42
  • @Holger Ok but can I get the size of the stackmap to delete the jump ? I mean if I detect that it's deadcode, can I get the size of the jump to delete it automatically or do I need to search by myself to determine its size (then delete it). – outrage Mar 18 '19 at 19:33
  • The “size of the jump” is irrelevant. When you use ASM, the [`JumpInsnNode`](https://asm.ow2.io/javadoc/?org/objectweb/asm/tree/JumpInsnNode.html) will have a reference to the `LabelNode` identifying the target. You will find that label node within the instruction list. But just for clarification: when a branch is predictably never taken, all you can remove, is the branch instruction itself and the preceding instructions related to evaluating the condition. Only when a branch is always taken, is a forward branch, and there is no other code flow entering the instructions, you may remove them. – Holger Mar 19 '19 at 12:39
  • @Holger Sorry, I misspoke, I meant the size of the instructions used by the jump (iconst_1 -> iconst_2 -> if_cmplt will return the two iconst). The "preceding instructions related to evaluating the condition". If there is just 2 iconst it's easy to get but if there is some operations or some invokes, I think it may be harder. – outrage Mar 19 '19 at 17:33
  • Yes, that’s hard, as there can be arbitrary instruction sequences producing the test condition. The good news is, you don’t need to support every possibility, as you are going to elide only those tests, whose outcome you can predict, anyway. Then, you have to care for side effects, e.g. for a statement like `if((someVar=expression)==value) …` you have to keep the expression evaluation and assignment if `someVar` is/might be subsequently used. You have to consider which scenarios you want to handle and draw the line somewhere. – Holger Mar 20 '19 at 07:57
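
One way to sidestep the problem of finding the "preceding instructions related to evaluating the condition" is not to look for them at all. This is a different technique from the one discussed in the comments: once a jump's outcome is known, the jump can be replaced by an instruction that simply discards its operands, plus a `goto` if the branch is always taken. Any side effects of the operand expressions are preserved, and the leftover constant pushes can be cleaned up by a later pass if desired. The sketch below is a stdlib-only toy model over instruction strings, not ASM code; `rewrite` and the string format are hypothetical. With the ASM tree API the equivalent would presumably be swapping the `JumpInsnNode` for an `InsnNode` with `POP` or `POP2` and, where needed, a `JumpInsnNode` with `GOTO`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of rewriting a predictable conditional jump; not ASM code. */
final class JumpRewriteSketch {
    /**
     * Returns a copy of the code in which the conditional jump at the given
     * index is replaced by an operand discard, followed by a goto when the
     * branch is always taken.
     */
    static List<String> rewrite(List<String> code, int index, boolean alwaysTaken) {
        String[] parts = code.get(index).split(" "); // e.g. "if_icmple L1"
        // if_icmpXX and if_acmpXX consume two one-slot operands, the other ifXX forms consume one.
        String discard = parts[0].startsWith("if_") ? "pop2" : "pop";

        List<String> out = new ArrayList<>(code.subList(0, index));
        out.add(discard); // operands are still evaluated, then dropped
        if (alwaysTaken) {
            out.add("goto " + parts[1]); // unconditional jump to the original target
        }
        out.addAll(code.subList(index + 1, code.size()));
        return out;
    }

    public static void main(String[] args) {
        List<String> code = Arrays.asList(
                "iconst_1", "invokestatic getValue", "if_icmple L1", "return", "L1:");
        System.out.println(rewrite(code, 2, false)); // [iconst_1, invokestatic getValue, pop2, return, L1:]
        System.out.println(rewrite(code, 2, true));  // [iconst_1, invokestatic getValue, pop2, goto L1, return, L1:]
    }
}
```

In the always-taken case the instructions between the `goto` and its target become removable only if no other control flow enters them, as noted in the comments above. ASM's `Analyzer` documents that the frame of an unreachable instruction is `null`, so re-running the analysis after the rewrite is one way to find them.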
