How to delete break statements only from switch using ASM?

Question

I am using the ASM framework to manipulate Java bytecode. I need to delete break statements, but only those that belong to switch statements. My attempts so far delete goto instructions from the bytecode, but not only the ones connected with a switch (for example, all of them in the class...).

What do you think about it?

I think this probably isn't going to do whatever it is you think you want it to do, independent of the question of recognizing just those gotos. What problem are you actually trying to solve? — keshlam, Mar 22 '14 at 14:58
This is a test case, nothing more. I managed to remove breaks from a method that contains a switch, but that isn't a solution, because breaks can occur in many places, not only in a switch. I was thinking about a connection between the goto and switch instructions in the bytecode, but without any results... — marcin, Mar 22 '14 at 15:07

score 0 · Answer 1 · answered Mar 28 '14 at 19:15

There is no explicit link between break statements in the Java source code and anything in the Java bytecode. Some of the language constructs that use break statements may be compiled to goto opcodes, but I doubt you can establish a reliable link between them.

The only thing you can do is capture the line numbers of the break statements in the Java sources (assuming those lines contain no other statements) and then, using bytecode compiled with line number information, find the opcode(s) for those lines.
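
A minimal sketch of that idea, written against a simplified instruction model rather than the real ASM API. The class `LineBasedBreakFilter`, its nested `Insn` type, and the `breakLines` set are hypothetical names introduced for illustration; the set of break lines is assumed to have been collected from the source by some other means. With ASM itself, the line information would come from `MethodVisitor.visitLineNumber` (or `LineNumberNode` in the tree API) and is only available when the class was compiled with line number debug info.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class LineBasedBreakFilter {

    /** Simplified stand-in for a bytecode instruction; this is not an ASM type. */
    static final class Insn {
        final String opcode; // e.g. "GOTO", "ILOAD", "RETURN"
        final int line;      // source line taken from the line number table

        Insn(String opcode, int line) {
            this.opcode = opcode;
            this.line = line;
        }
    }

    /**
     * Returns a copy of the instruction list without the GOTOs that are
     * attributed to one of the given source lines. All other instructions,
     * including GOTOs on other lines, are kept.
     */
    static List<Insn> dropGotosOnLines(List<Insn> code, Set<Integer> breakLines) {
        List<Insn> kept = new ArrayList<>();
        for (Insn insn : code) {
            boolean isBreakGoto =
                    "GOTO".equals(insn.opcode) && breakLines.contains(insn.line);
            if (!isBreakGoto) {
                kept.add(insn);
            }
        }
        return kept;
    }
}
```

This only works as well as the assumption stated above holds: if a break shares its line with another statement that also produces a goto, both are removed.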

score 0 · Answer 2 · answered Apr 01 '14 at 19:05

A goto instruction belongs to a switch statement if it appears in the context of one of the two switch bytecode instructions. The tricky part is to decide whether it represents a break. Both lookupswitch and tableswitch have a list of branch targets, and if the instruction right before such a branch target is a goto, it may represent a break. This could be verified by checking whether all, or at least most, of these goto instructions have the same target, which would be the instruction following the switch statement. Once you have identified the bytecode location of the instruction following the switch statement, you can consider all gotos to that location to be a break;.
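
The steps above can be sketched roughly as follows. This is a simplified model, not ASM code: `SwitchBreakHeuristic` and its nested `Insn` type are hypothetical, and branch targets are plain indices into the instruction list. In ASM's tree API the corresponding nodes would be `TableSwitchInsnNode` and `LookupSwitchInsnNode` (fields `labels` and `dflt`) and `JumpInsnNode` (field `label`). The sketch also assumes that the guessed end of the switch lies after the switch instruction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

final class SwitchBreakHeuristic {

    /** Simplified instruction; targets are indices into the instruction list. */
    static final class Insn {
        final String opcode; // "TABLESWITCH", "LOOKUPSWITCH", "GOTO" or anything else
        final int[] targets; // case/default targets for a switch, one target for GOTO

        Insn(String opcode, int... targets) {
            this.opcode = opcode;
            this.targets = targets;
        }
    }

    /** Returns the indices of GOTOs that look like a break of the switch at switchIndex. */
    static List<Integer> likelyBreaks(List<Insn> code, int switchIndex) {
        // Distinct branch targets of the switch (several cases may share one target).
        TreeSet<Integer> caseStarts = new TreeSet<>();
        for (int target : code.get(switchIndex).targets) {
            caseStarts.add(target);
        }

        // Step 1: a GOTO right before a branch target votes for its own destination.
        Map<Integer, Integer> votes = new TreeMap<>();
        for (int start : caseStarts) {
            int prev = start - 1;
            if (prev > switchIndex && "GOTO".equals(code.get(prev).opcode)) {
                votes.merge(code.get(prev).targets[0], 1, Integer::sum);
            }
        }

        // Step 2: the most common destination is the guessed end of the switch.
        int exit = -1;
        int best = 0;
        for (Map.Entry<Integer, Integer> e : votes.entrySet()) {
            if (e.getValue() > best) {
                best = e.getValue();
                exit = e.getKey();
            }
        }

        // Step 3: every GOTO between the switch and that location which jumps
        // there is reported. If no exit was found, or it lies before the
        // switch, the loop does not run and the result is empty.
        List<Integer> result = new ArrayList<>();
        for (int i = switchIndex + 1; i < exit && i < code.size(); i++) {
            Insn insn = code.get(i);
            if ("GOTO".equals(insn.opcode) && insn.targets[0] == exit) {
                result.add(i);
            }
        }
        return result;
    }
}
```

For a switch at index 0 with targets 1, 3 and 5, and GOTOs at indices 2 and 4 that both jump to index 6, `likelyBreaks(code, 0)` reports indices 2 and 4.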

But such a heuristic can fail badly. Consider the following code:

outer: for( … ) {
  …
  inner: for(…) {
    switch(…) {
      case 1: …
        continue inner; // jumps to the next iteration of inner
      case 2: …
        continue outer; // jumps to the next iteration of outer
      case 3: …
        // a break: formally jumps to the end of the switch but since
        // there is no follow-up statement, most compilers will optimize
        // this to a jump to the next inner iteration just like <continue>
        break;
      case 4:
        …
        // no break but nonetheless will be followed by a <goto>
    }
  }
}

score 0 · Answer 3 · answered Apr 02 '14 at 13:41

In general, all non-exceptional, unconditional branches in Java code are compiled down to goto (or goto_w). That includes break statements, continue statements, unconditional loops, and any number of other control flow patterns. You will not be able to derive any simple mapping from break statements in Java code to goto opcodes. You can determine which jumps act like a switch break by doing some control flow analysis, but it won't be perfect.
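
One way to see this for yourself is to compile a small class and disassemble it with `javap -c`. The class below is a hypothetical example, not taken from the question. The comments describe what a compiler will typically emit; the exact instructions can differ between compilers and versions.

```java
final class GotoSources {

    /** Sums the non-negative values that come before the first occurrence of stop. */
    static int sumUntil(int[] xs, int stop) {
        int sum = 0;
        for (int x : xs) {
            if (x == stop) {
                break;      // leaving a loop: usually a goto
            }
            if (x < 0) {
                continue;   // skipping to the next iteration: usually a goto
            }
            sum += x;
        }                   // the jump back to the loop head is usually a goto too
        return sum;
    }

    /** Uses break inside a switch. */
    static String name(int k) {
        String s;
        switch (k) {
            case 1:
                s = "one";
                break;      // switch break: usually a goto
            case 2:
                s = "two";
                break;
            default:
                s = "other";
        }
        return s;
    }
}
```

After compiling, `javap -c GotoSources` lists the bytecode of both methods. Nothing in that listing marks which goto came from a break, a continue, or a loop, which is why the instructions have to be classified by analyzing their targets.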

A good learning experience might be to pick apart the source code from a Java decompiler, as decompilers must reconstruct switch statements and figure out how to represent the jumps as break, continue, etc. Procyon and Krakatau are both open source. I wrote the former, but the code base is large and daunting, so it may not be the best choice.
