How to programmatically decompile values into a source literal?

Question

I'm busy with a simple decompiler for Android. I want to make a nice decompiled view. I use dex2jar.

Let's say I have a field declaration: public byte[] test = new byte[2]; With the FieldVisitor I get:

 public as modifier
 byte[] as type
 test as name
 and an Object as value <--

If I have an object like byte[2], is it possible to get the new byte[2] literal back?

There is no 'source literal' here. — user207421, Jul 17 '12 at 22:41

millimoose · Answer 1 · 2012-07-18T11:41:05.500

The code:

public class Foo {
    public byte[] bar = new byte[3];
}

compiles to the same as:

class Foo2 {
    public byte[] bar;

    public Foo2() {
        this.bar = new byte[3];
    }
}

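There is no "literal" here. Field initialisers and initialiser blocks are copied, in source code order, into every constructor that does not delegate to another one via this(...), right after the superclass constructor call, so the information you're looking for isn't preserved as such. You'd have to look at the decompiled code of those constructors and analyze it, and even then the result is ambiguous: the bytecode doesn't tell you whether an assignment was written as a field initialiser or inside the constructor body.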
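The ordering can be observed at runtime without reading any bytecode. The following is a minimal sketch; `InitOrderDemo`, `Sample` and `note` are names made up for this illustration. Each initialisation step appends a line to a log, so the log shows the field initialisers and the initialiser block running, in source order, before the body of whichever constructor was called.

```java
import java.util.ArrayList;
import java.util.List;

public class InitOrderDemo {
    // Records the order in which initialisation steps run.
    static final List<String> LOG = new ArrayList<>();

    static class Sample {
        int first = note("field initialiser: first");

        { note("initialiser block"); }

        int second = note("field initialiser: second");

        Sample() {
            note("constructor body: Sample()");
        }

        Sample(int ignored) {
            note("constructor body: Sample(int)");
        }

        // Logs the step and returns a dummy value for the field.
        private static int note(String step) {
            LOG.add(step);
            return 0;
        }
    }

    public static void main(String[] args) {
        new Sample();
        new Sample(1);
        LOG.forEach(System.out::println);
    }
}
```

Both constructors log the same three initialiser steps before their own body, which is why a decompiler only ever sees assignments inside constructors.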

The opcodes for the Foo2 constructor, as printed by javap -c, are:

0:  aload_0
1:  invokespecial   #1; //Method java/lang/Object."<init>":()V
4:  aload_0
5:  iconst_3
6:  newarray byte
8:  putfield    #2; //Field bar:[B
11: return

The indices 4 through 8 correspond to the line this.bar = new byte[3];. They mean roughly:

Push the this reference onto the stack.
Push the integer 3 onto the stack.
Pop an integer (the 3) off the top of the stack, create a byte array of that length, push the array onto the stack.
Set the value that's on the top of the stack (the byte array) as the value of field #2 (that's bar) of the object that's second-from-the-top on the stack (this). (Also, pop the two off the stack.)

This doesn't really map to the original Java source very well; as you see, the part that corresponds to "new byte[3]" is inserted in the middle of the part that implements "this.bar = …" and things happen out of order even for an expression as simple as this. Reconstructing statements from bytecode probably isn't going to be trivial – they aren't delimited explicitly, a statement ends when you pop everything off a stack.
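One common way to work the stack manipulations "backwards" is to replay the instructions on a stack of source-text fragments instead of real values. The sketch below does that for exactly the seven instructions listed above and nothing more. `StackReplay` and `Insn` are hypothetical names, the operands are assumed to be already resolved from the constant pool, and this is not how dex2jar or any particular decompiler is implemented.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class StackReplay {
    // Hypothetical, simplified instruction: a mnemonic plus an already-resolved operand.
    static final class Insn {
        final String op;
        final String operand;

        Insn(String op, String operand) {
            this.op = op;
            this.operand = operand;
        }
    }

    // Replays the code on a stack of expression strings and collects statements.
    static List<String> replay(List<Insn> code) {
        Deque<String> stack = new ArrayDeque<>();
        List<String> statements = new ArrayList<>();
        for (Insn insn : code) {
            switch (insn.op) {
                case "aload_0":
                    stack.push("this");
                    break;
                case "iconst_3":
                    stack.push("3");
                    break;
                case "newarray": {
                    // Consumes the length expression, produces an array expression.
                    String length = stack.pop();
                    stack.push("new " + insn.operand + "[" + length + "]");
                    break;
                }
                case "putfield": {
                    // Value is on top, the target object is right below it.
                    String value = stack.pop();
                    String target = stack.pop();
                    statements.add(target + "." + insn.operand + " = " + value + ";");
                    break;
                }
                case "invokespecial":
                    // Assumption: only the no-argument superclass constructor call occurs.
                    stack.pop();
                    statements.add("super();");
                    break;
                case "return":
                    break;
                default:
                    throw new IllegalArgumentException("unsupported opcode: " + insn.op);
            }
        }
        return statements;
    }

    public static void main(String[] args) {
        List<Insn> constructor = Arrays.asList(
                new Insn("aload_0", null),
                new Insn("invokespecial", "java/lang/Object.<init>"),
                new Insn("aload_0", null),
                new Insn("iconst_3", null),
                new Insn("newarray", "byte"),
                new Insn("putfield", "bar"),
                new Insn("return", null));
        replay(constructor).forEach(System.out::println);
    }
}
```

For this input the replay yields `super();` followed by `this.bar = new byte[3];`. A statement is emitted each time an instruction empties the stack, which matches the observation that statements are only delimited implicitly. Branches, local variables and instructions that duplicate stack entries would need considerably more machinery.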

Okay, that's cool to learn! But in my dex2jar output I do get these values that are assigned to the fields. So should I guess they do the hard work for me and somehow process the constructor? Still, how would I get the String `"new byte[3]"` from a `byte[3]` object? — Peterdk, Jul 17 '12 at 23:20
@Peterdk I'm not sure, maybe the Dalvik format preserves this information. I'm going by what `javac` and `javap` tell me, which should hold for `.jar`s. I'll edit my answer to include the relevant `javap -c` output. All of which is to say: seeing as the initialiser expressions can be pretty complex, and the compiled bytecode doesn't map to the original source code nicely, I can't really tell you an easy way to reconstruct the original expressions. You'll have to analyze the method code and work out the stack manipulations "backwards". — millimoose, Jul 17 '12 at 23:52
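If all that is needed is some expression that evaluates to an equal array, rather than the expression the author originally wrote, the value can be rendered directly with reflection. This is a minimal sketch under that assumption: `ValueToSource` and `arrayExpression` are hypothetical names, and only `byte[]`, `short[]` and `int[]` are handled. A zero-filled array is printed as `new byte[3]` even if the source said `new byte[] {0, 0, 0}`, since the two cannot be told apart from the value.

```java
import java.lang.reflect.Array;

public class ValueToSource {
    // Renders an integral primitive array as a Java expression producing an equal array.
    static String arrayExpression(Object array) {
        Class<?> component = array.getClass().getComponentType();
        if (component != byte.class && component != short.class && component != int.class) {
            throw new IllegalArgumentException("unsupported value: " + array.getClass());
        }
        int length = Array.getLength(array);
        boolean allZero = true;
        StringBuilder elements = new StringBuilder();
        for (int i = 0; i < length; i++) {
            // getLong widens byte, short and int elements without loss.
            long element = Array.getLong(array, i);
            allZero &= element == 0;
            if (i > 0) {
                elements.append(", ");
            }
            elements.append(element);
        }
        String name = component.getName(); // "byte", "short" or "int"
        return allZero
                ? "new " + name + "[" + length + "]"
                : "new " + name + "[] {" + elements + "}";
    }

    public static void main(String[] args) {
        System.out.println(arrayExpression(new byte[3]));
        System.out.println(arrayExpression(new byte[] {1, 0, -2}));
    }
}
```

The first call prints `new byte[3]` and the second prints `new byte[] {1, 0, -2}`.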
