I have rewritten the question (it remains the same, just with less background noise) in the hope of avoiding confusion about unrelated details. As a result, some of the comments below may seem out of context.

When analyzing Java bytecode, what is the easiest way to find all the possible reference types passed as operands to a given bytecode instruction? I'm interested in the type of the reference: for example, that a given putfield instruction will receive an Integer, or that it might receive either an Integer or a Float.

For example, consider this code block:

   0:   aload_1
   1:   invokestatic    #21; //Method java/lang/Integer.valueOf:(Ljava/lang/String;)Ljava/lang/Integer;
   4:   astore_2
   5:   aload_2
   6:   ifnull  17
   9:   aload_0
   10:  aload_2
   11:  putfield    #27; //Field value:Ljava/lang/Number;
   14:  goto    25
   17:  aload_0
   18:  iconst_0
   19:  invokestatic    #29; //Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
   22:  putfield    #27; //Field value:Ljava/lang/Number;
   25:  return

We can deduce that the putfield instruction at pc 11 will receive a ref type of Integer.

0: aload_1 pushes a ref of type String (the method parameter)
1: invokestatic pops that ref and pushes a ref of type Integer (the invoked method's return type)
4: astore_2 pops the Integer ref and stores it in local variable 2
5: aload_2 pushes the Integer ref from local variable 2
6: ifnull pops the Integer ref and conditionally jumps to pc 17
9: aload_0 pushes "this"
10: aload_2 pushes the Integer ref
11: putfield: we know the value the instruction will store in the field is a ref of type Integer
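
To make the walkthrough concrete, here is a minimal sketch of that kind of stack simulation in plain Java. Everything in it is hypothetical and for illustration only: the class name StackTypeSim, the hand-encoded instruction table, and the assumption that the ifnull branch is not taken. A real analysis would decode the class file and follow every control flow path.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy simulator, not a real library API. It understands only the
// instructions from the listing above and follows the fall-through path,
// i.e. it assumes the ifnull branch at pc 6 is not taken.
class StackTypeSim {

    // Each entry is {pc, mnemonic, operand}. For invokestatic the operand is the
    // return type; a single reference argument is assumed.
    private static final String[][] CODE = {
        {"0", "aload", "1"},
        {"1", "invokestatic", "java.lang.Integer"},
        {"4", "astore", "2"},
        {"5", "aload", "2"},
        {"6", "ifnull", "17"},
        {"9", "aload", "0"},
        {"10", "aload", "2"},
        {"11", "putfield", "value"},
    };

    /** Returns the operand stack (bottom first) just before the instruction at targetPc runs. */
    static List<String> stackBefore(int targetPc) {
        List<String> stack = new ArrayList<>();
        Map<Integer, String> locals = new HashMap<>();
        locals.put(0, "this");              // receiver of the instance method
        locals.put(1, "java.lang.String");  // the method parameter

        for (String[] insn : CODE) {
            if (Integer.parseInt(insn[0]) == targetPc) {
                return new ArrayList<>(stack); // snapshot before the instruction executes
            }
            switch (insn[1]) {
                case "aload":        // push the type held in the local variable
                    stack.add(locals.get(Integer.parseInt(insn[2])));
                    break;
                case "astore":       // pop the top type into the local variable
                    locals.put(Integer.parseInt(insn[2]), pop(stack));
                    break;
                case "invokestatic": // pop the single argument, push the return type
                    pop(stack);
                    stack.add(insn[2]);
                    break;
                case "ifnull":       // pop the tested reference, then fall through
                    pop(stack);
                    break;
                case "putfield":     // pop the value and the object reference
                    pop(stack);
                    pop(stack);
                    break;
                default:
                    throw new IllegalStateException("unsupported: " + insn[1]);
            }
        }
        throw new IllegalArgumentException("no instruction at pc " + targetPc);
    }

    private static String pop(List<String> stack) {
        return stack.remove(stack.size() - 1);
    }

    public static void main(String[] args) {
        System.out.println(stackBefore(11)); // prints [this, java.lang.Integer]
    }
}
```

The snapshot at pc 11 has "this" at the bottom and the Integer ref on top, which is the pair of operands putfield consumes.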

Do any of the bytecode/code analysis libraries do this for me, or do I have to write it myself? The ASM project has an Analyzer, which seems like it might do part of the work for me, but not enough to justify switching to it.

EDIT: I have done my homework and have studied the Java VM Spec.

Sami Koivu
  • You want to compile .java files. You don't have the dependencies. You extract the signatures from the compiled (.class) versions of the .java files... uhm, why did you want to compile the .java files in the first place, if you had the .class files to do the analysis on? – aioobe Jun 04 '11 at 10:48
  • @aioobe: OK, a valid question (duly upvoted), there are several scenarios. One is, I lost the .java file, and I want to change the class. Another one is that I may have never had the .java in the first place (decompilation). Yet another one I described in the question itself, when I'm creating a new class and I need the dependencies of a set of compiled classes just for the compiler. – Sami Koivu Jun 04 '11 at 14:57
  • I'm not really looking to validate the idea, that's what my PoC was for and it validated the idea for me nicely, although I'm happy to try and clarify. But at the end of the day, I presented the idea to help understand the focus of my question, which is the technical part about data flow analysis. That's where, dear StackOverflow, I could really use your help. – Sami Koivu Jun 04 '11 at 15:03
  • @EJP: Ouch. That has all the helpfulness of "you are ugly and your shoelaces are tied wrong". Instead, please help me improve my title. I'm not sure if my description is not very clear - the idea is not to compile, but instead to generate dummy classes as stand-ins for missing dependencies. – Sami Koivu Jun 05 '11 at 07:41
  • I rewrote the question and removed all the unnecessary background noise. – Sami Koivu Jun 05 '11 at 10:15
  • @EJP: I respectfully disagree. – Sami Koivu Jun 07 '11 at 00:43

3 Answers

ASM's Analyzer.analyze(...) method seems to do exactly what you need, and if it doesn't, you have the option of modifying it. That would be a better approach than starting from scratch.

Another idea would be to see if you can find a bytecode verifier that is implemented in Java. A verifier must use data flow analysis to ensure that methods don't get called with the wrong type of parameters.
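
The core of such a data flow analysis is the merge step: when two paths reach the same instruction with different reference types on the stack, the analysis has to combine them. Here is a minimal sketch of a verifier-style merge in plain Java. It assumes both classes are loadable through reflection and it ignores interfaces and primitives; the TypeMerge class and its method are hypothetical names for illustration, not part of any library.

```java
// Hypothetical helper, not a real library API: merges two reference types the
// way a simple verifier might, by walking up the superclass chain.
class TypeMerge {

    /** Returns the most specific class that both a and b can be assigned to (interfaces ignored). */
    static Class<?> commonSuperclass(Class<?> a, Class<?> b) {
        Class<?> c = a;
        // Climb from a towards Object until the candidate also accepts b.
        while (c != null && !c.isAssignableFrom(b)) {
            c = c.getSuperclass();
        }
        // getSuperclass() returns null for interfaces; fall back to Object.
        return c == null ? Object.class : c;
    }

    public static void main(String[] args) {
        System.out.println(commonSuperclass(Integer.class, Float.class).getName());   // prints java.lang.Number
        System.out.println(commonSuperclass(Integer.class, Integer.class).getName()); // prints java.lang.Integer
    }
}
```

Note that this merge loses information: Integer and Float collapse to Number. Since the question asks for all the possible types, an analysis for that purpose would probably keep a set of candidate types at each join point rather than a single merged type.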

Stephen C

I needed to do pretty much the same thing on a project of mine. You might want to take a look at the source code here (in the visitEnd() method). It uses an Analyzer from the ASM project to take a 'snapshot' of the stack frame at each PUTFIELD instruction. Those snapshots are stored and can be retrieved once the visitor has finished. Part of the information contained in a snapshot is the type of the reference at the top of the stack.

The class linked above is designed to be subclassed; an example of a subclass is here (check out visitMethod()). When I needed to do this, I turned to StackOverflow too. You may want to check out the question I asked at the time, particularly the link in the accepted answer, which provided the basis of the code I eventually used.

Grundlefleck

We can deduce that the putfield instruction at pc 11 will receive a ref type of Integer.

You don't have to deduce that; it's part of the definition of putfield.

Before writing your application, you should spend some time reading the VM Spec. Section 6 will give you the specified behavior of all bytecode operations.

Anon
  • Thanks, +1 for the link to the VM Spec. A wonderful piece of documentation. It's good advice and I know the spec by heart. I have an open source project out there somewhere that does Java bytecode manipulation. – Sami Koivu Jun 07 '11 at 00:53
  • I disagree about your first point, though. From the putfield instruction alone I can extract only that it will take an object ref and a value from the stack, and store the value in the field called value, of type Number, of the instance that the object ref points to. What I want to know is what the putfield instruction by itself doesn't tell me: that, because of the instructions before it, the value on the stack is a reference of type Integer. – Sami Koivu Jun 07 '11 at 00:59
  • I updated the question slightly to make my objective clearer. – Sami Koivu Jun 07 '11 at 01:21