
Given a classpath (e.g. a set of jar files), I would like to know: do any of these jar files make a method call (ignoring reflection) to a method which doesn't exist within the class path?

For example, if I had only foo.jar on my class path, and it has a class which makes a call to com.bar.Something#bar(String), and that method did not exist in foo.jar, then I would be told the method doesn't actually exist.

Luke
  • To the closers: What detail are you missing? The question is perfectly clear. An answer would presumably mention that jars can be read, class files are inside and can be read with e.g. ASM or bytebuddy, and INVOKEVIRTUAL opcodes can be found, or probably easier: Just scan the constant pool. These method refs can then be looked for in a set of jars. It's not _easy_, but what OP wants is perfectly clear. Voted to reopen. – rzwitserloot Mar 30 '21 at 00:53
  • I've written something like this in the past, using ASM (not certain whether that's still the best option). Note that there's a pattern for writing backward-compatible code where you call a method which may not exist, catch the exception and do something else, so you can get false results if your code may do this. – tgdavies Mar 30 '21 at 00:58
  • I can't believe that this question was closed for lack of clarity; what a poor excuse. That aside, I don't know enough about ASM to say whether that is how I should solve this problem or not. – Luke Mar 30 '21 at 01:00
  • @rzwitserloot *"What detail are you missing? The question is perfectly clear."* I'm missing an actual question. It is entirely unclear to me what is being asked here. --- Is OP asking for a tool that does this? Off-topic. --- Is OP asking us to write the code for this? Off-topic. --- Is OP asking for a library to help read .class files? Off-topic. --- What is OP asking for, that is not off topic? --- *(FYI: I'm not a closer, but I am a down-voter, since the question is unclear)* – Andreas Mar 30 '21 at 02:25
  • @Andreas I think it is worth reading rzwitserloot's fantastic answer, since it really does do a great job of answering the question with what is available in the Java ecosystem. Hopefully in the future you and others will be less keen to down-vote/close hard-to-answer questions, since doing so is not helpful. – Luke Mar 30 '21 at 02:35
  • @Luke The fact that rzwitserloot *guessed* what you're asking does not negate the fact that the question itself is very unclear, and not up to the standard of StackOverflow. We all down-voted and/or closed the question to hint/force you to **improve** the question quality, a task that you've failed at. See: [How do I ask a **good** question?](https://stackoverflow.com/help/how-to-ask) --- We care about the quality of this site, which is *why* we are *"keen to down-vote/close"*. It is *not* because the question is hard to answer. – Andreas Mar 30 '21 at 02:51
  • Your suggestions all require that I have knowledge of what the answer should be but I ask the question because I don't know what the answer is. – Luke Mar 30 '21 at 02:58

1 Answer


There are no tools that I am aware of that do this, and a JVM will not just blindly load all classes contained on its class path on boot. It just loads whatever you told it is the main class. Whenever it loads a class, it also loads that class's superclass and the interfaces it implements; the other types mentioned in its signatures (field types, method return types, method parameter types, and method exception types) are, at least on common JVMs such as HotSpot, generally resolved only when verification or execution actually needs them. Likewise, it loads the classes needed to execute a statement only when that statement is actually run. In other words, java (the VM) loads lazily. You cannot use it for this purpose.

What you can do is rather involved. Let's first tighten what you're asking for:

  1. Given a 'set of source jars' (source), verify each class file contained within.
  2. To verify a class, find all method and field accesses contained within all classes within source, and ensure that the mentioned field/method access actually exists, by comparing against a 'set of target jars' (target). Source and target may or may not be the same. For convenience you may wish to silently extend target to always include source.
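The comparison in step 2 can be sketched as plain set arithmetic. This is only an illustration under assumptions: the class name `MissingRefCheck`, the method `missing`, and the string key format "owner.name descriptor" are all made up for this sketch, and the exact-match lookup ignores inheritance (a method declared on a superclass would be reported as missing unless you also walk the type hierarchy).

```java
import java.util.Set;
import java.util.TreeSet;

class MissingRefCheck {
    /**
     * Returns every referenced member key that is defined neither in source
     * nor in target. Keys use a hypothetical "owner.name descriptor" format.
     * Note: exact matching only; inherited members are NOT resolved here.
     */
    static Set<String> missing(Set<String> referenced,
                               Set<String> definedInSource,
                               Set<String> definedInTarget) {
        Set<String> result = new TreeSet<>(referenced);
        result.removeAll(definedInSource); // target silently includes source
        result.removeAll(definedInTarget);
        return result;
    }
}
```

With the question's example, a reference to `com/bar/Something.bar (Ljava/lang/String;)V` (the `void` return type is assumed here) that appears in neither set would be the single element of the result.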

Any attempt to use the VM's classloading abilities (e.g. you load classes in with reflection directly) is problematic: That will run static initializers and who knows what kind of nasty side-effects that's going to have. It'll also be incredibly slow. Not a good idea.

What you'd want is not to rely on the VM itself, and to handroll your own code to do this; after all, class files are just files, you can read them, parse them, and take action based on their contents. Jar files can be listed and their contents can be read, from within java code - not a problem.
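As a minimal sketch of the "jar files can be listed and read" part, using only `java.util.jar` from the standard library: the class name `JarClassLister` and method `readClasses` are invented for this example. Nothing is loaded into the VM; the class files are treated as plain bytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

class JarClassLister {
    /** Returns entry name -> raw bytes for every .class entry in the jar. */
    static Map<String, byte[]> readClasses(Path jar) throws IOException {
        Map<String, byte[]> classes = new LinkedHashMap<>();
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                // Skip directories and non-class resources.
                if (entry.isDirectory() || !entry.getName().endsWith(".class")) {
                    continue;
                }
                try (InputStream in = jarFile.getInputStream(entry)) {
                    classes.put(entry.getName(), in.readAllBytes()); // Java 9+
                }
            }
        }
        return classes;
    }
}
```

The resulting byte arrays are what you would then hand to a class file parser such as ASM.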

The class file format is well described in the JVM Specification but is a very complicated format. I strongly suggest you use existing libraries that can read it. ASM comes to mind.

In practice, any method invocation is encoded in a class file using one of a few 'INVOKE' opcodes: normal method calls are INVOKEVIRTUAL or INVOKEINTERFACE, static methods are INVOKESTATIC, and constructors and initializers are INVOKESPECIAL. Field accesses (you did not mention this, but if you're going to verify for existence of referenced entities, surely you'd also want to take fields into account) are GETFIELD and PUTFIELD for instance fields, and GETSTATIC and PUTSTATIC for static fields.

However, none of these opcodes immediately encodes in full what it refers to. Instead, each encodes merely a small index number: that number is to be looked up in the class file's constant pool, where you find a fully qualified specification of the method/field actually being referred to. For example, an invocation of ArrayList's 'ensureCapacity' method is recorded, in class file format, as a method ref constant that itself points at two further constants: a class constant, whose name string is "java/util/ArrayList", and a name-and-type constant, which refers to the string "ensureCapacity" for the name and the string "(I)V" for the descriptor. (I is class-file-ese for the primitive int type, and the V after the parentheses is the return type; V is class-file-ese for void).
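To get a feel for descriptors such as "(I)V", the JDK's own `java.lang.invoke.MethodType.fromMethodDescriptorString` can decode them. The sketch below is for illustration only; `DescriptorDemo` and `describe` are invented names. Note that this call resolves the mentioned types through a class loader, which is fine for JDK types in a demo, but a real checker would presumably keep descriptors as plain strings to avoid loading anything.

```java
import java.lang.invoke.MethodType;
import java.util.stream.Collectors;

class DescriptorDemo {
    /** Renders a class-file method name + descriptor as a Java-like signature. */
    static String describe(String name, String descriptor) {
        // A null loader means "use the system class loader".
        MethodType type = MethodType.fromMethodDescriptorString(descriptor, null);
        String params = type.parameterList().stream()
                .map(Class::getSimpleName)
                .collect(Collectors.joining(", "));
        return type.returnType().getSimpleName() + " " + name + "(" + params + ")";
    }

    public static void main(String[] args) {
        // prints "void ensureCapacity(int)"
        System.out.println(describe("ensureCapacity", "(I)V"));
    }
}
```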

Therefore, there is an easy shortcut and there is no need to parse the bytecode contained in a class file. Just scan the constant pool - all you need to do is verify that every method and field ref in the constant pool is referring to an actual existing method or field.
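A library like ASM would normally do the parsing for you, but to show how little is involved in the constant pool scan itself, here is a hand-rolled, stdlib-only sketch. It is not a complete class file parser: it reads only the header and the constant pool, and `ConstantPoolScanner`, `memberRefs`, and the "owner.name descriptor" output format are assumptions made for this example. The tag numbers are the ones defined in the JVM Specification's class file chapter.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class ConstantPoolScanner {
    /** Returns "owner.name descriptor" for each field/method ref in the pool. */
    static List<String> memberRefs(byte[] classBytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes));
        if (in.readInt() != 0xCAFEBABE) throw new IOException("not a class file");
        in.readUnsignedShort(); // minor version
        in.readUnsignedShort(); // major version
        int count = in.readUnsignedShort();
        int[] tag = new int[count];
        int[] first = new int[count];  // first index operand of an entry
        int[] second = new int[count]; // second index operand of an entry
        String[] utf8 = new String[count];
        for (int i = 1; i < count; i++) { // slot 0 is unused by the format
            tag[i] = in.readUnsignedByte();
            switch (tag[i]) {
                case 1: utf8[i] = in.readUTF(); break;         // Utf8
                case 3: case 4: in.readInt(); break;           // Integer, Float
                case 5: case 6: in.readLong(); i++; break;     // Long, Double use 2 slots
                case 7: case 8: case 16: case 19: case 20:     // Class, String, MethodType,
                    first[i] = in.readUnsignedShort(); break;  // Module, Package
                case 9: case 10: case 11:                      // Field/Method/InterfaceMethod ref
                case 12: case 17: case 18:                     // NameAndType, Dynamic, InvokeDynamic
                    first[i] = in.readUnsignedShort();
                    second[i] = in.readUnsignedShort();
                    break;
                case 15:                                       // MethodHandle
                    in.readUnsignedByte(); in.readUnsignedShort(); break;
                default: throw new IOException("unknown constant tag " + tag[i]);
            }
        }
        List<String> refs = new ArrayList<>();
        for (int i = 1; i < count; i++) {
            if (tag[i] < 9 || tag[i] > 11) continue;  // keep only member refs
            String owner = utf8[first[first[i]]];     // ref -> Class -> Utf8 name
            int nameAndType = second[i];              // ref -> NameAndType
            refs.add(owner + "." + utf8[first[nameAndType]] + " " + utf8[second[nameAndType]]);
        }
        return refs;
    }
}
```

Each returned string can then be looked up against the members actually declared in the target jars. As written, the sketch reports references exactly as recorded, so a method inherited from a superclass still appears under the owner named at the call site.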

With sufficient knowledge of the class file internals (I covered most of what you need to know here already), and some basic experience with the ASM library, you should be able to write something like this yourself, using ASM, in a span of a day or so. If this is all Greek to you, it'll probably take a week or so, but not much more than that; a medium-sized project at best.

Hopefully these are enough pointers for you to figure out where to go from here, or at the very least, to know what it would take and what you may want to search the web for if you don't want to write it yourself but still hold out hope that someone already did the work and published it as an open source library someplace.

NB: There are also dynamic invocations which are a lot more complicated, but by their nature, you can't statically verify these, so presumably the fact that you can't meaningfully interact with INVOKEDYNAMIC based method invokes is not relevant here. Similarly, any java code that uses the java.lang.reflect API obviously doesn't use any of this stuff, and cannot, mathematically provably even, be verified in this fashion. Thus, no need to worry about doing the impossible.

rzwitserloot
  • Dynamic invocations are not a problem per se. Their arguments, including the bootstrap method, are also in the constant pool, so you can verify their existence. Only if the bootstrap method itself uses reflective lookups, it leads to the same problem as reflective lookups impose in general. But you also have to build a type tree to identify inherited methods. This work has been done by every obfuscation tool already and I think, all of them are capable of warning about references to unknown methods. – Holger Mar 31 '21 at 11:41
  • "Dig into obfuscators because they do this kinda work" - that's an excellent insight, @Holger! I think most of the obfuscators are unfortunately not free and open source, but perhaps some exist, and OP can look at them for code, or at least for inspiration. – rzwitserloot Mar 31 '21 at 12:06