
TL;DR: What makes it necessary to include a transitive reference in the build path?

Explanation

I am trying to analyse some Java source code that is built in an Eclipse workspace. The workspace contains a large number of projects.

I want to detect unused references between projects.

My first approach was to go through all projects, take all the references from each .classpath file, and then analyse all .java files in the same project. If a .java file has an import statement for a class from another project, then the reference to that other project is necessary. That way I found some references in the classpath that had no "justification".

But when I deleted those references, the build broke. One such case is when a referenced class extends a class from a third project. That third project is referenced transitively and needs to be in the build path.

I'm wondering what other kinds of relationships (other than inheritance) result in transitive references in the classpath. And what about multiple inheritance layers, with each class in the inheritance chain lying in a different project?

KFleischer

2 Answers


Given:

  • a ProjectA with a ClassA that references a ClassB from ProjectB
  • a third ProjectC with a ClassC

In the following cases you need to reference ProjectC from ProjectA:

  • if ClassB extends or implements ClassC and ClassA calls a method of ClassB.
  • if ClassA calls(!) a method of ClassB whose signature contains ClassC as a parameter type or return type (this is also true for declared exceptions!)
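
As a rough illustration of the two cases above, here is a single-file sketch. The nested classes only stand in for classes that would live in separate projects, and the names `describe`, `makeC`, and `ExceptionC` are made up for this example. Because everything sits in one file, it compiles without any build path setup; the point is only that the source of ClassA never names ClassC, yet the compiler has to load ClassC to type-check it.

```java
// Single-file sketch: each nested class stands in for a class in its own project.
class TransitiveReferenceDemo {

    // Would live in ProjectC.
    static class ClassC {
        String describe() { return "C"; }
    }

    // Would live in ProjectC (hypothetical checked exception).
    static class ExceptionC extends Exception {
        ExceptionC(String message) { super(message); }
    }

    // Would live in ProjectB, which references ProjectC.
    static class ClassB extends ClassC {
        // The signature mentions ClassC (return type) and ExceptionC (throws clause).
        ClassC makeC() throws ExceptionC { return new ClassC(); }
    }

    // Would live in ProjectA. No type from ProjectC is named anywhere in this class.
    static class ClassA {
        static String run() {
            ClassB b = new ClassB();

            // Case 1: describe() is inherited, so resolving it requires the supertype ClassC.
            String inherited = b.describe();

            // Case 2: makeC() returns ClassC and declares ExceptionC; both must be known
            // to type-check the chained call and the exception handling.
            String chained;
            try {
                chained = b.makeC().describe();
            } catch (Exception e) {
                chained = "failed";
            }
            return inherited + "," + chained;
        }
    }

    public static void main(String[] args) {
        System.out.println(ClassA.run()); // prints "C,C"
    }
}
```

An import-based scan of ClassA would report only a dependency on ProjectB here, which matches the situation described in the question.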

Furthermore, not all references of ProjectA can be found by looking at the import statements of ClassA:

  • If ClassC is used by ClassA and lies in the same package as ClassA, ClassC will not be listed as an import in ClassA.
  • If ClassA refers to ClassC by its fully qualified name (including the package name), then no import statement is created.
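
To make these two blind spots concrete, the following sketch runs a naive import scanner over a small source snippet. The scanner, the regular expression, and the package and type names (`projecta`, `projectb`, `projectc`, `SamePackageType`) are assumptions made up for illustration, not the tool used in the question.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ImportScanDemo {

    // Matches lines such as "import some.pkg.Type;" or "import some.pkg.*;".
    private static final Pattern IMPORT = Pattern.compile(
            "^\\s*import\\s+(?:static\\s+)?([\\w.]+(?:\\.\\*)?)\\s*;",
            Pattern.MULTILINE);

    // Hypothetical source of ClassA: one import, one fully qualified use,
    // and one use of a type from the same package.
    static final String SAMPLE = String.join("\n",
            "package projecta;",
            "import projectb.ClassB;",
            "class ClassA {",
            "    ClassB b;",
            "    projectc.ClassC c;",
            "    SamePackageType s;",
            "}");

    // Collects the names found in import statements only.
    static Set<String> importedNames(String source) {
        Set<String> names = new LinkedHashSet<>();
        Matcher matcher = IMPORT.matcher(source);
        while (matcher.find()) {
            names.add(matcher.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        // Only the explicit import is found; projectc.ClassC and
        // SamePackageType are missed by this approach.
        System.out.println(importedNames(SAMPLE)); // prints "[projectb.ClassB]"
    }
}
```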
KFleischer

What makes it necessary?

The rules enforced by the Java compiler make it necessary. To check that those rules are followed, the compiler needs to know every type the code touches, and the only place it can look is the classpath (the jar files and class folders on the build path). If a class implements an interface, that class, or its non-abstract subclasses if it is abstract itself, must implement the methods of the interface. @Override annotations need to be satisfied by a matching method in a supertype. Referenced methods need to be visible from the calling code. If a method uses a method of a different class or interface, the signature of that method needs to be known when the calling class is compiled.

How would I approach it?

I'll use "classifiers" to refer to both classes and interfaces. You want to maintain two sets: known validated classifiers and known unvalidated classifiers.
Start with all the classifiers of your target project in the unvalidated set and with an empty validated set.

Go through each of the unvalidated classifiers and identify any classifiers used that are not already known and add them to the unvalidated classifiers set.

Check that the current unvalidated classifier is on the classpath. Report an error if it is not found.

Move the processed classifier from unvalidated to validated.

Repeat this process until all classifiers have been validated.

"Classifiers used" includes field types, method return types, method parameter types, variable types in parameters, extended classes, implemented interfaces, annotations, and in some cases Javadoc references. Depending on your project rules, going solely by imports could miss some things. For example, an interface may be in the same package but come from a different project.
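
The loop described above could be sketched roughly as follows. This is only an outline under assumptions: the `uses` map (which classifier uses which others) is taken as already computed by some parser, and the class, method, and classifier names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ClassifierWorklist {

    // projectClassifiers: classifiers of the target project (initial unvalidated set).
    // uses: assumed, pre-computed map from a classifier to the classifiers it uses.
    // classpath: names of all classifiers available on the build path.
    static Set<String> validate(Set<String> projectClassifiers,
                                Map<String, List<String>> uses,
                                Set<String> classpath) {
        Deque<String> unvalidated = new ArrayDeque<>(projectClassifiers);
        Set<String> known = new HashSet<>(projectClassifiers);
        Set<String> validated = new LinkedHashSet<>();

        while (!unvalidated.isEmpty()) {
            String current = unvalidated.poll();

            // Add every used classifier that is not already known.
            for (String used : uses.getOrDefault(current, Collections.emptyList())) {
                if (known.add(used)) {
                    unvalidated.add(used);
                }
            }

            // Error if the current classifier is not on the classpath.
            if (!classpath.contains(current)) {
                throw new IllegalStateException("Not on classpath: " + current);
            }

            // Move the processed classifier from unvalidated to validated.
            validated.add(current);
        }
        return validated;
    }

    public static void main(String[] args) {
        Map<String, List<String>> uses = new HashMap<>();
        uses.put("a.ClassA", Arrays.asList("b.ClassB"));
        uses.put("b.ClassB", Arrays.asList("c.ClassC"));
        Set<String> classpath =
                new HashSet<>(Arrays.asList("a.ClassA", "b.ClassB", "c.ClassC"));

        // prints "[a.ClassA, b.ClassB, c.ClassC]"
        System.out.println(validate(Collections.singleton("a.ClassA"), uses, classpath));
    }
}
```

If `c.ClassC` is removed from the `classpath` set in this sketch, `validate` throws, which corresponds to the broken build after deleting a transitive reference.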

Why do reflection and Spring-configuration need to be out of scope?

Such mechanisms use text-based handoffs to get code loaded. Reflection can create an instance of a class from the string "package.ClassName" without that class ever being identified as a "used class". These concepts would require extra handling beyond the scope of a pure source code or bytecode analysis.
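
A minimal sketch of such a text-based handoff, using a JDK class so that it runs on its own; the helper name `createByName` is made up for this example. Neither an import nor a type reference to the created class appears in the source.

```java
class ReflectionDemo {

    // Creates an instance from a class name that exists only as a string.
    static Object createByName(String className) throws ReflectiveOperationException {
        return Class.forName(className).getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        // The class name could just as well come from a configuration file.
        Object list = createByName("java.util.ArrayList");
        System.out.println(list.getClass().getName()); // prints "java.util.ArrayList"
    }
}
```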

If you want to dive deeper, the Java Virtual Machine Specification explains what a class file needs in order to run, specifically chapters "4. The class File Format" and "5. Loading, Linking, and Initializing" in the Java 10 spec. https://docs.oracle.com/javase/specs/

ProgrammersBlock
  • Thinking back on it, "Classifiers used" may also need to include every method call inside a class's implementation, because the class file records the source classifier in the bytecode. Meaning if you have A.b().c().d(); then the return types of b() and c() also need to be accounted for even though no variable or field is created for them. – ProgrammersBlock Jul 11 '18 at 21:45
  • Thanks for that algorithm. But I wanted to know what cases result in transitive dependencies (or non declared import statements) as I didn't want to build a compiler. – KFleischer Jul 18 '18 at 15:32