
Does the Java Language Specification mandate that Java is compiled to Java byte code?

From what I understand, this is not the case:

JLS 1

Compile time *normally* consists of translating programs into a machine-independent byte code representation.

[...]

The Java programming language is *normally* compiled to the bytecode instruction set and binary format defined in The Java Virtual Machine Specification, Java SE 9 Edition.

(emphasis mine)

I cannot find any other mentions of "byte code" or "bytecode" in the spec.

Does this mean that all bytecode manipulation is technically not covered by "Java the language" as defined by the JLS and technically relying on implementation details?

phant0m
  • To be verified, but this is probably in the JVM specification, not in the language specification. Also, since Java 9, JShell (I think that is the name) allows executing Java code in a shell, without compilation. – AxelH Nov 29 '17 at 10:46
  • @AxelH "this" being bytecode manipulation? I don't think the JVM spec could specify that Java would *have* to be compiled to Java byte code. – phant0m Nov 29 '17 at 10:48
  • Indeed, the Java language can be compiled in any way. Although the [JVM specification](https://docs.oracle.com/javase/specs/jvms/se9/html/index.html) does cover bytecode. "The Java Virtual Machine knows nothing of the Java programming language, only of a particular binary format, the class file format. A class file contains Java Virtual Machine instructions (or bytecodes) and a symbol table, as well as other ancillary information." – bcsb1001 Nov 29 '17 at 10:48
  • @AxelH I haven't checked about JShell, but wouldn't that just be an artifact of the JDK distribution? Also, just because there's no explicit compile-to-bytecode step doesn't mean Jshell couldn't do that. – phant0m Nov 29 '17 at 10:50
  • @phant0m I find it more logical to look for the bytecode specification in the tool that uses it. Can you elaborate on "_"this" being bytecode manipulation?_" I don't get that. (JShell is just mentioned FYI; the code is most likely compiled, but in its own way.) – AxelH Nov 29 '17 at 10:50
  • @AxelH as for the "this" in quotes, you said "To be verified but **this** is probably on the JVM specification". I was wondering whether you were referring to the first (does Java have to be...) or the second part (are bytecode manipulation libraries outside of "Java the language") of my question. – phant0m Nov 29 '17 at 10:53
  • I was referring to "_Does the Java Language Specification mandate that Java is compiled to Java byte code?_" – AxelH Nov 29 '17 at 11:08

2 Answers


You noticed correctly that the word “normally”, as well as the absence of any bytecode description in the JLS, is intended to define the Java Programming Language as independently from the execution environment as possible. Still, it’s not that easy:

Relationship to Predefined Classes and Interfaces

As noted above, this specification often refers to classes of the Java SE platform API. In particular, some classes have a special relationship with the Java programming language. Examples include classes such as Object, Class, ClassLoader, String, Thread, and the classes and interfaces in package java.lang.reflect, among others. This specification constrains the behavior of such classes and interfaces, but does not provide a complete specification for them. The reader is referred to the Java SE platform API documentation.

Consequently, this specification does not describe reflection in any detail. Many linguistic constructs have analogs in the Core Reflection API (java.lang.reflect) and the Language Model API (javax.lang.model), but these are generally not discussed here. For example, when we list the ways in which an object can be created, we generally do not include the ways in which the Core Reflection API can accomplish this. Readers should be aware of these additional mechanisms even though they are not mentioned in the text.

So the Java Programming Language is more than the JLS; it is also the Java SE platform API. There, we have the defineClass methods of the mentioned ClassLoader class, which accept input in the class file format. So even if we use means of deployment other than class files in the bytecode format, a fully compliant environment would have to support that format at this point. Note that Java 9 introduced another method accepting input in the class file format that doesn’t even require Reflection or implementing custom class loaders.
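To make the defineClass point concrete, here is a minimal sketch of how raw class file bytes can enter a running program through the standard API alone. The names `MinimalClassFile`, `BytesLoader`, and `Generated` are made up for illustration, and the byte layout assumes the class file structure from the JVMS with a Java SE 8 version number (major 52); treat it as an illustration under those assumptions, not a reference implementation.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class MinimalClassFile {

    /** Hypothetical helper: exposes the protected ClassLoader.defineClass. */
    static final class BytesLoader extends ClassLoader {
        Class<?> define(String name, byte[] classFile) {
            return defineClass(name, classFile, 0, classFile.length);
        }
    }

    /** Builds the class file of an empty public class in the unnamed package. */
    static byte[] build(String simpleName) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(0xCAFEBABE);   // magic
            out.writeShort(0);          // minor_version
            out.writeShort(52);         // major_version (Java SE 8)
            out.writeShort(5);          // constant_pool_count = entries + 1
            out.writeByte(7);           // #1 CONSTANT_Class, name at #2
            out.writeShort(2);
            out.writeByte(1);           // #2 CONSTANT_Utf8 (writeUTF emits length + bytes)
            out.writeUTF(simpleName);
            out.writeByte(7);           // #3 CONSTANT_Class, name at #4
            out.writeShort(4);
            out.writeByte(1);           // #4 CONSTANT_Utf8
            out.writeUTF("java/lang/Object");
            out.writeShort(0x0021);     // access_flags: ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);          // this_class
            out.writeShort(3);          // super_class
            out.writeShort(0);          // interfaces_count
            out.writeShort(0);          // fields_count
            out.writeShort(0);          // methods_count
            out.writeShort(0);          // attributes_count
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        Class<?> c = new BytesLoader().define("Generated", build("Generated"));
        System.out.println(c.getName() + " extends " + c.getSuperclass().getName());
    }
}
```

No compiler and no file on disk is involved here: the only thing the platform API sees is a byte array in the class file format, which is why a compliant Java SE environment has to understand that format regardless of how the application itself was deployed.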

This rules out Java ME, which lacks these API artifacts mentioned by the JLS; otherwise, we would already have an example of a Java environment that does not support bytecode manipulation.

But this still doesn’t fully answer the question of whether bytecode manipulation is off-language for Java SE or EE. Even if support for the bytecode format is provided by the standard API, bytecode manipulation depends on implementation details: either on the Instrumentation API, whose support is not mandatory, or on processing compiled class files in their deployed form (a file hierarchy, jar files, or module files), none of which is guaranteed to be the deployed form of the application (as said at the beginning). So it is indeed impossible to implement a bytecode manipulation tool that is guaranteed to work with every possible Java environment, though you would have to go to great lengths to create an environment that is fully compliant but does not work with these tools.

Holger
  • I'm currently browsing the JLS and currently glossing over JLS Chapters 12 and 13. It also talks about class loaders, and about the execution of a Java program in general from the perspective of a JVM, although it doesn't explicitly say that it has to be run on one. That seems to further underline your point that "it's not easy" – phant0m Nov 29 '17 at 14:00

The JLS doesn't have to know how a program will be read by the JVM; it only describes the Java language. The compiler (javac) provided in the JDK makes the link, but it is not part of the language itself.

Oracle's JDK software contains a compiler from source code written in the Java programming language to the instruction set of the Java Virtual Machine

In The Java™ Programming Language Compiler, we can find the same explanation:

The Java programming language compiler, javac, reads source files written in the Java programming language, and compiles them into bytecode class files. Optionally, the compiler can also process annotations found in source and class files using the Pluggable Annotation Processing API. The compiler is a command line tool but can also be invoked using the Java Compiler API. The compiler accepts source code defined by the Java Language Specification (JLS) and produces class files defined by the Java Virtual Machine Specification (JVMS).

So the javac command is mostly the bridge between the two specs.

  • Input: a .java file, described in the JLS
  • Output: a .class file, described in the JVMS
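The input/output relationship above can be sketched with the Java Compiler API that the quoted javac documentation mentions. The class name `JavacBridgeDemo` and the sample source `Hello` are made up for illustration, and the sketch assumes it runs on a JDK, since `ToolProvider.getSystemJavaCompiler()` returns null when no compiler is available.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

final class JavacBridgeDemo {

    /** Compiles one source string (JLS side) and returns the class file bytes (JVMS side). */
    static byte[] compile(String className, String source) {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        if (javac == null) {
            throw new IllegalStateException("no system Java compiler available");
        }
        try {
            Path dir = Files.createTempDirectory("jls-to-jvms");
            Path src = dir.resolve(className + ".java");
            Files.write(src, source.getBytes(StandardCharsets.UTF_8));
            // Same arguments as on the command line: javac -d <dir> <file>
            int status = javac.run(null, null, null, "-d", dir.toString(), src.toString());
            if (status != 0) {
                throw new IllegalStateException("compilation failed with status " + status);
            }
            return Files.readAllBytes(dir.resolve(className + ".class"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the first four bytes, which the class file format defines as the magic number. */
    static int magic(byte[] classFile) {
        return ByteBuffer.wrap(classFile).getInt();
    }

    public static void main(String[] args) {
        byte[] classFile = compile("Hello", "public class Hello {}");
        System.out.printf("magic = 0x%08X, size = %d bytes%n", magic(classFile), classFile.length);
    }
}
```

Nothing in the source text given to `compile` refers to bytecode; the class file format only shows up on the output side, which matches the idea that the compiler, not the language, connects the two specifications.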

You can find some information in the Java Virtual Machine Specification.

The Java Virtual Machine knows nothing of the Java programming language, only of a particular binary format, the class file format. A class file contains Java Virtual Machine instructions (or bytecodes) and a symbol table, as well as other ancillary information.

(I would like to find the opposite statement, that the Java Language knows nothing of the Java Virtual Machine language...)

Later in that spec, we find more information on the class file format and on how the language is translated into that instruction set.

AxelH