7

I downloaded the javac source code from here and found that it is itself written in Java. I was expecting it to be written in C/C++.

So how was this Java compiler, which is itself written in Java, compiled when no Java compiler existed yet?

Starx
bob zhou
  • 3
    http://en.wikipedia.org/wiki/Bootstrapping_%28compilers%29 – nneonneo Oct 07 '12 at 07:05
  • 4
    It is common practice for a language's compiler to be written in that language itself, to demonstrate its abilities. AFAIK, for this reason, many don't consider VB a "real" language. – amit Oct 07 '12 at 07:05
  • This is the bootstrap question. Answer is that the _initial_ compiler was written in something else, and then rewritten in Java when powerful enough. – Thorbjørn Ravn Andersen Oct 07 '12 at 07:05
  • @amit: Seriously? VB could easily compile itself. That's definitely not why it's 'not considered a real language'. – nneonneo Oct 07 '12 at 07:08
  • @nneonneo: 'it can' and 'that's how it was first implemented' are two different things. This is at least the impression I got from my programming languages lecturer a few years ago at university. – amit Oct 07 '12 at 07:09
  • As Ravn said, the Java compiler was originally written in C and later moved to Java when it was powerful enough. Sun's JVM is developed in C too. This link can help you explore a few things: http://stackoverflow.com/questions/1220914/in-which-language-are-the-java-compiler-jvm-and-java-written – Jimmy Oct 07 '12 at 07:09
  • @nneonneo, isn't it the `.class` file that runs on the JVM? – Starx Oct 07 '12 at 07:11
  • 2
    @amit: No language can possibly be first implemented in itself. VB's no different. Your PL lecturer is biased, probably as he does not consider VB suitable for 'real' programming tasks. Nonetheless, it is in fact a real programming language, and people use it to solve real problems (even if you wish they didn't). – nneonneo Oct 07 '12 at 07:12
  • @Starx: and why did you not reply to my comment on your answer? Anyway, the JVM compiles nothing (unless you count JIT); `javac` takes human-readable textual source code ("Java" code) and turns it into a binary form. It is therefore a compiler for Java source, and a compiler implementing the Java programming language. – nneonneo Oct 07 '12 at 07:13
  • 1
    @nneonneo Actually, I consider the JVM runtime's JIT to be the only "real" compiler there is. `javac` does not optimise the bytecode that is emitted, and for good reason: such optimisations are best left for the JIT compiler. – C. K. Young Oct 07 '12 at 07:15
  • @ChrisJester-Young: Some JVMs don't do JIT (e.g. because nobody bothered to do it for that architecture yet). The *Java language specification* tells you what compilation means. What the JVM does is up to the JVM implementer. – nneonneo Oct 07 '12 at 07:17
  • @nneonneo In theory, perhaps. But in practice, any practical JVM implementation will JIT, because otherwise the execution will be horribly inefficient. – C. K. Young Oct 07 '12 at 07:18
  • Or, some JVMs could just skip JIT and [execute bytecode right on the processor](http://en.wikipedia.org/wiki/Jazelle). – nneonneo Oct 07 '12 at 07:20
  • @nneonneo In that article, it explains that the bytecode actually undergoes binary translation to native ARM instructions. So, yes, that is a form of JIT, albeit a somewhat more lightweight version compared to what you get with HotSpot's `-server` mode. – C. K. Young Oct 07 '12 at 07:22
  • 2
    If we're going to get really technical...x86 is also "JIT" by that definition, because real processors execute x86 instructions as sequences of microcode. Would you then call C compilers "not real" because they don't generate microcode? – nneonneo Oct 07 '12 at 07:26
  • 2
    @ChrisJester-Young: What definition of "compiler" are you using that would exclude javac? Take the Wikipedia entry, for example (not claiming this as an authority, just an example): "A compiler is a computer program (or set of programs) that transforms source code written in a programming language (the source language) into another computer language (the target language, often having a binary form known as object code)." `javac` certainly counts by that definition. – Jon Skeet Oct 07 '12 at 07:27
  • @JonSkeet I didn't say that javac isn't a compiler. I'm saying that it's not appropriate to discount HotSpot's JIT as a compiler, since it's got much more complicated compilation machinery compared to javac. – C. K. Young Oct 07 '12 at 07:28
  • @nneonneo, I was not sure if that was the right answer. I thought it over, and it turns out that what you are saying is the same thing: `javac` compiles source to executable code that runs on the JVM. – Starx Oct 07 '12 at 07:28
  • 5
    @ChrisJester-Young: You said (to quote): "Actually, I consider the JVM runtime's JIT to be the only "real" compiler there is." That explicitly excludes `javac` from being a "real" compiler. That's not *at all* the same statement as saying "the JIT is a compiler as well". If you're going to start creating distinctions between "compiler" and "real compiler" then we'll need definitions for *two* terms... – Jon Skeet Oct 07 '12 at 07:29
  • @JonSkeet That was indeed badly-worded on my part. I had issue with nneonneo's assertion that "the JVM compiles nothing (unless you count JIT)", which I felt diminished the JIT compiler's (much more intense than javac) compilation, so my response was basically saying, hey, if you want to say which one is the "realer compiler", well.... – C. K. Young Oct 07 '12 at 07:31
  • 2
    @ChrisJester-Young: They're simply *different* compilers. If we're talking about the compiler for the Java *language*, javac is what there is. The JIT compiler *wouldn't* count there. If we're talking about "all the compilers used in the process of executing code which starts out as Java" then I'd definitely include the JIT, in systems which use one. – Jon Skeet Oct 07 '12 at 07:33
  • @JonSkeet Yes, that's a better way of looking at it, I agree. Then you can say, for the ".java-to-.class phase", javac is the compiler in use, but for the ".class execution phase", then the JIT compiler kicks in, etc. And then there are other phases, like microcode, etc. – C. K. Young Oct 07 '12 at 07:38
  • @ChrisJester-Young Actually, the CPU-instruction-to-microcode translation cannot be termed "compilation", as it is not ahead-of-time. The term that fits that process would be "interpretation". – Marko Topolnik Oct 07 '12 at 08:39
  • @MarkoTopolnik Well, JIT compilation is not ahead-of-time, either. I don't actually know the specifics of the instruction-to-microcode translation, but I understood that it has a lot of smarts behind it. My choice of terminology here, of course, is that compilation == smart and interpretation == dumb. And not everyone is going to think of those terms quite that way. :-) – C. K. Young Oct 07 '12 at 13:25
  • @ChrisJester-Young JIT is ahead-of-time because it compiles a whole method at once and basically substitutes the native code for the bytecode in later invocations. CPU is a classic interpreter---it only works on the exact instruction(s) it is going to execute next. Also, to qualify as a compiler, the CPU would have to save the compiled code and later refer only to it, skipping the actual machine instructions. – Marko Topolnik Oct 07 '12 at 13:32
  • @MarkoTopolnik Fair enough (and of course, at the instruction level, there's no concept of methods). To my very limited understanding (since I currently know next to nothing below the instruction level), with the instruction cache, the possibility of "[referring] only to [the compiled microcode], skipping the actual machine instructions" is, in theory, there, even if current processors don't actually do it. But like I said, I don't know what actually happens in reality, so I'm happy to accept what you've said. – C. K. Young Oct 07 '12 at 13:38

3 Answers

4

From here:

The very first Java compiler developed by Sun Microsystems was written in C using some libraries from C++

Besides, the compiled bytecode is interpreted by the JVM, which is written in C++. From here:

The Oracle JVM, named HotSpot, is written in the C++ language

Starx
loxxy
  • As a matter of fact, the Wikipedia article is disputed at this point, since it says that there should be a citation or reference that can demonstrate that this statement is true. Although I do not doubt it, the reference is, perhaps, not the best. – Edwin Dalorzo Oct 07 '12 at 07:14
  • Not all JVMs are written in C++, though. – nneonneo Oct 07 '12 at 07:14
  • @nneonneo well in this case, since the compiler is mentioned, it is c++. – loxxy Oct 07 '12 at 07:15
  • JVM != Java compiler. Two totally separate pieces of code. – nneonneo Oct 07 '12 at 07:16
  • @nneonneo Of course, and we are talking about the other piece here... – loxxy Oct 07 '12 at 07:19
  • Cool. Just thought you should mention that not all JVMs are Oracle, or written in C++ :) – nneonneo Oct 07 '12 at 07:22
  • @EdwinDalorzo In fact I'm not aware of any *primary* evidence that the Oracle JVM is written in C++, and I've been looking on and off, mostly off, since 1997. All the evidence I've seen, including some source code, suggests C. – user207421 Oct 09 '12 at 00:29
  • Part of Hotspot is written in C, not C++. – Antimony Mar 02 '13 at 15:02
1

A compiler written in the language it compiles is called a self-hosting compiler, and building the first version of one is known as bootstrapping.

The way they are made is kind of a head trip, but just think: when the language was first created, there was no Java compiler, so the first compiler had to be written in another language, which was in fact C/C++. Check it out here: In which language are the Java compiler and JVM written?

Also, in case you didn't know how Java works: the compiler (javac) doesn't generate machine code files. It creates bytecode (`.class`) files that the JVM then interprets (and, in JVMs such as HotSpot, JIT-compiles once code becomes hot).
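The bytecode point can be checked from Java itself, since the JDK exposes its compiler through the standard `javax.tools` API. The following is only a minimal sketch, assuming it runs on a full JDK (on a bare JRE `ToolProvider.getSystemJavaCompiler()` returns `null`); the class names `BytecodeDemo` and `Hello` and the temp directory prefix are made up for illustration.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class BytecodeDemo {
    /** Compiles a tiny (hypothetical) Hello class and returns the first four bytes of Hello.class. */
    static int compileAndReadMagic() {
        try {
            Path dir = Files.createTempDirectory("javac-demo");
            Path source = dir.resolve("Hello.java");
            String code = "public class Hello { public static void main(String[] a) { System.out.println(\"hi\"); } }";
            Files.write(source, code.getBytes(StandardCharsets.UTF_8));

            // The compiler is an ordinary Java object running inside this JVM.
            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
            if (compiler == null) {
                throw new IllegalStateException("No system compiler: run on a JDK, not a bare JRE");
            }

            // Same arguments as the command line: javac -d <dir> <dir>/Hello.java
            int status = compiler.run(null, null, null, "-d", dir.toString(), source.toString());
            if (status != 0) {
                throw new IllegalStateException("Compilation failed with status " + status);
            }

            // The output is a class file, not a native executable; class files start with a fixed magic number.
            byte[] classFile = Files.readAllBytes(dir.resolve("Hello.class"));
            return ByteBuffer.wrap(classFile).getInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.printf("magic = 0x%08X%n", compileAndReadMagic()); // prints "magic = 0xCAFEBABE"
    }
}
```

The `0xCAFEBABE` header is what the class file format requires of every `.class` file; turning that bytecode into something the hardware runs is left to the JVM at run time.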

Community
alvonellos
  • 1
    No modern JVM does interpretation any more. Dynamic (JIT) compilation is the norm and should be assumed to be the case. (You can specify JVM flags to force interpretation, to track down whether there's JIT-related bugs. But, no normal deployment should ever use such flags.) – C. K. Young Oct 07 '12 at 07:11
  • 1
    It's my first question on Stack Overflow, and I didn't expect so many replies so soon. I know how Java works; the reason I downloaded the compiler source from OpenJDK is that I want to write a Java compiler myself, in Java, as a learning exercise. Thank you all. – bob zhou Oct 07 '12 at 07:17
  • @bobzhou: exciting, eh? watching a bunch of programming geeks discuss these details :P – nneonneo Oct 07 '12 at 07:18
  • @nneonneo It's a fantastic place, I'm loving it. – bob zhou Oct 07 '12 at 07:20
  • @Chris, I understand the differences between interpretation modes and JIT compilation, but the inference given by the term "compilation" is that code is executed on the machine directly, and I didn't want to give that impression. It's easy to get lost in semantics, but as far as programming languages are concerned, I believe that JIT is just an optimizing interpreter -- whose optimization step consists of **greedy** machine-compilation of bytecode -- not a compiler. – alvonellos Oct 07 '12 at 07:21
  • @Bob, if you want to begin to learn about writing a compiler, then you need to take a look at ANTLR and learning about lexers & parsers. Oh, and, you'll need to get a nice-sized bottle of Tylenol, because grammars will give you a headache -- guaranteed. – alvonellos Oct 07 '12 at 07:23
  • @alvonellos, I've recently been reading the JVM specification, the Java language specification and some books on compiler principles. I wanted to see if I can do it on my own. Thank you for your advice. – bob zhou Oct 07 '12 at 07:29
  • No problem. If you liked my answer, show it! Also, before you get started on writing a compiler, I'd recommend learning regular expressions and what they're about. Regular expressions correspond to Chomsky type-3 (regular) grammars, so they won't get you very far in terms of grammars, but they'll help with writing a lexer. – alvonellos Oct 07 '12 at 07:39
  • @bobzhou, in fact it was a good question; it made me and others think a bit differently. – Satheesh Cheveri Oct 07 '12 at 07:58
  • 3
    @ChrisJester-Young The statement "No modern JVM does interpretation any more." is clearly false. To take a random example, HotSpot **does interpretation**. Not only that, **it does it most of the time**. The default compilation threshold is 10,000 iterations over a piece of code before it gets JITted. As the very name of the JVM says it, HotSpot detects **hot spots** in your code as JIT targets and merrily interprets the rest. – Marko Topolnik Oct 07 '12 at 08:44
  • Till the first generic is encountered of course.... And probably more recent constructs. (from 1.5 up) – Marco van de Voort Oct 07 '12 at 11:20
  • @MarkoTopolnik Thank you---between you and Jon Skeet, I am having to learn to be more nuanced in what I say. Actually, yes, I think in particular HotSpot's `-client` mode does interpretation even more frequently than `-server` mode, so of course your point holds. My main point was that, if you don't specify `-Xint`, there's (usually---as you've pointed out, not "always") much more going on under the covers than just interpretation, so I was responding to that. I would edit my earlier comment, but SO doesn't permit that. :-) – C. K. Young Oct 07 '12 at 13:18
  • 1
    @ChrisJester-Young The fairest thing to say would be that, with the advent of virtual machines employing JIT compilers, the distinction between an interpreter and a compiler is losing its conceptual ground and should be de-emphasized in discussion. – Marko Topolnik Oct 07 '12 at 13:30
  • @MarkoTopolnik Yes, JIT/dynamic compilation does indeed blur the line between (static, ahead-of-time) compilation and interpretation. It's a particular shade of grey, and I wonder if there are other shades of grey. :-) – C. K. Young Oct 07 '12 at 13:41
1

You usually need an existing Java compiler (and runtime) to bootstrap. However, there are other Java compilers available, like Jikes, that are written in C++. Whether you can use Jikes to bootstrap OpenJDK is a different story, but in theory, it should be possible.
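To make the dependency concrete, here is a rough sketch of the "existing compiler plus runtime" requirement in miniature, not of how OpenJDK is actually built. It assumes a full JDK is running the code; `BootstrapSketch`, `Stage1`, and `describe` are hypothetical names chosen for illustration. The already-installed compiler plays the role of stage 0: it turns Java source into a class file, and the already-running JVM loads and executes the result.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class BootstrapSketch {
    /** Uses the existing compiler (stage 0) to build a class from source, then runs it on the existing JVM. */
    static String buildAndRunStage1() {
        try {
            Path dir = Files.createTempDirectory("bootstrap-sketch");
            Path source = dir.resolve("Stage1.java");
            String code = "public class Stage1 { public static String describe() { return \"built by stage 0\"; } }";
            Files.write(source, code.getBytes(StandardCharsets.UTF_8));

            // Stage 0: a compiler that already exists in binary form.
            JavaCompiler stage0 = ToolProvider.getSystemJavaCompiler();
            if (stage0 == null) {
                throw new IllegalStateException("No system compiler: run on a JDK, not a bare JRE");
            }
            if (stage0.run(null, null, null, "-d", dir.toString(), source.toString()) != 0) {
                throw new IllegalStateException("Stage 0 failed to compile Stage1.java");
            }

            // The freshly built class only becomes useful because a runtime is already available to load it.
            try (URLClassLoader loader = new URLClassLoader(new URL[] { dir.toUri().toURL() })) {
                Class<?> stage1 = loader.loadClass("Stage1");
                return (String) stage1.getMethod("describe").invoke(null);
            }
        } catch (IOException | ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildAndRunStage1()); // prints "built by stage 0"
    }
}
```

In a real bootstrap the source being compiled would be the compiler itself, and stage 0 could be any implementation that accepts the language, which is where a compiler written in another language, such as Jikes, could in principle come in.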

C. K. Young