6

The definition of interpretation (correct me if I'm wrong) is parsing code like so:

  1. Translate currently parsed line to some intermediate language.
  2. Run the translated line.
  3. Move to the next line.
  4. Translate it.
  5. Run it. And so on.

If so, why do people say that the JVM interprets the Java Bytecode? What it does is execute it. The Bytecode is already translated Java source code, so there's no need to translate further.

Then why is the term interpretation involved?

If this term is used because the JVM executes the Bytecode line by line, well, I'd say that any execution does that: it executes the runnable code from top to bottom. Then why use the term interpretation?

Interpretation as opposed to what?

NPElover
  • 3
    semantics mostly, it's largely to differentiate interpretation and JITted execution – ratchet freak Mar 08 '14 at 17:21
  • @ratchetfreak Which brings me to another question. JIT execution means to compile each line **during** execution, correct? If so, how is it different than interpretation? – NPElover Mar 08 '14 at 17:23
  • it's compiled to machine code and then called like a normal function – ratchet freak Mar 08 '14 at 17:28
  • 2
    _"people say..."_ - who says so? [Wikipedia article](http://en.wikipedia.org/wiki/Java_virtual_machine) states it pretty straightforward "Java Virtual Machine (JVM) is a process virtual machine that can **execute** Java bytecode..." A [Wikipedia article](http://en.wikipedia.org/wiki/Java_%28programming_language%29) about Java language says in addition: "Java applications are typically **compiled** to bytecode..." – gnat Mar 08 '14 at 19:46
  • That's an implementation detail that you're not supposed to care about. – Blrfl Mar 08 '14 at 21:46
  • 3
    Most people think Java bytecode is executed by software or a JVM, but there are some ARM CPU processors that can run bytecode natively. It all depends upon where you run it. Bytecode by itself is not executable. Think of it as highly compressed source code. – Reactgular Mar 09 '14 at 00:08
  • 2
    *"Bytecode by itself is not executable."* - Nothing by itself is executable. Everything requires an execution platform. – Stephen C Mar 10 '14 at 02:57
  • *"Think of it as highly compressed source code."* - Poor analogy. You could argue that it is like lossy compression, but even then, the "compression" is not designed to allow "decompression" to source code. The fact that it is semi-possible is not part of the design goals. – Stephen C Mar 10 '14 at 03:15
  • Where did you get this crappy definition? – SK-logic Mar 10 '14 at 10:24
  • 1
    And before you carry on wasting your time on your futile attempts to draw a line between compilation and interpretation, first think about the Futamura projections. – SK-logic Mar 10 '14 at 10:32

6 Answers

10

A processor executes machine instructions. They are the only thing that a processor understands.

Java bytecode is an intermediate, compact, way of representing a series of operations (for want of a better term). The processor can't execute these directly.

The Java Virtual Machine processes that stream of bytecode operations and interprets them into a series of machine instructions for the processor to execute.

This is a very similar process to the one the Python interpreter performs on an input Python script. It doesn't matter that Java Bytecode is a binary format to help speed up execution, and a Python script is a text file. Even Python processes .py modules into a binary .pyc form for the same reason. The more fundamental difference is that Java bytecode has been through a more thorough pre-processing step - resolving method calls on objects, checking types etc., so that the JVM doesn't then have to do them.
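As a small illustration of the `.py` to bytecode step mentioned above, CPython exposes its compiler through the built-in `compile()`. The source string and the file name `<demo>` are made up for this sketch, and the exact instruction listing differs between Python versions.

```python
import dis

source = "x = 1 + 2"

# Compile the text into a code object; a .pyc file stores this kind of
# object in serialized form.
code = compile(source, "<demo>", "exec")

# The bytecode itself is a plain bytes object, not text.
print(type(code.co_code))

# Human-readable listing of the bytecode instructions.
dis.dis(code)

# Executing the code object is the job of CPython's interpreter loop.
namespace = {}
exec(code, namespace)
print(namespace["x"])
```

The point is only that the "interpreter" works on a pre-translated binary form, much as the JVM works on `.class` files.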

This VM process is what allows compiled Java to be portable to any computer, whether it has an X86 processor, ARM or anything else - as long as there is a JVM for that platform. The different versions of the JVM all interpret the same bytecode, but produce the appropriate machine instructions for their processor.

For further detail, and an excellent explanation on the difference between a Virtual Machine like Java and an interpreter like CPython see the accepted answer here:

https://stackoverflow.com/questions/441824/java-virtual-machine-vs-python-interpreter-parlance

WillW
4

The definition of interpretation (correct me if I'm wrong) is parsing code like so:

  1. Translate currently parsed line to some intermediate language.
  2. Run the translated line.
  3. Move to the next line.
  4. Translate it.
  5. Run it. And so on.

That is not a valid definition. It describes just one way of implementing an interpreter, but it excludes other ways. Most interpreters do not operate a line at a time like that. It is more normal to translate from source code to an intermediate form ahead of actually trying to execute the intermediate form.

And even the phrase "intermediate form" is potentially incorrect, in that it suggests that there is some other form for the code after the intermediate form.

If so, why do people say that the JVM interprets the Java Bytecode? What it does is execute it.

Interpretation is a form of execution. So there is no contradiction.

Interpretation as opposed to what?

As opposed to compiling the bytecodes to native code and executing the native code, which is what happens in typical Java implementations ... one way or another.

In a typical JVM, the bytecodes are interpreted to start with. After a bit the JIT compiler compiles them to native code that can be executed directly. Other modes include:

  • Interpreting bytecode all of the time; e.g. using java -Xint
  • Ahead of time compilation; e.g. using gcj.
  • Load-time compilation ... in which the bytecodes are compiled to native code as they are loaded by the classloader; e.g. JNode does this.

FOLLOWUP QUESTION

JIT execution means to compile each line during execution, correct? If so, how is it different than interpretation?

Firstly, no that is not correct:

  • There is no such thing as "JIT execution". When people say that, they really mean "JIT compilation and execution" ... and they are not spelling it out.

  • JIT compilation is not done a line at a time. It is typically done a method at a time.

And the difference is in what it is doing. The JIT compiler's job is to compile bytecodes to native code. It does not execute the bytecodes. Execution of the bytecodes is done by either interpreting the bytecodes (before they have been JIT compiled) or executing the native code that the JIT compiler has produced.
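The split between "interpret first, compile a whole method later" can be sketched with a toy model. Everything here is hypothetical: the `Method` class, the two opcodes, and `HOT_THRESHOLD` are invented for illustration, and generated Python source stands in for native code. It is not how any real JVM is implemented.

```python
HOT_THRESHOLD = 3  # hypothetical; real JVMs typically use much larger thresholds

def interpret(ops, x):
    """Execute a list of (opcode, operand) pairs directly."""
    for op, arg in ops:
        if op == "add":
            x += arg
        elif op == "mul":
            x *= arg
        else:
            raise ValueError(op)
    return x

def jit_compile(ops):
    """Translate the whole method once into a Python function.
    This only produces code; it does not run the method."""
    symbols = {"add": "+", "mul": "*"}
    expr = "x"
    for op, arg in ops:
        expr = f"({expr} {symbols[op]} {arg!r})"
    namespace = {}
    exec(f"def compiled(x):\n    return {expr}", namespace)
    return namespace["compiled"]

class Method:
    def __init__(self, ops):
        self.ops = ops
        self.calls = 0        # counts interpreted invocations only
        self.compiled = None  # filled in once the method becomes hot

    def invoke(self, x):
        if self.compiled is not None:
            return self.compiled(x)            # run the compiled form
        self.calls += 1
        if self.calls >= HOT_THRESHOLD:
            self.compiled = jit_compile(self.ops)
        return interpret(self.ops, x)          # this call is still interpreted

method = Method([("add", 1), ("mul", 2)])
results = [method.invoke(5) for _ in range(5)]
print(results, method.compiled is not None)
```

The first three calls go through `interpret`; from the fourth call on, the cached compiled function is used and the interpreter is no longer involved for that method.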

Stephen C
  • @DougM - I know there is a difference. "'A' is a form of 'B'" does not mean "there is no difference". And I'm not saying that you can't define terms. I suggest that you read what I wrote again ... more carefully. For the purpose of the "scree", look at the context for the question; i.e. the *series* of questions like this that the OP has been asking. Finally, if you have a problem with people using italics, perhaps you should take it up on "meta". – Stephen C Mar 09 '14 at 22:59
  • Comments are to suggest improvements. To be clear: your use of italics for emphasis adds no clarity, and I believe you spent too much of your answer harping on the imprecision of language. – DougM Mar 10 '14 at 01:59
  • @DougM - Your opinion has been noted. – Stephen C Mar 10 '14 at 02:53
1

It depends on the implementation.

Java bytecode isn't the same thing as machine code. It's a very similar format to that used by most architectures, which makes it relatively easy to translate to machine code, but I'm not aware of any machines that actually run it natively.

The Java specs don't care how the code gets executed, as long as the results are what the spec says they should be. But if you're not executing it natively, then you have to translate it in some way. You can do this with interpretation, and that's how some early JVMs worked, but this technique has fallen out of favor in more recent years. Nowadays, it's more common to compile it to native code, either ahead-of-time or just-in-time, and then execute that instead of the bytecode.

The Spooniest
  • 1
    Modern JVMs still work as an interpreter. It is best to understand JIT compilation as an optimization, not as a primary mode of operation. The point is that interpreting some code is generally faster than JITting the code and then executing it, if the code is only executed very few times (or not at all). The cost of compiling some part of the code amortizes itself for critical sections like tight loops (so-called hot spots), which are usually compiled only after profiling the code while it's being interpreted. – amon Mar 08 '14 at 20:41
  • *Optimizing* compilers are expensive, stupid ones are not. Instead of interpreting, you can also compile the code with a stupid non-optimizing compiler first, then recompile hotspots with the expensive memory-hungry CPU-consuming aggressively optimizing compiler. That's how early versions of Maxine worked, for example, or BEA/Oracle JRockit. I think, V8 also does this. This simplifies your design, because you no longer have two entirely different modes of operation, only one, with different levels of optimization. – Jörg W Mittag Mar 08 '14 at 21:49
1

Interpreting: This means you take a tiny piece of code, figure out what it does and then do it. The tiny piece of code may be "plain text" source code, but it may also be some sort of pre-tokenised code, or some sort of byte-code, or native instructions for a different CPU. The important difference here is that for interpreting it's never converted into native code for the CPU it's running on.

Compiling: This means you convert the source code into native code. When most people think of "compilers" they think of a tool that converts plain text source code into native code (known as "ahead of time compiling" because it's compiled before it's run - often even before the end-user receives it), but this isn't the only way to compile. The main benefit of ahead of time compiling is that the compiler can spend a lot of time optimising the code really well without the end-user caring how long it takes.

JIT Compiling: This is one of the different cases of compiling (note: JIT is "Just In Time", and different people will call this dynamic translation instead). In this case you wait until a small piece of code needs to be executed, then compile that small piece into native code and execute it; then you go looking for the next small piece of code. By caching the previously compiled small pieces you end up compiling most of the program while the program is being executed. For code that is executed often this is a lot faster than interpreting; but for code that is only run once it's slower than interpreting.
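The trade-off in the last sentence is simple arithmetic: interpreting costs the same on every run, while JIT compiling pays a one-off compile cost and then a smaller cost per run. A rough sketch, with cost units and numbers that are entirely made up for illustration:

```python
def total_interpreted(runs, interp_cost):
    """Total cost when every run is interpreted."""
    return runs * interp_cost

def total_jit(runs, compile_cost, native_cost):
    """Total cost when the code is compiled once, then run natively."""
    return compile_cost + runs * native_cost

def break_even_runs(compile_cost, interp_cost, native_cost):
    """Smallest number of runs for which JIT compiling is strictly cheaper."""
    saving_per_run = interp_cost - native_cost
    if saving_per_run <= 0:
        return None  # compiling never pays off
    return compile_cost // saving_per_run + 1

# Hypothetical costs in arbitrary time units.
COMPILE, INTERP, NATIVE = 1000, 50, 5

print(total_interpreted(1, INTERP), total_jit(1, COMPILE, NATIVE))
print(break_even_runs(COMPILE, INTERP, NATIVE))
```

With these assumed numbers a single run is far cheaper interpreted (50 versus 1005 units), and compiling only pays off once the code has run a couple of dozen times.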

For Java, first the plain text source code is compiled into byte-code using an ahead of time compiler (javac). This step does only light optimisation, such as constant folding; most optimisation is left to the JVM. After that the byte-code is executed by a Java Virtual Machine (JVM).

Old implementations of JVMs used to interpret the byte-code (which was slower). Modern JVMs interpret byte-code that isn't executed very often (to avoid the overhead of compiling when it's not justified) and also do JIT compiling for code that is executed often (to get close to native speed for those more important parts).

Basically (for modern implementations), it's not one or the other - it's "ahead of time" followed by a mixture of interpreted and JIT compiling.

Brendan
1

The definition of interpretation (correct me if I'm wrong) is parsing code like so:

  1. Translate currently parsed line to some intermediate language.
  2. Run the translated line.
  3. Move to the next line.
  4. Translate it.
  5. Run it. And so on.

I would say your definition of "interpretation" is wrong -- there is not (necessarily) a translation step in interpretation. So it should be:

  1. Read a chunk of code
  2. Execute that chunk of code
  3. Read the next chunk of code
  4. Execute the next chunk
  5. ...keep going

In the case of a JVM bytecode interpreter, a "chunk" is a single bytecode. In a more traditional interpreted language, a "chunk" might be a line. The 'execute' step might involve generating some IL for the chunk and then interpreting that IL -- in which case you have two interpreters, a high level one that invokes the lower level one.
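The read/execute loop above, with a single bytecode as the "chunk", might look like this for a toy stack machine. The four opcodes are invented for the sketch and are not real JVM opcodes.

```python
# Hypothetical opcodes for a toy stack machine.
PUSH, ADD, MUL, HALT = range(4)

def run(bytecode):
    """Interpret a bytes object one instruction at a time."""
    stack = []
    pc = 0  # program counter: index of the next byte to read
    while True:
        op = bytecode[pc]                # read a chunk (one opcode)
        pc += 1
        if op == PUSH:                   # execute that chunk
            stack.append(bytecode[pc])   # the operand is the following byte
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op} at offset {pc - 1}")

# Encodes (2 + 3) * 4
program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])
result = run(program)
print(result)  # prints 20
```

Note that nothing is translated here: each opcode is looked at and acted on directly, and no new code is produced.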

Chris Dodd
0

Java bytecode is a language. Languages aren't interpreted or compiled. They just are. Interpretation and compilation are traits of, well, the interpreter or compiler (duh!), i.e. an implementation of that language.

There are interpreted implementations of the JVM (e.g. early versions of the Sun JVM before Java 1.2), there are compiled implementations of the JVM (e.g. early versions of Maxine), but the most commonly used implementation, the Sun/Oracle HotSpot JVM is actually both: it first interprets the bytecode, gathering statistics, and then it compiles the performance-critical portions using optimization information derived from those statistics.

Jörg W Mittag
  • when people say that code is `*interpreted*`, what do they **most commonly** mean? I never know if they mean **1-** *'Each line of code is translated by the interpreter to another language, and then executed by the interpreter'*, or **2-** *'Each line of code is executed by the interpreter (without translation)'*. – NPElover Mar 08 '14 at 21:13
  • And when you say that the JVM *'first interprets the bytecode'*, which of the two above options do you mean? Is there another process of translation occurring? (The bytecode is already translated Java source code). Or does the interpreter in the JVM directly execute the bytecode? – NPElover Mar 08 '14 at 21:14
  • I mean option 2. That's the *definition* of interpretation. If you translate it, then it's not interpretation, it's compilation, because the definition of compilation is "translating a program from one language to another language, preserving its meaning." There are plenty of systems which do both, for example the YARV Ruby execution engine first compiles Ruby to YARV bytecode, then interprets that bytecode. Note also, that you shouldn't think of "lines of code". Bytecode doesn't have "lines". It has "bytes", that's why it's called byte code. It's not a textual format. – Jörg W Mittag Mar 08 '14 at 21:35
  • And even for textual languages, such as Ruby, the interpreter doesn't interpret source code directly, because syntactic constructs in source code don't necessarily correspond directly to semantic constructs in the language. Rather, the code is typically parsed into a parse tree, then the parse tree is semantically analyzed to yield an abstract syntax tree, and then *that* is interpreted. Simple BASIC interpreters actually *do* interpret line-by-line, because BASIC is defined in such a way that a single line always corresponds directly to a single statement. The canonical Perl implementation … – Jörg W Mittag Mar 08 '14 at 21:38
  • … actually uses a directed graph of opcodes and interprets *that* (and you can certainly argue whether constructing that graph out of Perl source code already constitutes compilation or can still be considered parsing). – Jörg W Mittag Mar 08 '14 at 21:39
  • It's really confusing how people have different definitions of things. For example this site (http://www.engineersgarage.com/contribution/difference-between-compiler-and-interpreter) defines interpretation and compilation as: 'A Compiler and Interpreter both carry out the same purpose – convert a high level language instructions into the binary form which is understandable by computer hardware'. I've been trying for days to understand what interpretation **usually** means, so far no luck. Just to be sure, In your opinion, 'interpretation' is pretty much another name for 'execution', correct? – NPElover Mar 08 '14 at 21:42
  • However, practically speaking, the distinction is irrelevant. For you as a programmer, what matters is the semantics of the language, not how exactly a particular implementation may or may not execute your code. Heck, sometimes you can't even tell the difference: a program which quickly compiles snippets of code and then directly executes them sure *looks* and *feels* like an interpreter, even though it isn't. And likewise, bundling an interpreter together with the to-be-interpreted code into a single binary, is indistinguishable from a compiled binary. – Jörg W Mittag Mar 08 '14 at 21:42
  • Compilation: look at a piece of code and *generate* a piece of code that is semantically equivalent. Interpretation: look at a piece of code and *run* a piece of code that is semantically equivalent. An interpreter doesn't translate or "convert". It executes. A compiler "converts". A CPU is just an interpreter for machine code, an interpreter doesn't convert *to* machine code, it *runs* a piece of machine code (or any other kind of code, really) that is semantically equivalent to the code being interpreted. – Jörg W Mittag Mar 08 '14 at 21:45
  • Oh, I see. So basically: An interpreter sees a piece of code, comes up with a new piece of code - with the same meaning - and **executes** it. A compiler sees a piece of code, comes up with a new piece of code - with the same meaning - and **stores** it (later to be executed). Is this distinction accurate? – NPElover Mar 08 '14 at 21:52
  • Yes, that's it. Note also that you have accidentally discovered something really important: compilers and interpreters are almost the same thing. Yet, somehow, writing an interpreter is considered a fun exercise, whereas writing a compiler is considered deep black magic. Yes, writing an industrial-strength, production-quality, high-performance compiler *is* deep black magic, but so is writing an industrial-strength, production-quality, high-performance interpreter, or really *any* industrial-strength, production-quality, high-performance software. – Jörg W Mittag Mar 08 '14 at 21:58
  • Let me see if I understand something. 'We agreed' that saying: "An interpreter executes a 'line' of code", is equivalent to saying: "An interpreter converts a line of code to another line of code with the same meaning, and then **executes it**". I think that the part that I made in boldface actually means: "Tell underlying platform to execute this line of code." And in the underlying platform (e.g. JVM or CPU), "execute this line of code" means another interpretation. And as we go more and more low level, more and more interpretations happen. Is this correct or am I totally wrong here? – NPElover Mar 08 '14 at 22:11
  • @NPElover, why do you care what "people" "commonly" mean? Majority of the so called "people" is totally incompetent, therefore the "common" understanding of a term does not really matter at all, unless you're doing some kind of a weird anthropological study. – SK-logic Mar 10 '14 at 10:29
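The distinction drawn in the comments above (a compiler *generates* equivalent code, an interpreter *runs* equivalent code) can be made concrete with a minimal sketch. The tuple-based expression tree and both function names are assumptions made for this example, and Python source text stands in for the compiler's target language.

```python
# A tiny expression tree: a number, or a tuple (operator, left, right).
expr = ("+", 2, ("*", 3, 4))

def interpret(node):
    """Look at the code and run it: the result is a value."""
    if isinstance(node, (int, float)):
        return node
    op, left, right = node
    a, b = interpret(left), interpret(right)
    if op == "+":
        return a + b
    if op == "*":
        return a * b
    raise ValueError(op)

def compile_to_python(node):
    """Look at the code and generate equivalent code: the result is a program."""
    if isinstance(node, (int, float)):
        return repr(node)
    op, left, right = node
    if op not in ("+", "*"):
        raise ValueError(op)
    return f"({compile_to_python(left)} {op} {compile_to_python(right)})"

value = interpret(expr)           # executed immediately
target = compile_to_python(expr)  # nothing executed yet, only translated
print(value)
print(target)
print(eval(target))               # running the generated code is a separate step
```

The two functions walk the tree in the same way, which is the sense in which compilers and interpreters are "almost the same thing": they differ only in whether each node yields a result or a piece of output code.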