
Chrome's V8 engine, the Java HotSpot VM, and many other runtimes have multiple tiers of interpretation and compilation.

In HotSpot, a function starts off interpreted and then, if it runs often enough, it is compiled to native code (and later optimized further). How does the interface between such tiers work?

For example, suppose you have two functions, Function A and Function B. If Function A is compiled and calls Function B, which is still interpreted at the time Function A is compiled, then Function A presumably just contains a few instructions that invoke the interpreter on Function B, right?

But what happens if, at some point after Function A is compiled, Function B is compiled too? Function A will then keep calling the interpreted version of Function B even though a faster, compiled version exists.

So, what happens in modern VMs? Do they hot-patch Function A to point at the new code for Function B? Do they add a layer of indirection and patch that instead, rather than patching every single call site that references Function B? Or something else entirely?
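To make the indirection option concrete, here's a rough sketch of what I imagine, in C; the names and the single global slot are made up for illustration and aren't any real VM's internals:

```c
#include <stdio.h>

/* One entry-point slot per function; compiled callers always call
 * through the slot instead of jumping to a fixed address. */
typedef int (*entry_fn)(int);

/* Stand-in for the interpreter running Function B's bytecode. */
static int interp_b(int x)   { puts("B: interpreted"); return x + 1; }

/* Stand-in for the native code the JIT later produces for Function B. */
static int compiled_b(int x) { puts("B: compiled");    return x + 1; }

static entry_fn b_entry = interp_b;   /* B starts off interpreted */

/* What compiled Function A's call site conceptually does. */
static int compiled_a(int x) { return b_entry(x) * 2; }

int main(void) {
    compiled_a(1);          /* call lands in the interpreter        */
    b_entry = compiled_b;   /* the JIT finishes: patch one pointer  */
    compiled_a(1);          /* same call site now runs native code  */
    return 0;
}
```

With this scheme only one pointer needs updating when Function B is recompiled, but every call pays for the extra indirect load; hot-patching call sites would avoid that load at the cost of tracking every caller.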

  • The only material difference between an interpreter and a JIT compiler is that the interpreter doesn't cache its results. Most interpreters still use some intermediate form; that form might as well be the same byte code that a JIT compiler uses. – Robert Harvey May 24 '15 at 05:09
  • Sorry, I suppose I didn't explain. I'm referring to an interpreter as something that reads intermediate code and a JIT as something that creates/executes native code for the platform. So, the interpreter reads intermediate code and the JIT turns it into native platform code (e.g. x86 code). So, my definition of the difference is that a JIT writes native code for the platform and an interpreter executes either source code or intermediate code. Anyways, how do you interface between 2 separate systems in a fast way. – Colorfully Monochrome May 24 '15 at 05:14
  • One of the advantages of using intermediate code is that *it doesn't matter.* Whether you're JITting the intermediate code or interpreting it, *it's still the same intermediate code.* It's exactly the same interface either way. – Robert Harvey May 24 '15 at 05:16
  • That much I get. However, what happens after one function is compiled to native code? The JIT is executing native code and needs to call a function that will be interpreted (but the interpreter is directly executing intermediate code). Then we can have the original emission of code call the interpreter, which executes the intermediate code. But what happens when the code from the interpreter is then compiled to x86 native? That was my issue: how do I deal with a change in which system is executing the code. – Colorfully Monochrome May 24 '15 at 05:20
  • (2/2) So, I suppose my issue is less with intermediate code, but more with dealing with interfaces between two systems in a fast manner. – Colorfully Monochrome May 24 '15 at 05:21
  • There aren't two systems; there's only one... ultimately, it all resolves to X86 code. The JIT is not executing native code; the sole purpose of the JIT is to compile intermediate code to native code. Interpreters don't compile code to X86 native, generally speaking. They call functions that execute the necessary instructions via the intermediate representation. You can still have a single intermediate representation (should you choose to design it that way). – Robert Harvey May 24 '15 at 05:23
  • Let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/24073/discussion-between-colorfully-monochrome-and-robert-harvey). – Colorfully Monochrome May 24 '15 at 05:25
  • Note: V8, at least in its original incarnation(s), did not have an interpreter. The original version(s) had just one compiler; newer versions have two, but AFAIK no interpreter. – Jörg W Mittag May 24 '15 at 07:42
  • @JörgWMittag V8 doesn't have an interpreter? Must have been thinking of the stage 1 compiler... So, the stage 1 compiler is non-optimizing and the stage 2 one is the optimizing one? – Colorfully Monochrome May 24 '15 at 08:00
  • @ColorfullyMonochrome: Yes. The stage 1 compiler is, well, at least *less* optimizing, but fast and small, and it injects profiling code into the compiled machine code. The stage 2 compiler is more optimizing, but slower and needs more memory, and it uses the collected profiling data. The original version (pre-Crankshaft) had just one moderately optimizing, fast and small compiler. Note that this is not an unusual design, the Maxine Research JVM and the JRockit JVM, for example, also have no interpreter. – Jörg W Mittag May 24 '15 at 09:13
  • Have a look at this article. http://arstechnica.com/information-technology/2014/05/apple-integrates-llvm-compiler-to-boost-webkit-javascript-performance/ Javascript starts interpreted, and goes through three different compilers. – gnasher729 May 24 '15 at 19:24

1 Answer


You can split any code into pieces, each consisting of a linear sequence of steps, separated by control flow changes (branches, function calls, returns, etc.).

To do JIT compiling, you'd start immediately after a change in control flow, scan ahead until you find the next control flow change, convert that linear sequence of steps into native code, optimise the native code, and place a "terminating control flow change" at the end of it. Then you'd store the result (let's call it "a trace") in some sort of "trace cache" and execute it.

If there's no trace in the cache for the next piece of code, that "terminating control flow change" can pass control back to the virtual machine. If there is a trace for the next piece of code, the "terminating control flow change" can pass control directly (or indirectly) to that trace. Note: I am over-simplifying here; the "terminating control flow change" may have two or more destinations (e.g. it may be a branch, where one destination is the "true" path and the other is the "false" path), and it may need assistance from the virtual machine even if the next trace is in the cache.
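As a rough sketch of the dispatch side (in C, with invented names; a real trace cache would be keyed on more than a plain guest address):

```c
#include <stddef.h>

#define CODE_SIZE 4096   /* size of the guest program, for this sketch */

/* A trace: native code for one linear sequence of steps. Its
 * "terminating control flow change" is modelled as the return value:
 * the guest address of the next piece of code to run. */
typedef size_t (*trace_fn)(void);

/* Trace cache, indexed by the guest address a trace starts at. */
static trace_fn trace_cache[CODE_SIZE];

/* The virtual machine's outer loop in this model. */
void vm_run(size_t addr) {
    while (addr < CODE_SIZE) {
        trace_fn trace = trace_cache[addr];
        if (trace == NULL)
            break;        /* no trace yet: compile or interpret here */
        addr = trace();   /* execute the cached native trace */
    }
}
```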

The problem here is that it's all expensive (and the harder you try to optimise the generated native code, the more expensive it gets). If a sequence of code runs often, the overhead of JIT compiling becomes insignificant and there's a performance gain. If it doesn't run often, the cost of JIT compiling can easily exceed the benefits.

Now, if there's no trace for the next piece of code, the "terminating control flow change" passes control back to the virtual machine, and nothing prevents the virtual machine from interpreting at that point instead of JIT compiling. If you decide to JIT compile, you can patch the previous trace's "terminating control flow change" so that it points to the newly generated trace. If you decide to interpret instead, that's fine too: immediately after the interpreter interprets a change in control flow, it checks the trace cache to see if there's an "already native" trace for the next piece. If there is, it executes the already compiled trace; if there isn't, it can keep interpreting or decide to switch to JIT compiling.
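Continuing the sketch (again, the names and stubs are invented for illustration; in a real JIT, patching rewrites a jump in the emitted machine code rather than a pointer field):

```c
#include <stddef.h>

typedef size_t (*trace_fn)(void);

struct trace {
    trace_fn run;        /* the compiled linear sequence of steps  */
    struct trace *next;  /* patchable exit; NULL = back to the VM  */
};

/* When a new trace is compiled for the piece of code an existing
 * trace exits into, chain the old exit straight to the new trace so
 * the hot path never re-enters the virtual machine. */
void patch_exit(struct trace *prev, struct trace *fresh) {
    prev->next = fresh;
}

/* Stub for this sketch: look up a trace in the cache, if any. */
static struct trace *find_trace(size_t addr) { (void)addr; return NULL; }

/* Stub for this sketch: interpret one linear sequence of steps and
 * return the guest address its terminating control flow change targets. */
static size_t interpret_piece(size_t addr) { return addr + 1; }

/* Interpreter loop: after every interpreted control flow change,
 * check whether an "already native" trace exists for the next piece. */
void interpret(size_t addr, size_t end) {
    while (addr < end) {
        struct trace *t = find_trace(addr);
        if (t != NULL)
            addr = t->run();              /* switch to compiled code  */
        else
            addr = interpret_piece(addr); /* or decide to JIT instead */
    }
}
```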

The only other thing you'd need is a policy for deciding when to JIT and when to interpret. There are a few hints you can use to infer that JIT compiling is likely to be worthwhile; in particular, a taken conditional branch whose destination is at a lower address usually indicates a loop, and loops are good candidates for JIT compiling. Beyond that, you'd maintain some statistics (like the number of times a piece of code has been interpreted) and use them as the basis for the "JIT or interpret" decision.
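A minimal version of that decision, with a threshold invented purely for this sketch, might be:

```c
#include <stdbool.h>
#include <stddef.h>

#define CODE_SIZE     4096   /* size of the guest program, as before  */
#define HOT_THRESHOLD 50     /* tuning knob, invented for this sketch */

/* How many times each piece of code has been reached so far. */
static unsigned interp_count[CODE_SIZE];

/* Decide whether to JIT the piece starting at `target`, reached via a
 * taken branch at `branch_addr`. Two hints: a backward branch usually
 * means a loop, and a piece interpreted many times is worth the
 * one-off cost of compiling it. */
bool should_jit(size_t branch_addr, size_t target) {
    if (target >= CODE_SIZE)
        return false;               /* out of range: leave it alone   */
    if (target < branch_addr)
        return true;                /* backward branch: likely a loop */
    return ++interp_count[target] >= HOT_THRESHOLD;
}
```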

Brendan