4

It occurs to me that there's not a heck of a lot of difference between

$>python module.py

And:

$>javac module.java
$>java module

The former compiles to an intermediate language (Python bytecode) and executes the program in one step. The latter breaks the steps up, first compiling to the intermediate language (JVM bytecode) and then executing it on a separate line. In fact I can rewrite the Python to break out the two steps, as in this SO question.
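A rough sketch of that separation in Python, using the standard `py_compile` module (the file names here are just illustrative):

```python
import pathlib
import py_compile
import subprocess
import sys

# write a tiny module to work with (illustrative)
pathlib.Path("module.py").write_text("print('hello')\n")

# step 1: compile to CPython bytecode -- analogous to `javac module.java`
py_compile.compile("module.py", cfile="module.pyc")

# step 2: execute the compiled bytecode -- analogous to `java module`
out = subprocess.run([sys.executable, "module.pyc"],
                     capture_output=True, text=True).stdout
print(out.strip())  # hello
```

CPython can execute a `.pyc` file directly, so the compile and run phases really can be split just like the Java pair above.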

It seems people make a big deal about the stark difference between compiled and interpreted languages, but the lines seem entirely blurred. Other popular interpreted languages have similar features. PHP "compiles" to its own opcodes, which can be cached or stored for later use. Perl is likewise compiled to an internal form before execution.

So... is there really any difference between these popular interpreted languages and popular compiled languages that compile to VMs? Perhaps in one case the VM is typically memory-resident, whereas with the "interpreted" languages the runtime is typically spun up fresh for each run? Yet this seems like it could be easily changed.

Yet there still seems to be something of a difference. If they are more-or-less the same, then why does the performance of Java/C# approach C++ while the "interpreted" languages are still an order of magnitude off? If it's all truly bytecode running in a VM, and all really the same, why the big difference in performance?

Doug T.
  • 11,642
  • 5
  • 43
  • 69
  • 3
    IMHO, Java and Python are (almost) in the same category, both interpreted with bytecodes, except that Java has a JIT by default. The real comparison between Interpreted×Compiled would be C and shell script. – marcus Jul 18 '12 at 14:57
  • 1
    Python is *dynamically* typed. It is the real cause of such a performance gap. Otherwise it would have been trivial to implement an efficient JIT compiler. – SK-logic Jul 18 '12 at 15:13
  • I'm with @marcus. If Python gets compiled to Python bytecode, it doesn't sound like an interpreted language to me! – vaughandroid Jul 18 '12 at 15:52
  • This question has been asked many times before in various forms. For example: http://programmers.stackexchange.com/questions/136993/interpreted-vs-compiled-a-useful-distinction – david.pfx Jul 26 '14 at 10:49

2 Answers

6

There are many differences. First of all, think of the difference between a bytecode interpreter and a language interpreter. It's easy to interpret bytecode, because all instructions follow a predictable format, but interpreting a language involves lexing and parsing -- operations that can be quite taxing, depending on the language.

C# and Java don't only compile to bytecode. They also use JIT compilation: hot sections of bytecode are compiled to native machine code once, and the result is cached -- instead of being re-interpreted every time the execution thread stumbles upon them, which would involve a lot of redundant work.

As far as I remember, CPython compiles to bytecode, but it doesn't use a JIT, which could drastically increase performance.
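You can see CPython's bytecode directly with the standard `dis` module (exact opcode names vary between Python versions, so the output below is only indicative):

```python
import dis

def add(a, b):
    return a + b

# CPython compiled `add` to bytecode at definition time; dis shows the opcodes
dis.dis(add)
# typical output includes LOAD_FAST instructions for the arguments,
# an add opcode (BINARY_ADD on older versions, BINARY_OP on 3.11+),
# and a return opcode
```

The point is that the bytecode already exists; what CPython lacks (unlike the JVM and CLR) is a JIT that compiles these opcodes down to native code.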

GregRos
  • 1,723
  • 1
  • 14
  • 27
  • 2
    pypy includes a JIT for python. As does the python library psyco – Timothy Baldridge Jul 18 '12 at 14:54
  • Does it perform as well as JVM/CLR JITter? – GregRos Jul 18 '12 at 15:57
  • Bytecode requires parsing as well, unless it's in a fixed-size format. – Mason Wheeler Jul 18 '12 at 16:16
  • @GregRos In simple cases (i.e. benchmarks), yes. In real-world apps, it's hard to compare, but it can be a couple of times faster than CPython. I guess it's not head-to-head with JVMs and the CLR, but that's hardly surprising since the problem is significantly harder and the budget (and in case of JVM, development time) is much smaller. –  Jul 18 '12 at 17:56
  • @MasonWheeler Yes, but many bytecode formats are fixed-size, and those that aren't are trivially "parsed" with a loop and a switch statement. It's multiple orders of magnitude easier than parsing a fully-blown programming language. –  Jul 18 '12 at 17:58
0

Typically, interpreted languages don't lend themselves to full-on static analysis (e.g. static type checking across multiple modules) and optimization. Compiling to bytecode can provide this. OTOH, interpreted languages can run even if some parts (libraries, etc.) are missing, because references are late-bound. IOW, if you don't actually call the code, it doesn't matter whether it's there or not.
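Late binding is easy to see in Python: a reference to a name that doesn't exist anywhere only fails if the line referencing it actually executes (a minimal sketch; `nonexistent_helper` is deliberately undefined):

```python
def maybe_broken(flag):
    if flag:
        # this name is never defined anywhere
        return nonexistent_helper()
    return "fine"

# defining and calling the function succeeds as long as
# the broken branch never runs
print(maybe_broken(False))  # fine

# the missing reference is only resolved at call time
try:
    maybe_broken(True)
except NameError as e:
    print("failed at call time:", e)
```

A statically compiled language would reject the program at build time; here the error surfaces only when (and if) the dead reference is reached.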

TMN
  • 11,313
  • 1
  • 21
  • 31
  • Actually, *dynamic-ness* (most prominently dynamic typing) makes static analysis hard, but that should be a dead giveaway just from looking at the terms. A language is amenable to analysis or not; how it's implemented is of no interest. Likewise, when a language is just hard to analyze statically, compiling to equally hard-to-analyze bytecode won't help. –  Jul 18 '12 at 18:01