If python compiles to assembly and an OS is written in it, will it compete favorably with C in benchmarks?

Question

Ok, I have used the word python in the question, but it well could be language agnostic in that: If a language X has a well optimised compiler targeting assembly and an OS is written in that language, then will it compete favorably with C in benchmarks?

This comes from two conceptions I have (could be wrong):

Languages are defined by grammar rules (syntax-semantics); they themselves are independent of performance. Performance is a function of implementation.
C is often efficient because the most popular OS(es) are written in it - so limited wrapping and unwrapping - and it compiles to (these days) well optimized assembly code.

And, yes, before some of us pull out guns, I understand and appreciate that in certain situations some languages may outperform C, but by and large C does better than most languages in most situations.

It would be *really* hard to implement a complete dynamically typed language compiler which would statically eliminate all the overhead of the dynamic typing. — SK-logic, Jul 20 '12 at 17:59
Using more number of loops, nested loops, recursion, complex logic and functionality will slow down any program, written in any language. — pandu, Jul 23 '12 at 06:55
*C is often efficient because the most popular OS(es) are written in it*: I would say this is the other way round: Most popular OS(es) are written in C because it is efficient for that task. — mouviciel, May 11 '22 at 09:59
OSes (and most compiled programs) are written in machine code. If you can generate the same machine code you will of course get the same performance. — user253751, May 11 '22 at 10:03

score 19 · Answer 1 · answered Jul 20 '12 at 17:53

19

Theoretically speaking, for any idealized program there will be some set of assembly instructions that is the most efficient way to enact that program. In theory, any language could potentially be compiled to these ideal machine instructions, given a smart enough compiler.

Practically speaking, no, there is no compiler that would turn python into machine code that is as fast as an optimized C program doing the same thing. What's more, it's unlikely that such a compiler can be written, because python doesn't have primitives for low level things like pointers. In order to make the code optimally fast, the compiler would have to figure out the intentions of the code before it could translate it into low level commands, which is a ridiculously hard problem.

That's not to say it can't come close. Maybe.

answered Jul 20 '12 at 17:53

CG Morton

191
3

2

I wouldn't say it couldn't be written but it would take a tremendous effort to write an intermediary that can interpret what to do with those values, low level pointers, etc. It COULD be done but I suspect it would take a tremendous effort and compile times would be pretty slow with all of the analysis... – Rig Jul 20 '12 at 17:55
1

@Rig In effect, the complier would have to run the program with a significant amount of debugging and profiling to see how a given variable is used and then select the most optimal representation of that (is this string acting as a char*? or is it really just an enum as an index into a hash?) And then one gets a language such as [befunge](http://en.wikipedia.org/wiki/Befunge) or some parts of [perl](http://www.jeffreykegler.com/Home/perl-and-undecidability) that make it a very, very difficult problem. – Jul 20 '12 at 18:06
Difficult but not impossible :P – Rig Jul 20 '12 at 18:39
Doesn't Scala analyze the code to decide if an Object such as Int should be treated as an Object or Primitive? – Paul Jul 20 '12 at 19:43
@MichaelT: I've heard of a Scheme compiler ([Stalin Scheme](http://en.wikipedia.org/wiki/Stalin_(Scheme_implementation))) that essentially does just that. Scheme is an incredibly flexible and dynamic language, but given enough analysis and access to the whole program at compile time lets you produce *very* efficient code. Of course, Satlin was just a research project, but I imagine something like that could also be written for Python. It would, however, be prohibitively difficult to actually write. – Tikhon Jelvis Jul 20 '12 at 21:38
@TikhonJelvis You might want to poke at the perl link I gave above which shows that to determine how it is to be parsed prior to running it is equivalent to solving the halting problem. While such an approach might work for one language because of how that language is designed, it doesn’t necessary work for all. – Jul 20 '12 at 21:48

score 1 · Answer 2 · 2012-07-20T20:22:12.807

Instead of compiling to assembly, lets take a more practical approach (that do exist).

There are programs such as f2c and p2c that compile fortran and pascal to c. The question is then "can these compilers write better code than you can?" In p2c (for example), it is necessary to write additional code in C to handle string processing.

This translates to assembly too. Any time there is something that the language is doing for you, under the covers, there is the likely hood that a programmer with understanding of the structures and algorithms necessary for that would write more compact (and faster) code in C.

(edit)

Lets consider a dynamic typed language that has strings and ints, a "+" operator that is thoroughly overloaded and a print command.

var foo = 3;
var bar = "a";
var qux = bar + foo;
print qux;

The C version of this program would be:

#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
    int foo = 3;
    char *bar = "a";
    char qux[3];
    sprintf(qux,"%s%d",bar,foo);
    printf("%s",qux);
}

(Yes, I know that isn't optimal and is a contrived example) Which compiles to the following assembly on my machine (gcc -S -O9 foo.c)

    .file   "foo.c"
    .section    .rodata.str1.1,"aMS",@progbits,1
.LC0:
    .string "a"
.LC1:
    .string "%s%d"
.LC2:
    .string "%s"
    .text
    .p2align 4,,15
.globl main
    .type   main, @function
main:
.LFB23:
    .cfi_startproc
    pushq   %rbx
    .cfi_def_cfa_offset 16
    movl    $3, %ecx
    movl    $.LC0, %edx
    movl    $.LC1, %esi
    xorl    %eax, %eax
    subq    $16, %rsp
    .cfi_def_cfa_offset 32
    movq    %rsp, %rdi
    .cfi_offset 3, -16
    call    sprintf
    movq    %rsp, %rsi
    movl    $.LC2, %edi
    xorl    %eax, %eax
    call    printf
    addq    $16, %rsp
    .cfi_def_cfa_offset 16
    popq    %rbx
    .cfi_def_cfa_offset 8
    ret
    .cfi_endproc
.LFE23:
    .size   main, .-main
    .ident  "GCC: (SUSE Linux) 4.5.1 20101208 [gcc-4_5-branch revision 167585]"
    .section    .comment.SUSE.OPTs,"MS",@progbits,1
    .string "Ospwg"
    .section    .note.GNU-stack,"",@progbits

At this point, the question is what would it take to write a compiler that would be able to analyze the dynamic code and generate that assembly. While it might be possible with that extremely limited language, once you start adding more complexity to the language, a translation into something closer to the machine requires more and more code to handle the dynamic typing, or its own runtime (objective C takes this approach) - in either case, it will be slower than something written in C that doesn't need to have that overhead.

Or, the compiler has analyzed all the execution paths of the code (for the dynamic language) which I believe is equivalent to solving the halting problem.

Someone can write "more compact (and faster) code in C." Why? — check123, Jul 20 '12 at 17:23
Also the case you present is a little tricky because such 'intermediate' compilers are yet to see maturity in generating well optimized code. — check123, Jul 20 '12 at 17:24
@check123 Consider taking a perl scalar into assembly. One would need to check if it is to be used in a string context, or a float context, or an int context, or etc... For memory management similar issues come up - could a C programmer with the understanding of the domain better manage the memory with free/malloc than a language with a garbage collection? Languages that make it easier to write code inherently do things behind the programmer's back which are not necessarily optimal for all situations compared to the C code the programmer would write to do the same. — , Jul 20 '12 at 17:33

score 1 · Answer 3 · answered Jul 20 '12 at 18:31

In general, most block-structured statically-typed languages will compile into very similar machine code (most machine architectures provide support for HLL concepts like for loops and switch statements). The problem with Python, as others have noted, is its dynamic nature. This means more run-time checks and a bigger run-time library, which means slower execution, there's just no way around that. The doesn't mean it's not possible to write an OS in it, but you're never going to get the raw performance that simpler languages like C or FORTRAN can provide.

What would be interesting to see would be "systems Python", with support for pointers and possibly with some restrictions on the language's dynamic features to permit more efficient compiled forms. I'm not familiar enough with Python to know how feasible that is, but there have been similar things done before.

score 0 · Answer 4 · answered Jul 20 '12 at 18:34

No, it's never going to happen. Even if you performed a perfect reduction from Python to machine code, there are still things like Python creates all objects on the heap and such which must be translated for the program to execute correctly which are slower than a C program placing some or many objects on the stack. In addition, there will be dynamic types and lookups which cannot be statically inferred, for example.

It's quite impossible.

score 0 · Answer 5 · answered Aug 23 '17 at 19:48

When you use Python as an example, are you considering both the language and its compiler/interpreter to produce an operating system? Or you are writing a compiler to convert your high level language into bare metal opcode?

Such high level languages are usually managed in a sense that you never handle garbage collection, pointers, physical addresses, etc. Which make it hard (for me at least) to imagine how to produce an operating system directly communicating with the machine.

Unless you are writing some high level operating system consuming services produced by some underlying hardware abstraction layer, and a physical address manager, which sounds like a smart idea; having some modular operating system partially written in C, partially in R, and the rest in Python

score 0 · Answer 6 · answered Aug 23 '17 at 22:43

Generally I don't think a compiled Python program will outperform an equivalent C program (although there might be some exceptional situations). But the reasoning might not be what you expect.

Summary

Not all compiled languages are equal regarding performance.
Mixed language between OS and application doesn't matter much.

Reasoning

You write:

This comes from two conceptions I have (could be wrong):

Languages are defined by grammar rules (syntax-semantics); they themselves are independent of performance. Performance is a function of implementation.

C is often efficient because the most popular OS(es) are written in it - so limited wrapping and unwrapping - and it compiles to (these days) well optimized assembly code.

You're not completely right.

As others already pointed out, Python's dynamic nature implies some overhead in the resulting assembly / machine instructions.

So, for some languages (e.g. Python or Java), to fulfill their semantics the resulting machine code must include additional overhead instructions that a lower-level language like C doesn't need. So languages are not "independent of performance".

You mention the relationship of application language to OS language. This matters only at the interface between application code and OS calls when it's necessary to pass parameters and data structures between application and OS. If they a really conceptually different, you might experience a slow-down of system calls because of necessary data conversions.

But as long as your program is computing and not doing I/O, it doesn't call the OS, so there's no need to convert anything, and for your application, the OS is just a library that you might call when you feel the need to do so. The rest of the time, it just occupies memory, but no CPU resources.

So why is C a good language for an OS and Python not?

The reason for C-based OSes is not to best support C-written applications, but because an OS must interact with the machine hardware on a low level, and here C can do almost everything that assembly can do (e.g. using pointers representing absolute addresses). Python's concepts don't support this low-level view of your machine that an OS must deal with.

So, for every layer of the overall software, you should choose the appropriate language (and make sure they can interact without too much pain). For an OS, C is a good choice, and for applications you can choose whatever best fits the application's needs.

And one last statement on performance: 90% of wasted-performance cases exist because of poor algorithms or poor programmer's skills, easily losing factors of 10 or 100 when compared to a proper solution, whereas a compiled language might lose a factor of maybe 3 against C.

score 0 · Answer 7 · answered May 11 '22 at 06:34

Python is intrinsically slower than C. For example, addition in C can compile into a single machine instruction, while Python needs to perform a runtime lookup to check if any of the operand types has overloaded addition.

Even the theoretical "sufficently smart compiler" would not be able to compile this overhead away in all cases, since overloads can be can be added and altered at runtime triggered by non-deterministic factors like user input.

So even with a super-optimized compiler, Python would never be as fast.

So while you are theoretically correct that performance is a function of implementation rather than the language itself, the semantics of a language may make it more or less amenable to fast code. C was deliberately designed to avoid unnecessary overhead while Python prioritize flexibility and ease of use above raw speed.

score 0 · Answer 8 · answered May 12 '22 at 00:37

0

This may be possible if you write in a restricted version of Python that can be easily compiled/optimised to machine code.

Python as it's commonly used are very dynamic, and it contains a number of constructs that are hard to optimise for a compiler.

For example, iterators are very hard to optimise as there need to be code generator for every type of iterable data types. But if you restrict your constructs to just using for-range iteration and list iteration, that can be much more easily optimised.

Once you've defined a subset of Python and wrote an optimising compiler for that subset, then you'll likely end up with something that looks very similar to RPython that's used by PyPy project. It's basically a language that feels just like writing C but with a superficially Python-like syntax.

answered May 12 '22 at 00:37

Lie Ryan

12,291
1
30
41

You don't need to restrict the language. You check at runtime. If that utterable data type used for-range iteration 10 times in a row, then you compile to one check + for-range iteration. If one the 11 time you use a different iteration, you compile code with two checks. But at that point you can't compare to C anyway, because you couldn't do this in C. – gnasher729 May 12 '22 at 09:22
You're basically describing JIT compilation. A JIT is a possible way to get faster code for dynamic language, but JIT compilation has runtime overhead for the guard validations and all the recompilation it has to do, also it makes it very easy to accidentally use inefficient constructs, and JIT won't help with short lived processes. All in all, while JIT Python like PyPy is much faster than CPython, and can get quite close to C, it's quite unlikely to "compete favorably" to well optimised C code. – Lie Ryan May 12 '22 at 11:16

gnasher729 · Answer 9 · 2022-05-12T09:20:02.123

There are JavaScript implementations using JIT compilers that come pretty close to C efficiency. Enough to run a word processor or spreadsheet in a browser. (Apple does this by switching from interpreted code to quickly compiled code to optimised code to code compiled with clang at highest optimisation).

That can be done because things that could be in principle dynamic are not in practice. Say x could have any type, but the last million times you checked it was an integer. So you compile code that checks once that it is an integer and then runs optimised code for this - if it’s a string at the billionth check then you recompile the code.

Same thing could be done for python.

PS. It doesn't matter one bit what language the OS is written in. That would affect all code running on that OS in exactly the same way.

score -1 · Answer 10 · answered Jul 20 '12 at 17:22

-1

I just read an article saying that llvm is designed to have a front-end for any language (ie Python) and has a common optimizer/generator. So you could write a front-end... So it could be as fast. (even faster if you believe HotSpot numbers you can get even faster)

Java is actually a pretty good example. It's normally byte-code, but there are several ways to compile it to native code.

What you might run into is the problems with accessing devices, memory management, interupts, and such without adding some system specific additions.

I think you could take Linux as an example. There is a bunch of code that is common across platforms. That could probably be written in anything. But there is also system/device specific code. That would need a more system level language. And yes, you could probably find a way in Python to handle that.

answered Jul 20 '12 at 17:22

Paul

730
3
13

1

Hmm.. Then does direct memory access (pointers) make C a more system-level language? In case, if a language X has support for pointers, will your answer change? – check123 Jul 20 '12 at 17:28
Java is an irrelevant example - it is a statically typed language. – SK-logic Jul 20 '12 at 18:12
@SK-logic: One might argue that Java is more of a dynamic language when it comes to its implementation though, given how it uses an intermediate interpreter and JIT compiler. – hugomg Jul 20 '12 at 19:32
@check123 yes, if the obstacles could be overcome. I don't think Python would need to have direct pointers as implied since it handles that behind the scenes. Memory management in this case is MMU/virtual memory space type work. – Paul Jul 20 '12 at 19:45
@SK-logic The point about java is that it is possible to improve performance with modern technologies.... – Paul Jul 20 '12 at 19:46

score -6 · Answer 11 · answered Jul 20 '12 at 17:23

-6

C compiles to machine, not assembly. If python, or any other language, also compiled to the machine codes, then it would be just as fast (this also depends on how well the compiler optimizes, though).

answered Jul 20 '12 at 17:23

Travis

1

Thanks for the down vote. Now please tell me why my answer is wrong. I'm always up for learning something, and avoiding giving a bad answer in the future. – Travis Jul 20 '12 at 17:42
2

Just because something compiles to assembly does not mean that the assembly written is optimal - even with a compiler that can optimize it. There is additional information in a statically typed language (this is an int, this is a char*) that would require additional code around it to coerce the data from one type to another as necessary. – Jul 20 '12 at 18:02
Ignoring some of the misconceptions in the OP, the question is "if python were compiled to machine instructions directly executable by the cpu (like C), would it match C's performance?". The simple answer is: Yes. – Travis Jul 20 '12 at 18:11
3

@Travis, assembly language is a syntax sugar for the machine code, you cannot oppose one to another. So your answer is just wrong. And needless to mention that the "Yes" part of your answer is pretty wrong too. – SK-logic Jul 20 '12 at 18:14
A cpu can't understand assembly, that's why it is compiled as well. And there is not _always_ a 1:1 correspondence between assembly and machine instructions. [link](http://programmers.stackexchange.com/questions/156613/is-this-an-assembly-language) – Travis Jul 20 '12 at 18:28
@Travis, wrong again. Assembly is a *representation* of a machine code. With some metadata added. And, you probably would be surprised, but a majority of the C compilers are generating a readable assembly (which is further processed by an assembler), not machine code. – SK-logic Jul 20 '12 at 18:58
3

"The simple answer is: Yes.": Then why is it that nobody took the time to write such a compiler? All Python programmers would profit from it. – Giorgio Jul 20 '12 at 19:36
They already do profit, because python itself is not written in python. It is written in C, and thus already compiled to machine code. Now I'm way off topic. So I'll concede defeat: there must just be some fundamental property of every language other than C that makes it impossible to compile directly to cpu instructions. I know, it's a mystery of science. – Travis Jul 20 '12 at 20:17
6

Travis, you don't understand. Just because CPU instructions are the outcome does not mean that they execute in a reasonable time, at all. There are *numerous* properties of Python which do not exist in C which would severely affect the speed of the resulting instructions. There probably *isn't* a reason why Python cannot compile to machine code. It just still wouldn't be anywhere near as fast as C. – DeadMG Jul 20 '12 at 20:37
It must be that I don't understand. Because the question asked already assumes a "well optimized compiler targeting assembly". He has already jumped to theory, bypassing everyone's attempt at explaining why, practically, this can't be done. And I'm responding directly to the theoretical. I don't care about the practical. Edit: this is why my answer is right. I'm confused as to why everyone else is confused. – Travis Jul 20 '12 at 20:42
The point is that it is not clear that a "well optimized compiler" can be done, even in theory (there are limitations about what can be done algorithmically) and certainly in practice. – Andrea Jul 20 '12 at 21:09
@Travis: To put it very roughly: Python's semantics is further from machine code than C's semantics. Therefore a compiler can map C to machine code with less overhead. That's why compiling C to machine code you can get a faster binary than by compiling Python code. – Giorgio Jul 20 '12 at 21:33
@Travis, wrong again, there is a Python written in Python (and compiling it into native code at the same time): PyPy. – SK-logic Jul 20 '12 at 21:37
+"If python, or any other language, also compiled to the machine codes, then it would be just as fast (this also depends on how well the compiler optimizes, though).": You can compile Python to machine code, but the compiled code must still simulate the Python machine which is more abstract than the C machine and therefore involves more overhead. – Giorgio Jul 20 '12 at 21:50

If python compiles to assembly and an OS is written in it, will it compete favorably with C in benchmarks?

11 Answers11