18

I am taking an introductory course on Python, and the instructor says that Python is a high-level language and that C and C++ are low-level languages. It's just confusing. I thought that C, C++, Python, Java, etc. were all high-level languages.

I was reading questions on Stack Overflow about C, C++, etc., and they all seem to refer to those languages as high level. It seems to me that some programmers use those terms interchangeably.

ratchet freak
atheistlearner
  • Like many things, high vs. low level is a simplification - useful for understanding, but potentially misleading if you forget that it's a simplification. Which level a language sits at is certainly relative, as others have said. But it's not necessarily a line - there are different directions that you can abstract in (e.g. different paradigms). Just because you're moving further from the machine abstraction doesn't necessarily mean you're moving towards an appropriate abstraction for your application. –  May 05 '13 at 00:36
  • Even the starting point may vary. For example, IMO the lambda calculus is a very low level of abstraction - plenty abstracted from the machine, but it's a very simple abstraction that serves as the starting point for functional languages to build abstractions on top of. In any case, lambda calculus is likely no closer to the ideal abstraction for any particular application than machine code. –  May 05 '13 at 00:40

4 Answers

33

High level and low level are relative terms, so the usage has changed over time. In the 70s, UNIX made waves because it showed that an operating system could be written primarily in a high-level language: C. At the time, C was considered high level in contrast to assembler.

Nowadays C is considered a low-level language because neither the language nor the standard libraries provide any of the bread-and-butter data structures like vectors, dictionaries, iterators, and so on. You can have all those structures in a C program, but you'll end up writing them yourself. Python, Java, etc. are high level relative to C because many of those standard data structures are built into the language or are part of the standard libraries. Having those right out of the box makes it easier to program at a more abstract level.
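To make the "out of the box" point concrete, here is a minimal sketch in Python. The function name `word_counts` and the sample sentence are made up for illustration. Doing the same in C would first mean writing (or borrowing) a hash table.

```python
def word_counts(text):
    """Count how often each word occurs, using only built-in types."""
    counts = {}                      # dict: a built-in hash map
    for word in text.split():        # str.split returns a built-in list
        counts[word] = counts.get(word, 0) + 1
    return counts


counts = word_counts("the quick fox jumps over the lazy dog the end")

# Iterating over the dict's items needs no hand-written iterator either.
most_common = max(counts.items(), key=lambda item: item[1])
print(most_common)
```

Nothing here allocates or frees memory by hand, and the dictionary grows as needed.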

C is low level in a second sense: it enables direct manipulation of the computer hardware (at least as direct as the OS will allow). The most common implementations of Python, Java, etc. are at least one step further removed from the hardware because they run in a VM. If you want to manipulate the hardware from Python, you'll have to write an extension to the Python VM, usually in C or C++.
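CPython ships one such extension in its standard library: `ctypes`, which is itself implemented in C. The sketch below is only an illustration of reaching below Python's usual abstraction; it touches the process's own memory, not a hardware device, and the variable names are arbitrary.

```python
import ctypes

value = ctypes.c_int(5)              # a C int stored in raw memory
address = ctypes.addressof(value)    # its numeric memory address

# Reinterpret that address as a C int and write through it,
# roughly what `*(int *)address = 7;` would do in C.
alias = ctypes.c_int.from_address(address)
alias.value = 7

print(value.value)
```

Writing through `alias` changes `value`, because both objects refer to the same bytes. Passing a wrong address to `from_address` can crash the interpreter, which is exactly the kind of risk the higher-level parts of Python normally hide.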

C++ is an odd case. It provides tons of nice data structures as part of the standard library, but it also allows low-level manipulation of the hardware.

Charles E. Grant
  • C++ isn't really that odd a case, IMO - it's simply a mixed-level language. The level of abstraction you get depends on which features you use. –  May 05 '13 at 00:27
  • @Steve314: Yes and no: normally abstraction comes with information hiding, i.e. a language or a library is like a black box that provides an interface, and no one wants to know what is inside the black box. C++ is a bit odd in this because it offers higher-level constructs but does not prevent the programmer from accessing their representation and breaking them. C++ is the only language I know of that does not isolate different abstraction layers (but maybe there are other languages I do not know of). – Giorgio May 05 '13 at 15:18
  • @Giorgio - C++ allows you to hide *any* implementation detail - e.g. make it part of the private internals of a class so the only official way to use it is via the public interface of that class. Of course you can break the rules and scramble your memory all you want - but in practice you can do that in *any* language that supports real-world application development. –  May 05 '13 at 15:53
  • @Giorgio - Take for example Haskell. "Unsafe" in that case tends to mean not-referentially-transparent (as in `unsafePerformIO`). There are `IORef` types, but there's no equivalent of `reinterpret_cast` I know of, and no equivalent of pointer arithmetic. But that doesn't mean it's safe from people hacking around with memory. In order to be a practical language, Haskell has to interface with real-world operating systems and libraries. It has a "foreign function interface". If I really want to subvert it, all I need do is use the FFI to write the primitive subversion functions. –  May 05 '13 at 15:58
  • @Giorgio - Of course I may have a hard time *finding* the values I want to corrupt in memory, but the same can apply in C++, depending on how well I've hidden them. For example, I might use a [PIMPL](http://www.gamedev.net/page/resources/_/technical/general-programming/the-c-pimpl-r1794). If I then only provide the object code and header for the library that understands what that points to, the would-be subversive has to reverse-engineer that object code to figure out what to subvert and how. –  May 05 '13 at 16:01
  • @Giorgio - in short, some languages make it a lot harder to break abstractions, but it's never impossible - it's not really what's meant by the [law of leaky abstractions](http://www.joelonsoftware.com/articles/LeakyAbstractions.html), but it's a variant of the same thing. –  May 05 '13 at 16:08
  • @Steve314: Then one could say that if language X makes it harder to break some abstraction A than language Y, then X is higher-level than Y with respect to A (provided that both languages offer the same abstraction A). – Giorgio May 07 '13 at 12:49
  • @Giorgio - IMO the height of an abstraction and its leakiness are two different things. Emulators are a common example of a low-level but (hopefully) non-leaky abstraction. The abstraction is low level because it's still a machine, just not the same machine it's running on - abstracting sideways (or even arguably downwards, to a simpler machine model) rather than upwards. In fact to be air-tight, an abstraction must be trivial (all non-trivial abstractions leak) and therefore cannot be significantly higher level than the abstraction it's built upon. –  May 07 '13 at 13:36
  • @Steve314: "IMO the height of an abstraction and its leakiness are two different things": To abstract means to "take away" (from Latin abs-trahere) the details that are not essential and considered noise. When using a leaky abstraction (or leakier, if you want to treat it as a relative notion), a programmer must consider more details about the underlying machine because these details have not been hidden properly (have not been abstracted away). – Giorgio May 08 '13 at 23:34
  • So abstraction consists in (1) building a new structure (API, network protocol, memory model, database model, whatever) on top of existing ones, and (2) hiding the implementation so that the user can access the abstraction without seeing its implementation (background noise). If you leave out step 2 (less encapsulation = more leakiness), you have less abstraction. – Giorgio May 08 '13 at 23:37
  • @Giorgio - If all else is equal, of course the leakier abstraction is lower-level, but IMO that only works for comparing otherwise-same-level abstractions. I consider how different the new abstraction is from the underlying one (and how close it is to what the application needs) to be the key point. By "two different things" I didn't mean to imply there's no relationship. In fact I stated a relationship myself - an abstraction cannot be both leak-free and significantly higher-level than the abstraction it's built on - but that doesn't mean one fully (or even mostly) determines the other. –  May 09 '13 at 01:59
8

Think of this in terms of a sliding scale, from LOW-level languages all the way through to HIGH-level languages. As a language moves up the scale, from LOW to HIGH, the language provides more and more abstraction from the specific interface with the computer.

LOW-level languages are written to explicitly direct the computer - think machine code and assembly code.

HIGH-level languages attempt to abstract away the nitty-gritty details (particularly memory allocation and release of memory). The idea is to provide a more "natural" interface to programming and hopefully allow the programmer to focus on design and production.

These days, C is regarded as a LOW-level language. It still has some significant abstractions over machine code and assembly code, so it is technically 'higher' than these. However, it still provides direct memory addressing and does not provide garbage collection, so these are details a programmer must design for.

Compare this to other languages such as Python, Ruby or Haskell and you have a much more abstract interface. These languages come with large libraries of code that abstract away most of the direct commands to the computer. Ever wondered what happens to a variable in Python when you leave the local scope of a function, or delete it? Probably not, right? And that is because in a HIGH-level language you don't have to! It looks after memory allocation and release for you.
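If you are curious anyway, the release can be observed from inside Python. This is a small sketch, assuming CPython's reference counting; the class name `Buffer` is just a placeholder. It uses `del`, and leaving a function's local scope has the same effect on the function's local names.

```python
import gc
import weakref


class Buffer:
    """Stand-in for any object that occupies memory."""


buf = Buffer()
# A weak reference lets us observe the object without keeping it alive.
observer = weakref.ref(buf)
alive_before = observer() is not None

del buf          # drop the only strong reference; no free() call anywhere
gc.collect()     # not needed on CPython, but helps on implementations without reference counting
alive_after = observer() is not None

print(alive_before, alive_after)
```

On CPython the object is reclaimed as soon as the last reference goes away, so the weak reference ends up pointing at nothing. Other Python implementations may reclaim it later.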

HIGH-level languages have the advantage of functionality. They allow us to design and develop freely (and safely!).

LOW-level languages have the advantage of speed in most cases. There is a cost to interpreting HIGH-level code. Plus, it is kinda cool to write something in 'computer speak'.
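One way to see where that cost comes from, at least on CPython, is to look at what a function is turned into. This sketch uses the standard `dis` module; the function `add` is a made-up example, and the exact instruction names differ between Python versions.

```python
import dis


def add(a, b):
    return a + b


# Each entry is one instruction for CPython's virtual machine, which the
# interpreter decodes and dispatches at run time.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]
print(opnames)
```

A C compiler would typically reduce the same addition to a machine instruction or two, while here every listed instruction passes through the interpreter first.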

Hope this helps

Nick Burns
5

High-level vs. low-level is not a black-and-white thing, but a continuous scale. The terms are used to describe how close a programming language is to the hardware; the higher the level, the more it abstracts the hardware away.

The lowest level, obviously, is binary machine code - it is the exact representation the OS loads and feeds to the CPU. Assembly is the first level of abstraction built on top of it: instead of binary code, one writes mnemonics, human-readable symbolic codes that represent binary machine instructions. This is what people used for systems programming before UNIX.

C is the next step up in the abstraction chain, bundling common patterns into flow control constructs and abstracting machine-specific instructions into platform-agnostic syntax. This last abstraction was one of the major factors that made UNIX both revolutionary and highly successful, because it meant that the same code could be compiled for any platform without any major changes.

C++ adds another layer of abstractions: it adds classes (abstracting vtables and context passing into an OOP syntax), new and delete (bundling memory allocation and variable initialization into a single construct), stricter compile-time type checking, templates (type-safe compile-time metaprogramming), and a bunch of compile-time syntax conveniences like namespaces, function and operator overloading, etc.

Python takes another big step away from the hardware. C++ still gives the programmer full control over memory allocation, and allows for direct manipulation of RAM; Python takes care of memory management for you. Additionally, instead of compiling your code to all-native machine instructions, it runs it on a virtual machine; this carries a performance penalty (which can sometimes be hefty, but usually isn't something to worry about), but it also allows for neat things that would be tricky in C++ and excruciatingly hard in C, such as manipulating functions and classes at run time, getting the names of arbitrary objects at run time, instantiating classes by name at run time, monkey-patching, etc.
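A few of those run-time tricks, as a rough sketch. The `Greeter` class and the `shout` function are invented for the example; `collections.Counter` is only used as a convenient class to look up by name.

```python
import importlib


class Greeter:
    def greet(self):
        return "hello"


# Instantiate a class whose module and name are only known as strings at run time.
module = importlib.import_module("collections")
counter_cls = getattr(module, "Counter")
counter = counter_cls("banana")

# Ask an arbitrary object for the name of its class.
class_name = type(counter).__name__

# Monkey-patch: replace a method on an existing class at run time.
greeter = Greeter()
before = greeter.greet()


def shout(self):
    return "HELLO"


Greeter.greet = shout
after = greeter.greet()   # the already-created instance picks up the new method

print(class_name, before, after)
```

All of this happens while the program is running, with no recompilation step.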

So when people divide languages into "high level" and "low level" ones, they draw an arbitrary line somewhere, and that line isn't always the same. In the 1970s, the line was between assembly and C (abstracting away platform-specific machine instructions being the decisive factor); in 1987, it may have been somewhere between C and C++; today, it may be between C++ and Java (with automatic memory management as the decisive factor).

Long story short: high-level-ness is a sliding scale, and for the three languages you mention it's C < C++ < Python.

tdammers
  • I'd say high-level vs low-level isn't one scale, but are instead two separate scales. Low-level-ness relates to how well a language relates to machine behavior, while high-level-ness relates to its ability to provide an abstraction. C# is more of a high-level language than C99, but is *also* lower-level than the language defined by the C Standard, since the behavior of e.g. using an "int" pointer to process "short" values in an array two at a time is defined in C#, but not in C99. – supercat Jun 10 '16 at 06:48
3

The line between the "low-level" and the "high-level" languages shifts over time.
For example:
Back in the early days of UNIX, C was a high-level language.
Today, C lacks structures such as mapping types (dictionaries) and iterators, which today's high-level languages like Python have. So the line has shifted, and C has now fallen into the low-level group.

Low-level Languages:
These languages are "close" to what the machine can execute (the lowest level being assembly code!).
When working with these languages, the programmer has to think about the lowest-level details, like memory management. You are close to the hardware in the sense that you have to work with it directly.

High-level Languages:
These languages take you away from the hardware, as they manage things like memory themselves. When you work with these languages, memory is still a factor (obviously), but you don't work with the hardware directly. Instead, the language manages that, keeping you away from (or higher above) the lower-level hardware interface.

pradyunsg