136

I am considering learning C.

But why do people use C (or C++) if it can be used 'dangerously'?

By dangerous, I mean with pointers and other similar stuff.

Take the Stack Overflow question Why is the gets function so dangerous that it should not be used? as an example. Why don't programmers just use Java or Python, or another language like Visual Basic?

Tristan
  • Comments are not for extended discussion; this conversation has been [moved to chat](http://chat.stackexchange.com/rooms/40912/discussion-on-question-by-tristan-t-why-do-people-use-c-if-it-is-so-dangerous). –  Jun 08 '16 at 22:37
  • 172
    Why do chefs use knives, if they can be used 'dangerously'? – oerkelens Jun 09 '16 at 08:24
  • 82
    With Great Power Comes Great Responsibility. – Pieter B Jun 09 '16 at 11:15
  • There was [a related question](http://security.stackexchange.com/questions/117059) on the Security site recently, though it was closed. – Dan Getz Jun 09 '16 at 11:55
  • 2
    Also be aware that elements of C's syntax are quite pervasive: https://en.wikipedia.org/wiki/List_of_C-family_programming_languages – TJA Jun 09 '16 at 12:10
  • Same reason I cook my own meals even though it's quicker to just buy McDonald's or a frozen dinner: sometimes I need a meal quickly, and sometimes I can take the time for a home-cooked one. – Ed Griebel Jun 09 '16 at 12:44
  • Scalpels are very dangerous if you're not careful. – John U Jun 09 '16 at 13:02
  • 1
    'Tis a poor knife that cuts one way. – Tony Ennis Jun 09 '16 at 13:25
  • @TJA Strange they don't list any ActionScript versions. – Panzercrisis Jun 09 '16 at 15:42
  • 1
    @PieterB came here from the sidebar _specifically_ to write that comment. Well done chap – MDMoore313 Jun 09 '16 at 17:45
  • 10
    Joe Blow, pontificate much? – Matthew James Briggs Jun 09 '16 at 19:45
  • 1
    It will be interesting to see if, in 50 years, C becomes the new FORTRAN. – RockPaperLz- Mask it or Casket Jun 10 '16 at 06:27
  • Something else that seems to have been largely overlooked: is this your first language, or do you have several languages already under your belt? Also, what is it that you are intending to do with C once you've learned it? – TJA Jun 10 '16 at 08:05
  • As to why I use C: it's when I need to be a control freak, but not quite down in the weeds with actual assembler code. Note that most C compilers are probably better at generating efficient code than most people are these days. – TJA Jun 10 '16 at 08:08
  • 3
    With Great Responsibility Comes Great Power – Tim Boland Jun 10 '16 at 08:31
  • 1
    I still don't know why people write 600-character comments answering the question when they could just write up an _answer_. The comment section isn't for answers. – Insane Jun 10 '16 at 08:32
  • 1
    What is the fun if there is no danger? ;-) – Marco Jun 10 '16 at 09:04
  • @BigHomie, snap! – Joseph Rogers Jun 10 '16 at 09:21
  • 5
    Because "Back In The Day" when C became the Language of Choice we were expected to be able to handle stuff like that, because we had to. Interpreted or byte-coded languages were too slow because the processors of the day were so much slower. (Today I can buy a low-end desktop PC with a **2+ GHz multi-core CPU** and 4 **GB** of memory from Dell for $279. You have NO IDEA how absolutely incredible this appears to a guy like me for whom a 4 **MHz** PC with 640 **kilobytes** of memory was bliss...). Face it - Moore's Law won. *Game.* ***Over!*** – Bob Jarvis - Слава Україні Jun 10 '16 at 14:11
  • 1
    Why do Jedi wield lightsabers if they're so dangerous? – Dan Jun 10 '16 at 14:18
  • It's an accident of history and nothing more. For the last decades, efficient languages happened to be unsafe (C, C++, etc) and safe languages happened to be inefficient (C#, Java, Python, etc). If you wanted an efficient language, you pretty much had to use an unsafe one, because there wasn't much competition. In the last few years this situation is changing, with languages that are both safe and efficient (Rust, Go, D, etc). If the new competition proves good enough, the old ones will be replaced. – Theodoros Chatzigiannakis Jun 11 '16 at 11:48
  • 6
    @Bob Jarvis: Game not over. If you think your 2+GHz, 4GB PC - or for that matter, your cluster of several hundred 4 GHz PCs with the latest CUDA GPUs, or whatever - is fast enough, you simply aren't working on hard enough problems :-) – jamesqf Jun 11 '16 at 23:03
  • 1
    Why do people drive sports cars? – Daniel R Hicks Jun 12 '16 at 12:26
  • 4
    @TheodorosChatzigiannakis Lumping C++ in with C in this context seems disingenuous to me. C++ provides lots of ways (if not necessarily the most ways) to ensure safety, but doesn't burden the user with them by default like many Other Languages do. That's why we have this catchphrase: _You don't pay for what you don't use._ In most cases, it can be as carefree/fast or as paranoid/slow as you want, or anywhere in between. It's just a question of how the programmer uses it. – underscore_d Jun 12 '16 at 15:49
  • @underscore_d Yes, C++ doesn't make you pay for what you don't use -- and whenever safety costs, it's off by default. But that's exactly what we mean when we say that the language is unsafe! Compare with Rust, which also doesn't make you pay for what you don't use, but safety is on by default -- because there is no cost to pay (thanks to the design that lends itself to powerful static analysis). And if you are willing to pay costs, then almost any language, regardless of its design, can be used in a safe manner -- including C. – Theodoros Chatzigiannakis Jun 12 '16 at 21:21
  • @DanielRHicks - Because We Can!!!!! :-) – Bob Jarvis - Слава Україні Jun 13 '16 at 02:52
  • 1
    And I'm sure it's written in the Constitution somewhere. – Daniel R Hicks Jun 13 '16 at 03:07
  • possible duplicate of [Is the C programming language still used?](http://programmers.stackexchange.com/questions/103897/is-the-c-programming-language-still-used) – gnat Jun 20 '16 at 07:51
  • see also: [What makes C so popular in the age of OOP?](http://programmers.stackexchange.com/questions/141329/what-makes-c-so-popular-in-the-age-of-oop) – gnat Jun 20 '16 at 07:51
  • Read the story "With Folded Hands". Pray. Repeat. –  Aug 10 '17 at 18:07

16 Answers

249
  1. C predates many of the other languages you're thinking of. A lot of what we now know about how to make programming "safer" comes from experience with languages like C.

  2. Many of the safer languages that have come out since C rely on a larger runtime, a more complicated feature set and/or a virtual machine to achieve their goals. As a result, C has remained something of a "lowest common denominator" among all the popular/mainstream languages.

    • C is a much easier language to implement because it's relatively small, and it's more likely to perform adequately in even the weakest environment, so embedded platforms whose vendors must develop their own compilers and other tools are more likely to end up with a functional C compiler.

    • Because C is so small and so simple, other programming languages tend to communicate with each other using a C-like API. This is likely the main reason why C will never truly die, even if most of us only ever interact with it through wrappers.

  3. Many of the "safer" languages that try to improve on C and C++ are not trying to be "systems languages" that give you almost total control over the memory usage and runtime behavior of your program. While it's true that more and more applications these days simply do not need that level of control, there will always be a small handful of cases where it is necessary (particularly inside the virtual machines and browsers that implement all these nice, safe languages for the rest of us).

    Today, there are a few systems programming languages (Rust, Nim, D, ...) which are safer than C or C++. They have the benefit of hindsight: they recognize that most of the time such fine control is not needed, so they offer a generally safe interface with a few unsafe hooks/modes one can switch to when really necessary.

  4. Even within C, we've learned a lot of rules and guidelines that tend to drastically reduce the number of insidious bugs that show up in practice. It's generally impossible to get the standard to enforce these rules retroactively because that would break too much existing code, but it is common to use compiler warnings, linters and other static analysis tools to detect these sorts of easily preventable issues. The subset of C programs that pass these tools with flying colors is already far safer than "just C", and any competent C programmer these days will be using some of them. The sketch below shows the kind of bug they catch.
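
A contrived sketch of such an easily preventable bug (an illustration, not production code):

    #include <stdio.h>

    /* The format string promises a string but receives an int. Executed
       plainly this is undefined behavior; compiled with warnings enabled,
       it never gets that far. */
    int main(void)
    {
        int count = 42;
        printf("%s\n", count);  /* gcc/clang -Wall: format '%s' expects 'char *' */
        return 0;
    }

Building with something like `gcc -Wall -Wextra -Werror` turns the warning into a hard compile error, and the sanitizers (`-fsanitize=address,undefined`) catch a further class of mistakes at run time.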


Also, you'll never make an obfuscated Java contest as entertaining as the International Obfuscated C Code Contest.

Robert Harvey
Ixrec
  • Comments are not for extended discussion; this conversation has been [moved to chat](http://chat.stackexchange.com/rooms/40913/discussion-on-answer-by-ixrec-why-do-people-use-c-if-it-is-so-dangerous). –  Jun 08 '16 at 22:39
  • 1
    "With great power, comes great responsibility" – user2762451 Jun 09 '16 at 11:23
  • 7
    "lowest common denominator" sounds disparaging. I'd say that, unlike the many far heavier languages, C doesn't enforce a tonne of extra baggage you don't need and gives you a lightning-fast base on which to implement stuff you do need. Is that what you meant? – underscore_d Jun 09 '16 at 11:57
  • 9
    "You can't really have one without the other.": actually many new languages (Rust, Nim, D, ...) attempt to. This basically boils down to matching a "safe" subset of the language with a couple "unsafe" primitives for when this level of control is absolutely necessary. However, all of those build on the knowledge accumulated from C and C++, so maybe it should be said that at the time C and C++ were developed, you could not have one without the other, and nowadays there are language that attempt to partition off their "unsafe" bits, but they have not caught up yet. – Matthieu M. Jun 09 '16 at 12:00
  • @MatthieuM. Note that a lot of this has already been discussed in the comment chain that got moved to chat. Best not to rehash all of it again. But feel free to make a suggested edit. – Ixrec Jun 09 '16 at 12:18
  • @Ixrec: Yes, I read the chat, I was just feeling that it might be useful to include it in the answer. I'll try to edit then, feel free to revert if you're not happy with it. – Matthieu M. Jun 09 '16 at 12:21
  • 6
    1) is not a valid point! C took features from Algol 68, which is a "safe" language in this sense; so the authors knew about such languages. The other points are great. – reinierpost Jun 09 '16 at 15:44
  • 1
    @MatthieuM. Nice edit; thank you. That sentence bothered me as well, because it was quite misleading. However, I do think it's important to note that Rust without `unsafe` blocks is strictly *less powerful* (in the sense of "what can be implemented") than Rust *with* `unsafe` blocks, so in some sense it really is true that some level of "unsafety" is *necessary* for certain types of "power" (though I think the word "power" is easily abused here because it's semantically overloaded). I'm guessing that, for instance, a device driver couldn't be implemented without `unsafe`. (I could be wrong.) – Kyle Strand Jun 09 '16 at 16:28
  • Obfuscated C contest would be a lot more boring without cpp. – Thorbjørn Ravn Andersen Jun 09 '16 at 21:59
  • @MatthieuM. Agreed on all counts! – Kyle Strand Jun 10 '16 at 06:19
  • 4
    @MatthieuM. You might want to look at Microsoft Research as well. They've produced a few managed OSes over the past decade or two, including some that are faster (completely accidentally - the primary goal of most of these research OSes is safety, not speed) than an equivalent unmanaged OS for certain real-life workloads. The speedup is mostly due to the safety constraints - static and dynamic executable checking allows optimizations that aren't available in unmanaged code. There's quite a bit to look at, all in all, and sources are included ;) – Luaan Jun 10 '16 at 13:13
  • 3
    @Luaan: Midori looks so awesome; I am always looking forward to Joe's blog articles. – Matthieu M. Jun 10 '16 at 13:28
  • @KyleStrand: Device drivers generally require access to the underlying hardware, which can be difficult to do in a managed language. Also, I can imagine some *very* tight loops. – Robert Harvey Jun 10 '16 at 15:59
  • @RobertHarvey Did you mean to address that comment to Luaan? I don't think I brought up managed languages....? (Rust is not managed, if that's what you're thinking....) – Kyle Strand Jun 10 '16 at 16:52
  • @KyleStrand: Ah, that's very interesting, I didn't know that. Generally, I associate "unsafe" code with C, and "not unsafe" code as managed. – Robert Harvey Jun 10 '16 at 16:58
  • 3
    @RobertHarvey Which is *exactly* why Rust is so promising! Please do yourself the favor of looking into it. I think your safe/managed dichotomy view is pretty common in the C/C++ community--for instance, [I was once explaining a problem in C++ that, I claimed, wouldn't arise in Rust](https://disqus.com/home/discussion/cjdrake/chris_j_drake/#comment-2489753913), and someone responded, "how does Rust deal with this situation? My theory - garbage-collected pointers. :-)". I believe an early version of Rust *did* have a GC, but as of version 1.0 it was long gone. – Kyle Strand Jun 10 '16 at 22:04
  • 1
    @RobertHarvey Another intriguing response from that thread -- "It really saddens me to see people blaming C++ while the compiler is doing its best (and will do better with concepts) to scream their logical mistake at them." This was from someone who claimed my C++ problem was really a "logic problem" because they simply didn't understand the solution I was trying to describe which *works perfectly fine in Rust* (but can't be implemented with the C++14 standard library). – Kyle Strand Jun 10 '16 at 22:06
  • Also, this comment thread has gotten a bit out of hand... [if only we had some way to migrate to chat](http://meta.stackexchange.com/q/96247/218334). – Kyle Strand Jun 10 '16 at 22:21
  • 2
    If only this comment thread wasn't a complete rehash of the previous comment thread that was migrated to chat days ago... – Ixrec Jun 10 '16 at 22:29
  • 3
    > *A lot of what we now know about how to make programming "safer" comes from experience with languages like C.* Completely false. We had safe(er) compiled systems languages before C. Programmers saw C and preferred it to the boring, bondage-and-discipline cruft that made them go through verbose hoops to just get to a word in memory and do what they want (if even possible at all). – Kaz Jun 11 '16 at 15:03
  • > *is a much easier language to implement because it's relatively small,* Have you seen the ISO C standard lately? – Kaz Jun 11 '16 at 15:04
  • 2
    At the time C exploded in popularity, it *was* a quite small language, and that, more than opinions about bondage-and-discipline languages, contributed to its growth. A huge number of new platforms got C because you generally need only a small amount of system-specific code to bring a C compiler up. – Russell Borogove Jun 12 '16 at 03:03
  • "Good judgment comes from experience. Experience comes from bad judgement." And, the resulting judgment often makes us needlessly cautious and overprotective of others just learning. Of course, they might end up writing code for our self-driving cars, so maybe it is *good* to be overcautious... Most accidents happen at home, in the bathroom, with kitchen in second place. Can't make everything safe apparently. Inflatable toilets and bathtubs have not caught on. –  Aug 10 '17 at 18:12
  • While it's a common conflation, "systems language" wasn't intended to be a synonym for "low-level language" or "bare-metal language" but, rather, a label for languages that are designed to manage and mitigate complexity over the full lifecycle of a large, long-lived project (typically used as infrastructure that others build on) with shifting maintainership. See [What is Systems Programming, Really?](https://willcrichton.net/notes/systems-programming/) by Will Crichton for a well-cited run-down of that. By that definition, Java and Go are both systems languages, just not bare-metal ones. – ssokolow Sep 30 '21 at 23:25
41

First, C is a systems programming language. So, for example, if you write a Java virtual machine or a Python interpreter, you will need a systems programming language to write them in.

Second, C provides performance that languages like Java and Python do not. Typically, high performance computing in Java and Python will use libraries written in a high-performance language such as C to do the heavy lifting.

Third, C has a much smaller footprint than languages like Java and Python. This makes it usable for embedded systems, which may not have the resources necessary to support the large run-time environments and memory demands of languages like Java and Python.


A "systems programming language" is a language suitable to build industrial-strength systems with; as they stand, Java and Python are not systems programming languages. "Exactly what makes a systems programming language" is outside the scope of this question, but a systems programming language does need to provide support for working with the underlying platform.

On the other hand (in response to comments), a systems programming language does not need to be self-hosting. This issue came up because the original question asked "why do people use C", the first comment asked "why would you need a language like C" when you have PyPy, and I noted that PyPy does in fact use C. So, it was originally relevant to the question, but unfortunately (and confusingly) "self-hosting" is not actually relevant to this answer. I'm sorry I brought it up.

So, to sum up: Java and Python are unsuited to systems programming not because their primary implementations are interpreted, or because natively compiled implementations are not self-hosted, but because they don't provide the necessary support for working with the underlying platform.
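
To make "support for working with the underlying platform" concrete, here is a minimal sketch (a POSIX system is assumed; this is an illustration, not a full treatment). C can ask the operating system for memory directly, with nothing between the program and the kernel interface:

    #include <stdio.h>
    #include <sys/mman.h>

    /* Request one page of memory straight from the OS, bypassing malloc
       entirely. This kind of direct platform access is routine in C, and
       awkward at best in Java or Python without dropping into native code. */
    int main(void)
    {
        void *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (page == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        ((char *)page)[0] = 'x';   /* use the raw memory */
        munmap(page, 4096);
        return 0;
    }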

Robert Harvey
comingstorm
  • 20
    "if you write a Java virtual machine or a Python interpreter, you will need a systems programming language to write them in." Uh? How do you explain PyPy? Why would you need a language like C to write a compiler or an interpreter or a virtual machine? – Vincent Savard Jun 07 '16 at 19:03
  • I'm not familiar with PyPy, but the wikipedia article claims that "Current PyPy versions are translated from RPython to C code and compiled". And, you are correct that writing a self-hosting native compiler does not require a systems programming language. However, if you read carefully, I don't believe I claimed otherwise... – comingstorm Jun 07 '16 at 19:11
  • 10
    I do believe you claimed you need a systems programming language to write a Java virtual machine or a Python interpreter when you said "if you write a Java virtual machine or a Python interpreter, you will need a systems programming language to write them in". If PyPy doesn't satisfy you, you can also look at any interpreter or compiler written in Haskell. Or really, just add a reference supporting your claim. – Vincent Savard Jun 07 '16 at 19:18
  • 5
    Has anyone written a Python interpreter or a JVM in a fully self-hosting Python compiler? My point was that industrial-strength virtual machines and interpreters are tied in closely with their host systems -- they have to be, in order to provide the performance and features that make them acceptable to their users. And, to do this effectively, as a practical matter of engineering, you need to use *some* kind of systems programming language... even if you need to remake your self-hosting Python compiler into one! – comingstorm Jun 07 '16 at 19:33
  • 1
    I understand your point and I don't disagree with it, but it isn't what you wrote. I gave PyPy as an example because it was originally written in pure Python. – Vincent Savard Jun 07 '16 at 19:36
  • 1
    There seems to be some kind of disconnect, here. That *is* what I wrote, and it is a claim that you have not actually addressed. Python and Java are not systems programming languages, so they are ill-suited to writing systems like virtual machines. And, even if PyPy, or Haskell for that matter, were fully self-hosting, you would need to do quite a lot of work to make them usable *as* systems programming languages... – comingstorm Jun 07 '16 at 19:50
  • 17
    Even if it's turtles all the way down, something like PyPy could not exist without a Python interpreter written in some other language. – Blrfl Jun 07 '16 at 20:04
  • 18
    @Blrfl But that's not really saying much. C couldn't have been written without an assembler, and you can't write assembly without hardware that implements it. – gardenhead Jun 08 '16 at 00:15
  • @VincentSavard: What are you talking about? PyPy has *never* been Python all the way down. It has *always* relied on a C layer at some point in the stack. – Kevin Jun 08 '16 at 06:18
  • 11
    If PyPy is not a "systems programming language" because a C compiler is involved someplace, then C is not a systems programming language either because an assembler is used someplace. Actually, isn't it popular to translate C into some other language these days? e.g. LLVM –  Jun 08 '16 at 06:30
  • @Hurkyl this way we may come to say that there are no systems programming languages. To compile a C program an assembler is involved someplace. Even when you write in machine code, you still need a hex(bin)-to-file converter. – Ruslan Jun 08 '16 at 14:58
  • 1
    @comingstorm "Has anyone written a Python interpreter or a JVM in a fully self-hosting Python compiler?" So the argument is that because nobody has done it, it's impossible? As a matter of fact if you look up development of old programming languages, it was quite common to write a mini compiler in assembler that could be used to translate some basic compiler already written in a stripped down version of target language and then bootstrap yourself from there. This works just as well for C as it does for Python, Java or Lisp. – Voo Jun 08 '16 at 16:19
  • 1
    I think there are two issues here: the "self-hosting" one, which I think is a distraction, and the "systems programming" one, which is the relevant one, the one I had intended to address, and frankly the one I find more interesting. I think the greater part of "systems-programming nature" is likely found in the implementation and environment, but it also requires substantial support in the base semantics and standard library of the language itself... – comingstorm Jun 08 '16 at 17:00
  • I have added a section to my answer which will hopefully clarify it, and address the resulting discussion. – comingstorm Jun 08 '16 at 17:38
  • 1
    C was designed as a systems-programming language, but compiler development, at least on the PC, is getting away from the systems-programming roots. Given `uint32_t x,*p`, an operation like `x=*p;` used to mean "use p to identify a 32-bit word and read its contents into x, at worst yielding an Unspecified Value if it `x` identifies storage the program owns but which doesn't hold anything meaningful". With today's compilers, however, such a statement could have grave consequences even if code would be equally happy with any value that might be stored into `x`. – supercat Jun 08 '16 at 20:47
  • 4
    A lot of "industrial strength" programs are written in high level language. In fact I'd argue that there are more industrial strength programs written in higher level language than written in C. The main reason why C is popular for systems programming is a combination of C being relatively simple to implement, and WinAPI and Unix system calls being specified in documentations in terms of C calls and systems calls wrappers are generally written first in assembly to support the first language ported to the system, which is usually C. – Lie Ryan Jun 09 '16 at 04:14
  • 1
    @VincentSavard Why does PyPy exist? Answer: because it was a nice intellectual exercise for someone. I don't solve killer sudokus because there's a purpose behind it. People do stuff because it seems like it'd be fun - it doesn't mean that the result has any practical purpose. – Graham Jun 09 '16 at 11:37
  • @Graham It's quite ignorant to say PyPy has no practical purpose. That being said, if you think the point I was making is about PyPy, you completely misunderstood what I was saying. – Vincent Savard Jun 09 '16 at 12:40
  • A different take on "what is a _systems_ programming language?" A systems programming language is a language you can use to write code that runs on the bare metal. You can use it to write firmware, and you can use it to write the operating _system_. – Solomon Slow Jun 09 '16 at 17:50
  • @jameslarge: I would suggest that a key aspect of a systems programming language is that the compiler should not try to guess what the programmer may or may not know. If a compiler allows malloc and friends to be replaced with programmer-supplied alternatives, or if they chain to OS library functions, the compiler should regard `int *p = malloc(12), q=p[-1]`; as defined behavior (fetching the word preceding the allocation identified by p) since it's entirely possible that the programmer might know what the function does with the word preceding the allocation even if the compiler doesn't. – supercat Jun 09 '16 at 19:39
  • @jameslarge: Unfortunately, there is an increasing gap between low-level dialects of C which are suitable for such programming, and higher-level dialects which optimize on the assumption that code will never receive inputs that would lead to actions invoking Undefined Behavior, without regard for whether the programmer would have been to accept the "natural" consequences of those actions on that particular platform. – supercat Jun 09 '16 at 19:42
  • The Jikes RVM (JVM) is written in Java. https://en.wikipedia.org/wiki/Jikes_RVM – Thorbjørn Ravn Andersen Jun 09 '16 at 22:02
  • 1
    @jameslarge You can use any language you want to write an operating system or firmware, as long as it compiles down to machine code of whatever CPU/microcontroller you are using. So I don't think that's a good distinction - unless you're trying to say the distinction is completely meaningless :) – Luaan Jun 10 '16 at 13:18
  • @Luaan, Re, "as long as it compiles down to machine code..." I don't think that is a meaningless distinction. I think that's an essential distinction. It also is necessary for the primitive data types and operations in the language to map directly onto data types, operations, and addressing modes that are supported by the hardware. I said, "a language that you _can_ use" to write bare-metal code. Maybe I should have said, "a language that you would _want_ to use..." because that would rule out a lot of languages that require heavy run-time support. – Solomon Slow Jun 10 '16 at 13:42
  • @jameslarge Well, C doesn't have what you just enumerated. C's primitive data types map on types on an abstract machine, no real machine. Operations are just a small subset of operations available on hardware, and include operations that most hardware didn't support at all. Addressing modes are again 100% abstracted away. So is C a systems language or not? There are examples of C# being used to write an operating system, both with and without the .NET runtime (including my own rather bare system). It has nothing to do with the language whatsoever. And I don't mean Systems C# either :) – Luaan Jun 12 '16 at 08:19
  • I'd just like to say that many people don't understand bootstrapping. You could write a Python interpreter in Python, and just execute it with another Python implementation. C, CL, and other languages do the same thing; of course, you need to get the _first_ implementation from somewhere, but that's a problem C has too. Oh, and Common Lisp is a really high-level language, yet you can write an OS in it: https://github.com/froggey/Mezzano . Strange? – MatthewRock Jun 12 '16 at 13:04
  • **The comments section is not for extended discussion, if you would like to discuss this answer further then please use our Chat room. Thank you.** – maple_shaft Jun 13 '16 at 11:31
34

Sorry to add yet another answer, but I don't think any of the existing answers directly address your first sentence stating:

'I am considering learning C'

Why? Do you want to do the kinds of things C is usually used for today (e.g. device drivers, VMs, game engines, media libraries, embedded systems, OS kernels)?

If yes, then yeah, sure, learn C or C++, depending on which of those you're interested in. Or do you want to learn it so you'll have a deeper understanding of what your high-level language is doing?

You then go on to mention the safety concerns. You don't necessarily need a deep understanding of safe C to do the latter, in the same way that a code example in a higher-level language might give you the gist without being production ready.

Write some C code to get the gist. Then put it back on the shelf. Don't worry too much about safety unless you want to write production C code.

Jared Smith
  • 3
    Great job answering what seems to be the real question! A great way to appreciate C/C++ and "safer" languages for all they really are is to try writing something like a simple database engine. You'll get a good feel for each of the approaches you try, and see where that leads you. You'll see what feels natural in either, and you'll find where the natural approach fails (e.g. it's very easy to "serialize" raw data in C - just write the data!; but the result isn't portable, so it may be of limited use). Understanding safety is tricky, because most of the issues may be hard to encounter. – Luaan Jun 08 '16 at 12:44
  • 1
    @Luaan exactly, learning how to copy a string using pointers gives you the idea, learning how to do so safely is another level, and depending on one's goals a perhaps unnecessary one. – Jared Smith Jun 08 '16 at 13:09
  • 2
    This was not really the question, but it was helpful for making a decision. I decided to do it. I am willing to learn more about the inner workings of everything that is built on it. And I just like programming. This way I am hoping to understand the inner workings of computers better. It is just for fun. – Tristan Jun 08 '16 at 18:57
  • 10
    I disagree with the last sentence. Learn how to do it right before developing bad habits. – glglgl Jun 09 '16 at 07:20
  • 2
    @glglgl IDK, if I read (or write) a JavaScript snippet on the web I do so with the understanding that it's not production ready: it won't have exception handling, it might be O(n^2), etc. None of that is necessary to get the point across. All of it is necessary for production code. Why is this different? I can write naive C for my own edification while understanding *intellectually* that if I wanted to put it out there I'd need to do a lot more work. – Jared Smith Jun 09 '16 at 14:15
  • 1
    @JaredSmith That's a different point. But if I learn a programming language, I slowly become accustomed to it. And if I am fluent with it and then I detect that I do some things fundamentally wrong (concerning security), it would been better to have detected this before having written thousands of lines of code which I now have to inspect whether they are safe to use. – glglgl Jun 09 '16 at 14:19
  • @glglgl fair enough. – Jared Smith Jun 09 '16 at 14:20
  • @TristanT then I wish you well of it! Glad it was helpful. – Jared Smith Jun 10 '16 at 14:05
16

It is funny that you claim C is less safe because "it has pointers". The opposite is true: Java and C# have practically only pointers (for non-native types). The most common error in Java is probably the NullPointerException (cf. https://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare). The second most common error is probably holding hidden references to unused objects (e.g. closed dialogs that are never disposed of) which therefore cannot be released, leading to an ever-growing memory footprint in long-running programs.

There are two basic mechanisms which make C# and Java safer, and safer in two different ways:

  • Garbage collection makes it less likely that the program attempts to access discarded objects (see the sketch after this list), which in turn makes the program less likely to terminate unexpectedly. As opposed to C, Java and C# by default allocate non-native data dynamically. This actually makes the program logic more complex, but the built-in garbage collection -- at a cost -- takes over the hard part.

    The smart pointers of recent C++ make that job easier for programmers.

  • Java and C# compile to an intermediate code which is executed by an elaborate runtime. This adds a level of security because the runtime can detect illicit activities of a program. Even if the program is coded insecurely (which is possible in both languages), the respective runtime in theory prevents it from "breaking out" into the system.
    The runtime does not prevent, e.g., an attempted buffer overrun in program logic, but in theory it does not allow such a bug to be exploited. With C and C++, by contrast, the programmer has to code securely in order to prevent exploits. That is usually not achieved right away but needs reviews and iterations.
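
The sketch promised above: a deliberately broken C program showing the class of bug that garbage collection rules out by design, since a collected object stays alive for as long as it is reachable:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Nothing stops this program from using memory it has already
       handed back: a use-after-free. */
    int main(void)
    {
        char *name = malloc(16);
        if (name == NULL)
            return 1;
        strcpy(name, "dangerous");
        free(name);
        printf("%s\n", name);   /* undefined behavior: dangling pointer */
        return 0;
    }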

It is worth noting, though, that the elaborate runtime is also a security risk. It appears to me that Oracle is updating the JVM every couple of weeks because of newly discovered security issues. Due to its complexity, the JVM is much harder to verify than most individual programs.

The safety of an elaborate runtime is therefore ambiguous and to a degree deceiving: your average C program can, with reviews and iterations, be made reasonably secure. Your average Java program is only as secure as the JVM; that is, not really. Never.

The article about gets() that you link to reflects historical library decisions which would be made differently today, not the core language.

  • 4
    I think the point of the original author is that in C you're able to take unchecked action on those pointers. In Java, you get a nice exception right away; in C, you might not realize you're reading an invalid location until your application state is corrupted. – Sam Dufel Jun 08 '16 at 15:28
  • 1
    Also, http://stackoverflow.com/questions/57483/what-are-the-differences-between-a-pointer-variable-and-a-reference-variable-in - it would be more accurate to say that java has "practically only references". – Sam Dufel Jun 08 '16 at 15:31
  • 5
    I've seen all kinds of interesting bugs result from attempts to avoid explicit pointers. Usually the fundamental issue is confusion between copy and reference. In C, if I pass you a pointer to something, you know that modifying it will affect the original. Some of the attempts to avoid pointers obfuscate this distinction, leading to a maze of deep copies, shallow copies, and general confusion. – Arlie Stephens Jun 08 '16 at 21:28
  • Asserting that Oracle's JVM sucks (from the standpoint of hemorrhaging exploitable security vulnerabilities), and that therefore the runtimes of managed languages in general introduce more security concerns than using a managed language avoids, is like saying that Adobe Flash is horrific as a source of insecurity and therefore any program that does video & animation playback from a web site must inherently be ridiculously insecure. Not all Java runtimes are nearly as bad as Oracle/Sun's 1990's-vintage JVM abomination, and not all managed languages are Java. (Well, obviously.) – mostlyinformed Jun 10 '16 at 05:50
  • @halfinformed Well; I was saying that a program is only as secure as its runtime, and that "your average" (read: small-ish) program can with an effort be made safer than any large runtime like a byte code interpreter. That statement seems undeniable. Whether a particular stand-alone program or a particular runtime is more or less secure than the other depends on their respective complexity and design, coding and maintenance quality. For example, I wouldn't say that sendmail is more secure than Oracle's Java VM; but qmail may be. – Peter - Reinstate Monica Jun 10 '16 at 08:50
14

This is a HUGE question with tons of answers, but the short version is that each programming language is specialized for different situations: for example, JavaScript for the web, C for low-level stuff, C# for anything Windows, etc. It helps to know what you want to build once you know programming, so you can decide which language to pick.

To address your last point, why C/C++ over Java/Python: it often comes down to speed. I make games, and Java/C# are only recently reaching speeds that are good enough for games to run. After all, if you want your game to run at 60 frames per second, and you want your game to do a lot (rendering is particularly expensive), then you need the code to run as fast as possible. Python, Java, C#, and many others run on "interpreters", an extra layer of software that handles the tedious stuff C/C++ leaves to you, such as managing memory and garbage collection. That extra overhead slows things down, so nearly every large game of the last 10 years or so was done in C or C++. There are exceptions: the Unity game engine uses C#*, and Minecraft uses Java, but they're the exception, not the rule. In general, big games running on interpreted languages are pushing the limits of how fast that language can go.

*Even Unity is not all C#, huge chunks of it are C++ and you just use C# for your game code.

EDIT: To respond to some of the comments that showed up after I posted this: perhaps I was oversimplifying too much; I was just giving the general picture. With programming, the answer is never simple. There are interpreters for C, JavaScript can run outside the browser, and C# can run on just about anything thanks to Mono. Different programming languages are specialized for different domains, but some programmer somewhere has probably figured out how to get any language to run in any context. Since the OP appeared to not know much programming (assumption on my part, sorry if I'm wrong), I was trying to keep my answer simple.

As for the comments about C# being nearly as fast as C++, the key word there is nearly. When I was in college, we toured many game companies, and my teacher (who had been encouraging us to move away from C# and into C++ the whole year) asked programmers at every company we went to why C++ over C#, and every single one said C# is too slow. In general it runs fast, but the garbage collector can hurt performance because you can't control when it runs, and it has the right to ignore you if it doesn't want to run when you recommend it does. If you need something to be high performance, you don't want something as unpredictable as that.

To respond to my "just reaching speeds" comment, yeah, much of C#'s speed increases come from better hardware, but as the .NET framework and C# compiler have improved, there have been some speedups there.

About the "games are written in the same language as the engine" comment, it depends. Some are, but many are written in a hybrid of languages. Unreal can do UnrealScript and C++, Unity does C# Javascript and Boo, many other engines written in C or C++ use Python or Lua as scripting languages. There isn't a simple answer there.

And just because it bugged me to read "who cares if your game runs at 200fps or 120fps": if your game is running faster than 60fps, you're probably wasting CPU time, since the average monitor doesn't even refresh that fast. Some higher-end and newer ones do, but it's not standard (yet...).

And about the "ignoring decades of tech" remark, I'm still in my early 20's, so when I'm extrapolating backwards, I'm mostly echoing what older and more experienced programmers have told me. Obviously that'll be contested on a site like this, but its worth considering.

Cody
  • 4
    "*C# for anything Windows*" - Oh, that's such a fallacy. And you even provide an example. Unity. AFAIK It's not written it provides C# API because the language is nice an adaptable. It's really well designed. And I like c++ more, but the credit should be given where it's due. Maybe you mixed C# with .NET? They hang out together quite often. – luk32 Jun 08 '16 at 00:10
  • 3
    "Even Unity is not all C#, huge chunks of it are C++" And? Games in Unity often use C# extensively, and have been around for quite some time now. Suggesting that C# is 'just recently reaching speeds' either needs more context, or runs the risk of being blind to this decades tech. – NPSF3000 Jun 08 '16 at 03:12
  • @NPSF3000 Usually the situation is that hardware has gotten better to the point where the inefficiencies a language imposes (or encourages in terms of its programming practices) are no longer noticed. E.g. early Java was awful and a memory hog, but then memory got cheaper and CPUs got faster and it didn't matter as much. In the end, who cares if your game does 200fps or 120fps?! – gbjbaanb Jun 08 '16 at 07:28
  • When you say _JavaScript for Web_, do you mean client-side programming with its proliferation of compile-to-JavaScript languages, or server-side programming which has at least the same options as what you refer to as _Windows_ programming? In truth, JavaScript is also suited for desktop application development (AKA _Windows_). Perhaps what you mean is that compiled languages like C# were less appropriate for the initially lighter demands of Web client-side programming. – Zev Spitz Jun 08 '16 at 10:56
  • 2
    Nearly every large game was written in the language of the engine it used: the amount of work that would have needed duplicating was so large that no other technical consideration was even worth taking into account. Rendering is indeed expensive, but nowadays that's all written in shaders and the language of the logic loop is irrelevant. – Peter Taylor Jun 08 '16 at 11:44
  • 3
    C# has always been JIT compiled (unlike Java, where your comment is correct), and it was quite capable of very similar execution speeds to C++ from the get go if you knew what you were doing. That's 2003 - not something I'd consider recent. Raw speed isn't the main issue for games (especially with programmable shaders on the GPU), there are other things that made languages like C# more or less popular at times. Two main issues are APIs (which are heavily C-oriented, and the interfacing may be expensive) and GC (mostly for latency issues, not raw throughput). – Luaan Jun 08 '16 at 12:22
  • 2
    @gbjbaanb It's not just CPUs being faster - a big deal is that C++ and C had decades to perfect their compilers and runtime, while Java started basically from zero (being designed as a multi-platform platform primarily). As the VM improved (e.g. the switch from interpreter to a JIT compiler, improved GC...), so did performance of Java applications. A lot of the edge C/C++ still has is in the "let's hope nothing breaks" approach - avoiding a lot of checks that are deemed "unnecessary". But it's still a huge memory hog - in fact, improvements to CPU usage often meant worse memory performance :) – Luaan Jun 08 '16 at 12:28
  • 2
    People forget that there is the caching issue: objects are generally larger in managed languages, and with no ability to control and predict allocations, CPU cache misses increase, leading to much worse real-life performance. JIT-compiled managed languages might reach the speed of C/C++ in matrix multiplication, but if it is games or GUIs we are talking about, they don't come close. – Cem Kalyoncu Jun 12 '16 at 12:34
  • @CemKalyoncu: On the flip side, a scanning GC makes it possible for code to manipulate references to owner-less immutable data holders more efficiently than can be readily accomplished in a reference-counted system, especially when the objects may be referenced via multiple threads. – supercat Jun 12 '16 at 22:42
  • Unless you are using managed pointers in C++, there is nothing related to reference counting in C/C++. You probably pass data vector as a reference to the thread in C++ thus you won't be using reference counting anyway. Even if you used reference counting, element access will not take longer as inlining will get rid of extra function calls. There are no checks in release code about a nullptr either. Thus only overhead will be at the entry to the thread function and at the exit. – Cem Kalyoncu Jun 13 '16 at 06:35
10

Because "safety" costs speed, the "safer" languages perform at a slower speed.

You ask why anyone uses a "dangerous" language like C or C++? Have somebody write you a video driver or the like in Python or Java, and see how you feel about "safety" :)

Seriously though, you have to be close to the machine's core memory to be able to manipulate pixels, registers, etc. Java and Python cannot do this with any performance-worthy speed; C and C++ both allow you to do it through pointers and the like, as the sketch below shows.
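
A minimal sketch of what "close to the core memory" looks like in practice. The address and layout here are hypothetical, made up purely for illustration; on real hardware they come from the device or the OS:

    #include <stdint.h>

    /* Hypothetical memory-mapped framebuffer. 'volatile' tells the
       compiler every store must actually happen, in order, because
       the hardware is watching this memory. */
    #define FRAMEBUFFER_BASE 0xA0000000u   /* hypothetical address */
    #define FB_STRIDE        1024          /* hypothetical pixels per row */

    static volatile uint32_t *const fb =
        (volatile uint32_t *)FRAMEBUFFER_BASE;

    void put_pixel(int x, int y, uint32_t rgba)
    {
        fb[y * FB_STRIDE + x] = rgba;   /* one store, straight to the device */
    }

No runtime, no bounds checks, no object headers: one store instruction. That is both the performance argument and the danger in a single line.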

Peter Mortensen
Wintermut3
  • 20
    It's not true in general that safety costs _speed_. The safest available languages do most of the checks at compile time. O'Caml, Ada, Haskell, Rust all don't trail far behind C in terms of average runtime speed. What they _do_ usually incur is significant overhead in program size, memory efficiency, latency, and obviously compile time. And, yes, they have difficulties with close-to-the-metal stuff. But that's not really a speed issue. – leftaroundabout Jun 08 '16 at 08:34
  • 6
    Also, C doesn't do what you think it does. C *is an abstract machine*. It doesn't give you direct access to anything - that's a good thing. You can't even look at modern assembly to see how much C is hiding from you - modern assembly (e.g. TASM) would have been considered a high level language back when C was developed. I'd be very happy if someone wrote drivers in a "safe" language, thank you - that would quite help avoiding plenty of those BSODs and freezes, not to mention security holes :) And most importantly, there's systems languages that are much safer than C. – Luaan Jun 08 '16 at 12:35
  • 2
  • @leftaroundabout "Close-to-the-metal" stuff is exactly my point -- it isn't a speed issue? Yes it is: the overhead in most of the languages that you list is what I am talking about. His question was why... I have asked myself that question a few times, just after I blue-screened myself or locked up a machine from a bad dereference... then I remember: because you can only do drivers, servers, and most low-level networking (packet-level stuff) with C or even C++... in other words, bare-metal stuff with a dangerous language like C/C++... – Wintermut3 Jun 08 '16 at 15:28
  • 3
    @Wintermute You really want to look up Rust before making comments about how safety features necessarily cost speed. Hell, C's low-level type system actually *inhibits* many very useful optimisations that compilers could otherwise do (particularly when considering that pretty much no large C project manages to avoid violating strict aliasing *somewhere*). – Voo Jun 08 '16 at 16:25
  • 1
    @Voo "fair cop" about Rust, and I am not really trying to be confrontational about this, but my answer was in regards to the original question... and in retrospect maybe "safe" and "speed" were not good words to choose.. but my point is because you can do things in C/C++ you cannot in other more "safe" languages--unless you know of an NVidia driver written in Rust that I don't?? I used "speed" because that is an answer I have heard over and over from C/C++ programmers since I took the Java fork in the road from C/C++ over 18 years ago... – Wintermut3 Jun 08 '16 at 16:36
  • 7
    @Wintermute Yeah the myth that you can't make C/C++ any safer without introducing a performance overhead is very persistent which is why I take this rather serious (there are *some* areas where this is certainly true [bounds checking]). Now why is Rust not more widespread? History and complexity. Rust is still relatively new and many of the largest systems written in C existed before Rust was ever invented - you're not going to rewrite a million LOC in a new language even if it were much safer. Also every programmer and their dog knows C, Rust? Good luck finding enough people. – Voo Jun 08 '16 at 16:52
  • 3
    @Voo Dogs doing C?... no wonder I have seen so much bad code out there...j/k Point taken about Rust (I just downloaded and installed it, so you may have another convert to add) BTW in regards to Rust doing it right... I could make the same argument for "D" :) – Wintermut3 Jun 08 '16 at 17:21
  • @Wintermute D certainly looks interesting, but note that although the GC is said to be optional, I've heard from numerous sources that turning off the GC more or less cripples the language. If this is the case, then Rust is probably a much better "systems" language. (Though I think we'll have to wait and see just how successful the Servo project and the Redox project are before we can be quite sure just *how* good Rust is as a systems language.) – Kyle Strand Jun 08 '16 at 17:36
  • 2
    @Wintermute In regards to Operating Systems and drivers written in Rust, you may want to check out [Redox](https://github.com/redox-os/redox). While admittedly not specifically NVidia drivers, they do appear to have VESA graphics drivers written mostly in Rust, with some bits dropping down to assembly (but do keep in mind that OSs written in C also have to do some bits in assembly as well, such as interrupt setup). – 8bittree Jun 08 '16 at 17:51
  • 2
    @wintermute I've heard lots of good things about D but never tried it, but definitely, it's not like rust has a monopoly on the idea of a safer systems language. – Voo Jun 08 '16 at 20:17
  • @leftaround One thing that's notably missing from the languages you mention is the inclusion of undefined behaviour. The fact that the C (and C++) spec allows for some behaviour to be undefined allows for some sneaky compiler optimisations, as well as some terrifying bugs in corner cases (compromising safety). Rust can use some of these optimisations because of its strong compile time checks (such as null pointer dereference), but some are more resistant to compile time checks (such as integer overflow), frustrating some optimisations (such as aggressive loop variable elimination). – James_pic Jun 14 '16 at 13:52
9

Besides all the above, there is one more pretty common use case: using C as a common library for other languages.

Basically, nearly all languages have an API interface to C.

Simple example: try to create a common application for Linux/iOS/Android/Windows. Despite all the tools that are out there, what we ended up doing was writing a core library in C and then changing the GUI for each environment (a sketch of such a core library's header follows the list), that is:

  • iOS: Objective-C can use C libraries natively
  • Android: Java + JNI
  • Linux/Windows/macOS: with GTK/.NET you can use native libraries. If you use Python, Perl, or Ruby, each of them has a native C API interface. (Java again with JNI.)
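
Here is a minimal sketch of what such a core library's public header might look like; every name in it is hypothetical, invented for illustration:

    /* core.h -- hypothetical shared core: one C ABI for every front end.
       Objective-C includes it directly, Java reaches it through JNI,
       Python/Perl/Ruby through their C extension interfaces. */
    #ifndef CORE_H
    #define CORE_H

    #ifdef __cplusplus
    extern "C" {          /* keep C linkage when included from C++ */
    #endif

    int  core_init(const char *config_path);
    int  core_process(const char *input, char *output, int output_len);
    void core_shutdown(void);

    #ifdef __cplusplus
    }
    #endif

    #endif /* CORE_H */

Each GUI layer then only has to know these three functions, no matter what language it is written in.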

My two cents,

Nito
  • One of the reasons I love to use, say, PHP, is because almost all of its libraries are, indeed, written in C — thankfully so, or PHP would be unbearably slow :) PHP is great to write sloppy code without the fear of doing anything 'dangerous' (and that's why I tend to write much more PHP code than anything else — I _like_ sloppy code! :D ) , but it's nice to know that beneath a lot of those function calls there are the good ol' trusty C libraries to give it some performance boost ;-) By contrast, writing sloppy code in C is a big no-no... – Gwyneth Llewelyn Jun 12 '16 at 22:04
8

A fundamental difficulty with C is that the name is used to describe a number of dialects with identical syntax but very different semantics. Some dialects are much safer than others.

In C as originally designed by Dennis Ritchie, C statements generally mapped to machine instructions in predictable fashion. Because C could run on processors which behaved differently when things like signed arithmetic overflow occurred, a programmer who didn't know how a machine would behave in case of arithmetic overflow wouldn't know how C code running on that machine would behave either; but if a machine was known to behave a certain way (e.g. silent two's-complement wraparound), then implementations on that machine would typically do likewise.

One of the reasons C got a reputation for being fast was that in cases where programmers knew a platform's natural behavior in edge-case scenarios would fit their needs, there was no need for the programmer or compiler to write extra code to guard against those scenarios. It was vital that any code which used pointers to access memory make certain that pointers were never used to access things they shouldn't, which would typically require ensuring that computations involving pointers didn't overflow, but it did not require paranoia about things like arithmetic overflow in other contexts.

Unfortunately, compiler writers have taken the view that since the Standard imposes no requirements on what implementations must do in such cases (laxity which was intended to allow for hardware implementations that might not behave predictably), compilers should feel free to generate code which negates laws of time and causality.

Consider something like:

    #include <stdio.h>

    int hey(int x)
    {
        printf("%d", x);
        return x * 10000;   /* signed overflow (undefined behavior) for large x */
    }

    void wow(int x)
    {
        if (x < 1000000)
            printf("QUACK!");
        hey(x);
    }

Hyper-modern (but fashionable) compiler theory would suggest that the compiler should output "QUACK!" unconditionally: in any case where the condition was false, the program would go on to invoke undefined behavior by performing a multiply whose result gets ignored anyway. Since the Standard allows a compiler to do anything it likes in such a case, it allows the compiler to output "QUACK!".

While C used to be safer than assembly language, when using hyper-modern compilers the reverse is true. In assembly language, integer overflow may cause a calculation to yield a meaningless result, but on most platforms that will be the extent of its effects. If the result ends up being ignored anyway, the overflow won't matter. In hyper-modern C, however, even what would normally be a "benign" form of Undefined Behavior (such as an integer overflow in a calculation whose result ends up being ignored) can cause arbitrary program execution.
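
A minimal sketch of the same phenomenon, small enough to verify by hand (a contrived illustration, not taken from any particular compiler's documentation):

    /* Because signed overflow is undefined, the compiler may assume it
       never happens; with optimization enabled, gcc and clang typically
       compile this function to an unconditional 'return 1'. */
    int always_bigger(int x)
    {
        return x + 1 > x;   /* "obviously" false when x == INT_MAX */
    }

In assembly language, the same comparison would simply wrap and return 0 for the maximum integer; in optimized C, the edge case is defined out of existence.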

supercat
  • 1
    even in a hyper-modern compiler, C does not bounds-check on arrays. if it did so, that would not be compatible with the definition of the language. i use that fact sometimes to make arrays with an additional pointer into the middle of the array to have negative indices. – robert bristow-johnson Jun 09 '16 at 03:07
  • 1
    I'd like to see evidence of your example producing "QUACK!" unconditionally. x certainly *can* be greater than 1000000 at the point of comparison, and later evaluation which would result in overflow does not prevent that. More so, if you have inlining enabled which allows the overflowing multiply to be removed, your argument about implicit range restrictions does not hold. – Graham Jun 09 '16 at 11:35
  • @Graham: Nothing would forbid a compiler from performing the multiplication even though the result is ignored. Further, once the multiplication overflows, nothing would forbid the implementation from doing anything whatsoever, including outputting "QUACK!". I suspect most implementations would be likely to drop the multiplication before performing the semantic analysis that could otherwise reveal that it could be used to make the call to "QUACK!" unconditional, but nothing would require them to do so. – supercat Jun 09 '16 at 17:09
  • 2
    @robertbristow-johnson: Actually, the Standard quite explicitly says that given e.g. `int arr[5][5]`, an attempt to access `arr[0][5]` will yield Undefined Behavior. Such a rule makes it possible for a compiler which is given something like `arr[1][0]=3; arr[0][i]=6; arr[1][0]++;` to infer that `arr[1][0]` will equal 4, without regard for the value of `i`. – supercat Jun 09 '16 at 17:12
  • @supercat i didn't say anything about 2-dim. `mytype _array[2*N+1]; mytype *array = _array + N; for(n=-N; n<=N; n++) array[n] = n;` is safe and will work with any legit ANSI or K&R C. it's where i can have mathematically meaningful indices. can't do that with MATLAB which results in mathematical errors if i forget to offset and de-offset the indices. for a DSP guy, the best example is the FFT, but also modeling symmetrical and acausal impulse responses is another example. (and it doesn't work for 2-dim arrays.) – robert bristow-johnson Jun 09 '16 at 19:48
  • this is **not** guaranteed to work, but it **has** worked in every development environment i have used: `mytype _array_left[N], array[1], _array_right[N]; for(n=-N; n<=N; n++) array[n] = n;` if we could be assured that the compiler allocates the sequentially-declared variables adjacently, this would be guaranteed to work and be safe. *(in both comments i should have* `typedef` *ed* `mytype` *to be some signed integer type. forgot to say that.)* – robert bristow-johnson Jun 09 '16 at 19:53
  • 2
    @robertbristow-johnson: Even if the compiler allocates arrays within a struct sequentially without gaps, that does not mean that indexing off the end of one array is guaranteed to affect the next. See https://godbolt.org/g/Avt3KW for an example of how gcc will treat such code. – supercat Jun 09 '16 at 22:06
  • i am not sufficiently familiar with the asm of whatever processor shown. if the compiler allocates arrays declared adjacently, and because C does not bounds check, i cannot see any possibility of one array not being indexed from the other. there are issues with even (2x) and 4x byte alignment. but i am not considering that. nonetheless, C does not guarantee that adjacent declared arrays are allocated in adjacent memory anyway. – robert bristow-johnson Jun 09 '16 at 23:50
  • 1
    @robertbristow-johnson: I commented the assembly to explain what it's doing. The compiler sees that the code stores 1 into s->arr2[0] and then increments s->arr2[0], so gcc combines those two operations by having the code simply store the value 2, without considering the possibility that the intervening write to s->arr1[i] might affect the value of s->arr2[0] (since, according to the Standard, it can't). – supercat Jun 10 '16 at 05:15
  • @supercat Sure, nothing stops the compiler from doing the multiply. That's optimisation. But you propose non-causal behaviour, because printing "QUACK" happens *before* the multiply, and that violates sequence points. If you're arguing that "undefined behaviour" includes "randomly jump to any area of code", then sure, that could happen. But it is categorically wrong to suggest that this is how it *should* behave. In that case, any compiler which encounters undefined behaviour "should" just as legitimately run an infinite loop printing 'wibble'. It's equally unrelated to the source code. – Graham Jun 10 '16 at 09:46
  • 1
    @Graham In my programming classes this was made very clear with examples like "send nukes to China": undefined behavior means **anything** can happen - there are absolutely no guarantees of behavior, including so-called "non-causal" behavior. Indeed printing 'wibble' is also valid. The reason it is not usually done is because it is not usually worthwhile to do so - supercat has given an example where it can legitimately appear as a side effect of super-optimization. – Mario Carneiro Jun 10 '16 at 12:08
  • @MarioCarneiro There's a big difference between "anything *could* happen" and "compiler theory says this *should* happen". I don't disagree with "anything could happen" - in practice it won't for an integer overflow, but in principle there's no guarantee, sure. But I completely disagree with supercat's assertion that according to the best theory on how to design a compiler, this is the **right** thing to do. – Graham Jun 10 '16 at 13:42
  • @Graham: I think it's a *horrible* thing to do, but it has become fashionable and programmers need to be aware of it. For some reason, when compilers got sophisticated enough to make reverse-causal inferences, compiler writers decided that rather than adding directives via which programmers could invite compilers to make inferences, they should instead make use of situations where the Standards impose no requirements (even if all mainstream implementations for mainstream CPUs would have behaved in usefully-consistent fashion). – supercat Jun 10 '16 at 14:03
  • 1
    @Graham: Many programs run in scenarios where, when given invalid (possibly malicious) input, it would be acceptable to produce meaningless output or trap, but not acceptable to allow malicious input to invoke arbitrary code. If a value would need to be bounds-checked at one location in machine code to guard against malicious code execution (and the programmer knows where that is), but would need to be checked twenty times in source to prevent all forms of UB, having a compiler decide that if the programmer doesn't validate the input in all twenty places it should omit the checks from... – supercat Jun 10 '16 at 14:43
  • 1
    ...the places the programmer *did* write them doesn't seem like a recipe for producing code which is efficient *or* safe, since the modern compiler will only offer a choice between generating code with 19 unnecessary checks and one useful one, or code which omits the vitally important check; generating machine code which includes only the check that would be necessary for the machine code to meet requirements is no longer an option. – supercat Jun 10 '16 at 14:46
  • 1
    @Graham: This [llvm blog post](http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html) about undefined behaviour explains some of the ways that compilers do take advantage of UB to make useful optimizations. There are valid reasons for this. e.g. indexing an array with an `int` loop counter on a 64bit machine would require a sign-extension of the loop counter every iteration without this. I think we might need a new language to replace C; better designed to optimize well to good asm, but without needing to be so programmer-hostile to do so. I'm hoping Rust achieves that. – Peter Cordes Jun 11 '16 at 03:29
  • @PeterCordes: All that's needed would be to have a means via which programs can specify that they require features or behaviors beyond what implementations would be required to provide, with semantics that a conforming implementations would be allowed to reject any program that specifies features or behaviors beyond the Standard's minimum requirements *but* would be required to honor the stated requirements of any program it does not reject. – supercat Jun 11 '16 at 22:39
  • 1
    @PeterCordes: I would suggest that in most cases the most useful level of guarantees with regard to overflow would be to say that integer overflow has no side-effects, but the result may behave non-deterministically as a number which is outside the range of the type, but whose "lower bits" would be correct. That would avoid the need for compilers to use sign-extension instructions, but at the same time would mean that programmers could let overflows happen in cases where the looser semantics would still meet requirements. For example, if one is trying to find objects of interest... – supercat Jun 11 '16 at 22:42
  • ...and knows that for any object of interest, `x*y < z` will yield 1, and the calculation might overflow for objects that aren't of interest, but never for objects that are, being able to let the computation overflow and be prepared to handle "false positives" may be more efficient than having to either use a longer type so it can't overflow or use an unsigned type and force the compiler to make it wrap. The only way compilers will be able to generate optimal code in all contexts will be if programmers can afford to let the overflows happen in cases where they don't care about the result. – supercat Jun 11 '16 at 22:46
  • @supercat: Hmm, yes, that might work. I was about to post a comment about the very common idiom of loops with `int` loop counters indexing arrays, when the loop end condition is an `int` the compiler doesn't know anything about. (i.e. might be negative). With `-fwrapv`, gcc has to use 32bit ops on the counter and `movsx` (sign-extend) it before using it as an index. – Peter Cordes Jun 11 '16 at 22:48
  • *The only way compilers will be able to generate optimal code in all contexts will be if programmers can afford to let the overflows happen in cases where they don't care about the result*. Ya, that's the approach Rust takes. If a calculation might wrap, [you use `x.wrapping_add(y)`](https://doc.rust-lang.org/std/primitive.i32.html#method.wrapping_add). I think it's an error if `x+y` does ever wrap (debug builds can check for this), but I haven't really learned Rust yet. – Peter Cordes Jun 11 '16 at 22:51
  • @PeterCordes: My philosophy is that the reason C got a reputation for speed is that in cases where the edge-case behavior would meet requirements without any edge-handling logic in the machine code, neither the programmer nor compiler had to generate edge-handling logic; "optimizations" become counter-productive if they make it necessary for programmers to add additional logic that would otherwise not have been needed to meet requirements. BTW, one thing I'd really like to see in C would be checked integer types with loose checking semantics: integer overflow would set a flag [perhaps errno] – supercat Jun 11 '16 at 22:53
  • ...in cases where it might cause the program to produce an arithmetically-incorrect result, and would be allowed *but not required* to set the flag in cases where an overflow occurred but could not affect results. Overflow-trapping logic in user code is often very hard to optimize because a calculation which might set a user-code overflow flag can't be optimized out *even if the calculation would otherwise be ignored*. if the overflow flag were handled by the compiler, it could know that in cases where the result of a calculation is ignored, the overflow flag can be too. – supercat Jun 11 '16 at 22:57
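
For reference, the negative-index idiom discussed in the comments above can be written in a self-contained, fully defined form: as long as every index keeps the pointer within the original array, the behavior is defined by the Standard (the element type and N here are placeholders):

#include <stdio.h>

#define N 8

int main(void)
{
    int _array[2 * N + 1];
    int *array = _array + N;   /* points at the middle element */

    for (int n = -N; n <= N; n++)
        array[n] = n;          /* array[-N] is _array[0]: in bounds, defined */

    printf("array[-N] = %d, array[N] = %d\n", array[-N], array[N]);
    return 0;
}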
5

Historical reasons. I don't often get to write brand new code, mostly I get to maintain and extend the old stuff which has been running for decades. I'm just happy it's C and not Fortran.

I can get irritated when some student says, "but why on earth do you do this awful X when you could be doing Y?". Well, X is the job I've got and it pays the bills very nicely. I have done Y on occasion, and it was fun, but X is what most of us do.

RedSonja
  • 297
  • 2
  • 4
5

What is "dangerous"?

The claim that C is "dangerous" is a frequent talking point in language flame wars (most often in comparison to Java). However, the evidence for this claim is unclear.

C is a language with a particular set of features. Some of these features may allow certain types of errors that other languages preclude (the risks of C's manual memory management are typically highlighted). However, this is not the same as an argument that C is more dangerous than other languages overall; I'm not aware of anyone providing convincing evidence on that point.

Also, "dangerous" depends on context: what are you trying to do, and what kinds of risks are you worried about?

In many contexts I would consider C more "dangerous" than a high-level language, because it requires you to do more manual implementation of basic functionality, increasing the risk of bugs. For example, doing some basic text processing or developing a website in C would usually be dumb, because other languages have features that make this a lot easier.

However, C and C++ are widely used for mission-critical systems, because a smaller language with more direct control of the hardware is considered "safer" in that context. From a very good Stack Overflow answer:

Although C and C++ were not specifically designed for this type of application, they are widely used for embedded and safety-critical software for several reasons. The main properties of note are control over memory management (which allows you to avoid having to garbage collect, for example), simple, well debugged core run-time libraries and mature tool support. A lot of the embedded development tool chains in use today were first developed in the 1980s and 1990s when this was current technology and come from the Unix culture that was prevalent at that time, so these tools remain popular for this sort of work.

While manual memory management code must be carefully checked to avoid errors, it allows a degree of control over application response times that is not available with languages that depend on garbage collection. The core run time libraries of C and C++ languages are relatively simple, mature and well understood, so they are amongst the most stable platforms available.
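
To make the response-time point concrete, here is a minimal sketch (names invented for the example) of what deterministic deallocation looks like: the memory is reclaimed exactly where the programmer writes the call, not whenever a collector decides to run.

#include <stdlib.h>

int main(void)
{
    double *samples = malloc(1024 * sizeof *samples);
    if (samples == NULL)
        return 1;

    /* ... time-critical processing of samples, with no risk of a
       garbage-collection pause landing in the middle of it ... */

    free(samples);   /* reclaimed here, deterministically */
    return 0;
}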

  • 2
    I'd say hyper-modern C is also more dangerous than assembly language or genuine low-level dialects of C which behave as though they consistently translate C operations into machine code operations, without regard for edge cases where the natural machine code operations would have defined behavior but the C Standard would impose no requirements. The hyper-modern approach, where an integer overflow can negate the rules of time and causality, seems far less amenable to the generation of safe code. – supercat Jun 10 '16 at 18:09
5

To add to the existing answers, it's all well and good saying that you're going to choose Python or PHP for your project, because of their relative safety. But somebody's got to implement those languages and, when they do, they are probably going to do it in C. (Or, well, something like it.)

So that's why people use C — to create the less dangerous tools that you want to use.

Lightness Races in Orbit
  • 8,755
  • 3
  • 41
  • 45
2

Allow me to rephrase your question:

I am considering learning [tool].

But why do people use [tool] (or [related tool]) if [they] can be used 'dangerously'?

Any interesting tool can be used dangerously, including programming languages. You learn more so you can do more (and so that less danger is created when you use the tool). In particular, you learn the tool so that you can do the thing that tool is good for (and perhaps recognize when that tool is the best tool of the tools you know).

For instance, if you need to put a 6 mm diameter, 5 cm deep, cylindrical hole in a block of wood, a drill is a much better tool than an LALR parser. If you know what these two tools are, you know which is the right tool. If you already know how to use a drill, voila!, hole.

C is just another tool. It's better for some tasks than for others. The other answers here address this. If you learn some C, you will come to recognize when it is the right tool and when it is not.

Eric Towers
  • 129
  • 5
  • This sort of answer is why questions get thrown out as "primarily opinion-based". Don't say that C has its advantages, say what they are! – reinierpost Jun 09 '16 at 15:50
1

I am considering learning C

There is no specific reason not to learn C, but I would suggest C++ instead. It offers much of what C does (C++ is largely a superset of C), with a large amount of "extras". Learning C prior to C++ is unnecessary -- they are effectively separate languages.

Put another way, if C were a set of woodworking tools, it would likely be:

  • hammer
  • nails
  • hand saw
  • hand drill
  • block sander
  • chisel (maybe)

You can build anything with these tools -- but anything nice potentially requires a lot of time and skill.

C++ is the collection of power tools at your local hardware store.

If you stick with basic language features to start, C++ has relatively little additional learning curve.

But why do people use C (or C++) if it can be used 'dangerously'?

Because some people don't want furniture from IKEA. =)

Seriously though, while many languages that are "higher" than C or C++ may have things that make them (potentially) "easier" to use in certain aspects, this isn't always a good thing. If you don't like the way something is done or a feature isn't provided, there likely isn't much you can do about it. On the other hand, C and C++ provide enough "low-level" language features (including pointers) that you can access many things fairly directly (esp. hardware or OS-wise) or build it yourself, which may not be possible in other languages as implemented.

More specifically, C has the following set of features that make it desirable for many programmers:

  • Speed - Because of its relative simplicity and the compiler optimizations accumulated over the years, it is natively very fast. Also, a lot of people have figured out a lot of shortcuts to specific goals when using the language, which makes it potentially even faster.

  • Size - For reasons similar to those listed for speed, C programs can be made very small (both in terms of executable size and memory usage), which is desirable for environments with limited memory (e.g., embedded or mobile).

  • Compatibility - C has been around for a long time and everyone has tools and libraries for it. The language itself is not picky either - it expects a processor to execute instructions and memory to hold stuff and that is about it.

    Furthermore, there is something known as an Application Binary Interface (ABI). In short, it is a way for programs to communicate on a machine-code level, which can have advantages over an Application Programming Interface (API). While other languages such as C++ can have an ABI, typically these are less uniform (agreed upon) than C's, so C makes a good foundation language when you want to use an ABI to communicate with another program for some reason.
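
As a minimal sketch of that last point (the library name and function are invented for the example), a library that wants the widest audience typically publishes a plain C interface; when the implementation happens to be C++, the header requests C linkage so that the exported symbols follow the C ABI:

/* mylib.h - hypothetical library exposing a stable C ABI */
#ifdef __cplusplus
extern "C" {               /* use C linkage when compiled as C++ */
#endif

int mylib_add(int a, int b);   /* callable from C, C++, Python ctypes, ... */

#ifdef __cplusplus
}
#endif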

Why do programmers not just use Java or Python or another compiled language like Visual Basic?

Efficiency (and occasionally memory management schemes that cannot be implemented without relatively direct access to memory).

Directly accessing memory with pointers enables a lot of neat (usually quick) tricks: you can put your grubby paws on the little ones and zeros in your memory cubbyholes directly, and not have to wait for that mean ol' teacher to hand out the toys just at playtime and then scoop them up again.

In short, adding stuff potentially creates lag or otherwise introduces unwanted complexity.

Regarding scripted languages and their ilk, you have to work hard to get languages requiring secondary programs to run as efficiently as C (or any compiled language) natively does. Adding an on-the-fly interpreter inherently introduces the possibility of decreased execution speed and increased memory usage, because you are adding another program to the mix. Your program's efficiency relies as much on the efficiency of this secondary program as on how well (or poorly =) ) you wrote your original code. Not to mention your program is often completely reliant on the second program to even execute. That second program doesn't exist for some reason on a particular system? Code no go.

In fact, introducing anything "extra" potentially slows or complicates your code. In languages "without scary pointers", you are always waiting for other bits of code to clean up behind you or otherwise figure out "safe" ways to do things - because your program is still doing the same memory access operations as might be done with pointers. You just aren't the one handling it (so you can't f*ck it up, genius =P ).
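
As a trivial, hedged sketch of what that direct access looks like (and of how little stands between you and a mistake):

#include <stdio.h>

int main(void)
{
    int data[4] = {1, 2, 3, 4};
    int *p = data;             /* p points at the first element */

    for (int i = 0; i < 4; i++)
        p[i] *= 2;             /* raw memory access, no bounds check;
                                  p[4] would compile just as happily */

    printf("%d %d %d %d\n", data[0], data[1], data[2], data[3]);
    return 0;
}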

By dangerous, I mean with pointers and other similar stuff. [...] Like the Stack Overflow question Why is the gets function so dangerous that it should not be used?

Per the accepted answer:

"It remained an official part of the language up to the 1999 ISO C standard, but it was officially removed by the 2011 standard. Most C implementations still support it, but at least gcc issues a warning for any code that uses it."

The notion that because something can be done in a language, it must be done is silly. Languages have flaws that get fixed. For compatibility with older code, this construct can still be used, but there is (likely) nothing forcing a programmer to use gets(), and in fact the function has essentially been replaced with safer alternatives.
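
A brief sketch of the replacement in question: fgets() takes the buffer size that gets() never could, so an over-long line is truncated instead of trampling adjacent memory.

#include <stdio.h>

int main(void)
{
    char buf[16];

    /* gets(buf);  -- removed in C11: it cannot be told sizeof buf,
                      so any input line of 16+ characters overflows it */

    if (fgets(buf, sizeof buf, stdin) != NULL)   /* reads at most 15 chars */
        printf("read: %s", buf);
    return 0;
}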

More to the point, the issue with gets() isn't a pointer issue per se. It's a problem with a function that has no way of knowing how much memory it may safely use. In an abstract sense, this is what all pointer issues are - reading and writing stuff you're not supposed to. That isn't a problem with pointers; it's a problem with a particular implementation that uses them.

To clarify, pointers aren't dangerous until you accidentally access a memory location that you weren't intending to. And even then that doesn't guarantee your computer will melt or explode. In most cases, your program will just cease to function (correctly).

That said, because pointers provide access to memory locations and because data and executable code exist in memory together, there is enough of a real danger of accidental corruption that you want to manage memory correctly.

To that point, because truly direct memory access operations often provide less benefit in general than they might have years ago, even non-garbage collected languages like C++ have introduced things such as smart pointers to help bridge the gap between memory efficiency and safety.

In summary, there is very little reason to fear the pointer as long as it's used safely. Just take a hint from South Park's version of Steve "The Crocodile Hunter" Irwin -- don't go around sticking your thumb in crocs' bumholes.

Anaksunaman
  • 127
  • 4
  • 2
    I don't agree with the suggestion to learn C++ instead of C. Writing good C++ is harder than writing good C and reading C++ is much harder than reading C. So the learning curve of C++ is much steeper. "C++ is a super set of C" This is more or less like saying that boots are a superset of slippers. They have different advantages and usage and each one has features that the other doesn't. – martinkunev Jun 10 '16 at 17:50
  • "Writing good C++ is harder than writing good C" - Absolutely. =) "[R]eading C++ is much harder than reading C" - Any advanced programming is likely indistinguishable from magic ;-) My two cents is that this is much more programmer dependent than language dependent, though C++ does nothing much to help itself in this category. "So the learning curve of C++ is much steeper." - In the long run, yes. In the short term, less so (my opinion). Anecdotally, most basic language courses in C and C++ are likely to cover roughly the same general types of material, excepting classes for C++. – Anaksunaman Jun 11 '16 at 14:33
  • 2
    "They have different advantages and usage and each one has features that the other doesn't." - As mentioned "There is no specific reason not to learn C[.]" C is a fine language and I stick by that. If it suits OP or anyone else, I fully support learning it. =) – Anaksunaman Jun 11 '16 at 14:39
  • Learning C teaches you how the machine works (not all the way down, but still a step closer to the metal). That's a very good reason for learning it. – Agent_L Jun 13 '16 at 09:01
1

As always, the programming language is only a consequence of problem solving. You should in fact learn not just C but many different languages (and other ways of programming a computer, be they GUI tools or command interpreters) to have a decent toolbox to draw on when solving problems.

Sometimes you will find that a problem lends itself well to something that is included in the Java default libraries; in such a case you may choose Java to leverage that. In other cases you may need to do something on Windows that is a lot simpler in the .NET runtime, so you may use C# or VB. There could be a graphical tool or command script that solves your problem, in which case you may use those. Maybe you need to write a GUI application for multiple platforms; Java could be an option, given the libraries included in the JDK, but one target platform may lack a JRE, so maybe you instead choose C and SDL (or similar).

C has an important position in this toolset, as it is general, small and fast, and compiles to machine code. It is also supported on every platform under the sun (though not without recompiling).

Bottom line is, you should learn as many tools, languages and paradigms as you possibly can.

Please get away from the mindset: "I am a X programmer" (X=C, C++, Java, etc.)

Just use "I am a programmer".

A programmer solves problems and designs algorithms by instructing machines to perform the workload. End of story. This is largely independent of the language. Your most important skill is problem solving and the logical breakdown of structured problems; language skill/choice is ALWAYS secondary and/or a consequence of the nature of the problem.

An interesting path if you are interested in C is to extend your skillset with Go. Go is essentially an improved C, with garbage collection, interfaces, and a nice built-in threading model with channels, while retaining many of the benefits of C (such as pointers and compilation to machine code).

0

It depends on what you intend to do with it. C was designed as a replacement for assembly language and is the high-level language closest to machine language. It thus has low size and performance overheads, and it is suitable for systems programming and other tasks that require a small footprint and getting close to the underlying hardware.

0

When you're working at the level of bits and bytes, of memory as a raw, homogeneous collection of data, as is often required to implement the most efficient allocators and data structures, there is no safety to be had. Safety is predominantly a concept tied to strong data typing, and a memory allocator doesn't work with data types. It pools out bits and bytes, with those same bits and bytes potentially representing one data type one moment and another later on.
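
To illustrate, here is a minimal sketch of a bump allocator (hypothetical; real allocators are far more involved): it hands out raw, untyped bytes, which is exactly the kind of code that has no type safety to lean on.

#include <stddef.h>

/* A fixed pool of raw bytes, aligned for any scalar type (C11). */
static _Alignas(max_align_t) unsigned char pool[4096];
static size_t used = 0;

void *bump_alloc(size_t size)
{
    /* Round each request up so the next allocation stays aligned
       (assumes the alignment is a power of two, as it is in practice). */
    size_t align = _Alignof(max_align_t);
    size_t rounded = (size + align - 1) & ~(align - 1);

    if (rounded > sizeof pool - used)
        return NULL;           /* pool exhausted */

    void *p = &pool[used];     /* no notion of what type these bytes will hold */
    used += rounded;
    return p;
}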

It doesn't matter if you use C++ in that case. You'd still be sprinkling static_casts all over the code to convert from void* pointers, still working with bits and bytes, and just dealing with more hassle in respecting the type system in this context than in C, whose much simpler type system leaves you free to memcpy bits and bytes around without worrying about bulldozing over it.

In fact it's often harder to work in C++, an overall safer language, in such low-level contexts of bits and bytes without writing even more dangerous code than you would in C, since you could be bulldozing over C++'s type system - overwriting vptrs, failing to invoke copy constructors and destructors at the appropriate times. If you take the time to respect these types, using placement new and manually invoking destructors and so forth, you then get exposed to the world of exception handling in a context too low-level for RAII to be practical. Achieving exception safety at that level is very difficult: you have to pretend that just about any function can throw, catch all possibilities, and roll back any side effects as an indivisible transaction, as though nothing happened. C code can often "safely" assume that any data type instantiated in C can be treated as just bits and bytes without violating the type system, invoking undefined behavior, or running into exceptions.

And it would be impossible to implement such allocators in languages that don't allow you to get "dangerous" here; you'd have to lean on whatever allocators they provide (most likely implemented in C or C++) and hope they are good enough for your purposes. Yet there are almost always more efficient but less general allocators and data structures available for a specific purpose - much more narrowly applicable precisely because they're tailored to it.

Most people don't need the likes of C or C++, since they can just call code originally implemented in C or C++ (or possibly even assembly) that already exists. Many might benefit from innovating at the high level - say, stringing together an image program from libraries of existing image-processing functions already implemented in C, not innovating at the lowest level of looping through individual pixels, but perhaps offering a very friendly user interface and workflow never seen before. In that case, if the point of the software is just to make high-level calls into low-level libraries ("process this entire image for me", not "do something for each pixel"), then it might arguably be a premature optimization to even attempt to start writing such an application in C.

But if you're doing something new at the low level, where it helps to access data in a low-level way - say, a brand new image filter fast enough to work on HD video in real time - then you generally have to get a little bit dangerous.

It's easy to take this stuff for granted. I remember a Facebook post pointing out that it's feasible to create a 3D video game in Python, with the implication that low-level languages are becoming obsolete - and it was certainly a decent-looking game. But Python was making high-level calls into libraries implemented in C to do all the heavy lifting. You can't make Unreal Engine 4 by just making high-level calls into existing libraries: Unreal Engine 4 *is* the library. It did all kinds of things that never existed in other libraries and engines, from lighting to its nodal blueprint system and its ability to compile and run code on the fly. If you want to innovate at that kind of low engine/core/kernel level, then you have to get low-level. If all game devs switched to high-level safe languages, there would be no Unreal Engine 5, or 6, or 7. It would likely be people still using Unreal Engine 4 decades later, because you can't innovate at the level required for a next-gen engine by just making high-level calls into the old one.