77

Ever tried to sum up all numbers from 1 to 2,000,000 in your favorite programming language? The result is easy to calculate manually: 2,000,001,000,000, which is some 900 times larger than the maximum value of a signed 32-bit integer.

C# prints out -1453759936 - a negative value! And I guess Java does the same.

That means there are some common programming languages which ignore arithmetic overflow by default (in C#, there are hidden options for changing that). That behavior looks very risky to me, and wasn't the crash of Ariane 5 caused by such an overflow?
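For reference, a minimal C# sketch reproducing both behaviors (2,000,001,000,000 modulo 2^32, reinterpreted as a signed int, is exactly -1453759936):

int sum = 0;
for (int i = 1; i <= 2000000; i++)
    sum += i;               // wraps silently in C#'s default unchecked context
Console.WriteLine(sum);     // prints -1453759936

checked
{
    int sum2 = 0;
    for (int i = 1; i <= 2000000; i++)
        sum2 += i;          // throws System.OverflowException instead
}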

So: what are the design decisions behind such a dangerous behavior?

Edit:

The first answers to this question cite the excessive cost of checking. Let's run a short C# program to test this assumption:

Stopwatch watch = Stopwatch.StartNew();   // requires using System.Diagnostics;
checked   // change `checked` to `unchecked` for the comparison run
{
    for (int i = 0; i < 200000; i++)
    {
        int sum = 0;
        for (int j = 1; j < 50000; j++)
        {
            sum += j;   // roughly 10,000,000,000 additions in total
        }
    }
}
watch.Stop();
Console.WriteLine(watch.Elapsed.TotalMilliseconds);

On my machine, the checked version takes 11015 ms, while the unchecked version takes 4125 ms. I.e. the checking steps take almost twice as long as adding the numbers (almost 3 times the original time in total). But spread over the 10,000,000,000 repetitions, the time taken by a single check is still less than 1 nanosecond. There may be situations where that is important, but for most applications, that won't matter.

Edit 2:

I recompiled our server application (a Windows service analyzing data received from several sensors, quite some number crunching involved) with the /p:CheckForOverflowUnderflow="false" parameter (normally, I switch the overflow check on) and deployed it on a device. Nagios monitoring shows that the average CPU load stayed at 17%.

This means that the performance hit found in the made-up example above is totally irrelevant for our application.

Bernhard Hiller
  • 20
    Just as a note: for C# you can use a `checked { }` section to mark the parts of the code that should perform arithmetic overflow checks. The default is unchecked for performance reasons. – Paweł Łukasik May 08 '17 at 11:09
  • 15
    "Ever tried to sum up all numbers from 1 to 2,000,000 in your favorite programming language?" – Yes: `(1..2_000_000).sum #=> 2000001000000`. Another one of my favorite languages: `sum [1 .. 2000000] --=> 2000001000000`. Not my favorite: `Array.from({length: 2000001}, (v, k) => k).reduce((acc, el) => acc + el) //=> 2000001000000`. (To be fair, the last one is cheating.) – Jörg W Mittag May 08 '17 at 11:28
  • 6
    @JörgWMittag Well, when your favorite languages use 64-bit integers, you'll need to sum up to some bigger number (or better, multiply the numbers) to see how they deal with arithmetic overflow. – Bernhard Hiller May 08 '17 at 12:46
  • 27
    @BernhardHiller `Integer` in Haskell is arbitrary-precision, it will hold any number as long as you don't run out of allocatable RAM. – Polygnome May 08 '17 at 13:27
  • @harold would "Why do some languages not treat integer overflow as an error" be better? – Captain Man May 08 '17 at 15:03
  • 51
    The Ariane 5 crash was caused by checking for an overflow that didn't matter - the rocket was in a part of the flight where the result of a calculation wasn't even needed any more. Instead, the overflow was detected, and that caused the flight to abort. – Simon B May 08 '17 at 15:16
  • 2
    @BernhardHiller: My favorite language uses Integers. There will be no calculation on Integers that will lead to an overflow, ever. – Jörg W Mittag May 08 '17 at 15:27
  • 9
    `But with the 10,000,000,000 repetitions, the time taken by a check is still less than 1 nanosecond.` that's an indication of the loop being optimized out. Also that sentence contradicts previous numbers which appear very valid to me. – usr May 08 '17 at 15:40
  • 2
    @usr I'm fairly sure that the OP meant *per check* rather than for the checking overall. – Andrew Morton May 08 '17 at 17:56
  • 5
    I note also that `checked` is the default in C# when all the operands are constants. – Eric Lippert May 08 '17 at 19:38
  • Some additional links for people to consider. (1) https://blog.regehr.org/archives/1384 (2) https://danluu.com/integer-overflow/ (3) http://huonw.github.io/blog/2016/04/myths-and-legends-about-integer-overflow-in-rust/ – rwong May 08 '17 at 19:49
  • @EricLippert The fact that it is applied to constants is, IMHO, [an astonishment](https://en.wikipedia.org/wiki/Principle_of_least_astonishment), because of the mere fact that it is applied to constants and not to other things. Meanwhile, it would be more useful to mention [the project-wide checked arithmetic switch](http://stackoverflow.com/questions/4878548/c-sharp-overflow-not-working-how-to-enable-overflow-checking) in the context of this question. Just in case someone tried that switch and find out that things are broken, that in itself is an answer to this question: legacy code. – rwong May 08 '17 at 19:56
  • 6
    "In C# there are hidden options" - only hidden for those too i******t to bother learning the language. The specs are pretty clear. Hidden in plain sight. Hidden in such a way that this is a good question to ask a for a junior developer position. – TomTom May 09 '17 at 07:42
  • 5
    [Java8](https://docs.oracle.com/javase/8/docs/api/java/lang/Math.html#addExact-int-int-) has methods for overflow checking. `a = Math.addExact(b, c)` looks _really_ clumsy to me compared to `a = b + c` but it does exist. – JollyJoker May 09 '17 at 07:58
  • 3
    If some function takes 4 seconds to execute and performs just one integer addition in all that time, then the added cost of checking that addition for overflow is negligible. But if a function performs some operation billions of times, and you increase the cost of that operation by a factor of 3, that's expensive. – David K May 09 '17 at 12:57
  • 1
    @David Checking for integer overflow on x86 is incredibly cheap. Far from factor 3 of anything even assuming a very conservative specification. The situations where this would actually make a difference are so incredibly rare that not having it as a default (that you can disable in your inner vectorized loop) is just bad planning or premature optimisation. – Voo May 09 '17 at 20:36
  • 3
    @Voo The question demonstrated a slowdown by nearly a factor of 3 (11 seconds vs. 4) in an application that made heavy use of integer arithmetic. The lack of _any_ check in C (and the default non-checking in C#) may be due more to history than to best practices with today's compiler technology, but the slowdown claimed as justification for those decisions is based on evidence, not on an unrealistic estimate. – David K May 09 '17 at 20:55
  • @David The given code is not really representative of actual programs. Also the only thing this shows if you look at the generated code is that since nobody uses checked code nobody has spent any time in the CLR optimising for it. Seriously, the generated code is just plain abysmal (although the unchecked code is pretty disappointing as well - C# just isn't a language you want to do high performance math in). I mean it generates `xor edx, edx; lea eax, [edx+1]` (not making this up). If checked code was the default the generated code would look a great deal different. – Voo May 09 '17 at 21:22
  • 2
    @Voo Of course the benchmark is artificial, but half a lifetime ago, when I still wrote C, integer arithmetic was most of what I wrote, much like the benchmark but with a lot more lines of code. As I already agreed, we _could_ do better. It's a very legitimate question to ask why we don't. (But I see an answer was just posted saying that Swift _does._) The only reason I commented at all was because the tone of the question seemed a bit too dismissive of performance concerns. – David K May 09 '17 at 21:31
  • @EricLippert Exactly such constant assignments like `int ERROR_1234 = 0x800D1234;` which require an `unchecked` were the reason why I used to take arithmetic overflow checking in C# for granted (till I stumbled upon some "unexpected behavior" some 3 or 4 years ago...). – Bernhard Hiller May 10 '17 at 08:10
  • 2
    @Voo The majority of processors in existence are not PCs - they're controlling consumer equipment where saving a few cents per item is significant. If your processing loop takes less time to run then you may be able to use a lower-spec processor, and that directly improves your profit margin. Alternatively for my personal situation, I've spent my career writing software for the electronics to control real physical things (notably car engines and national grid systems), and again there a 3x slow-down on my control loop is not acceptable. – Graham May 10 '17 at 10:17
  • 1
    “Ever tried to sum up all numbers from 1 to 2,000,000 in your favorite programming language?” — Who _hasn’t?!_ – Paul D. Waite May 10 '17 at 15:14
  • @rwong: Given that integer literals are not as conspicuously typed as variables, I would consider it more astonishing that changing `5*10000000000` to `4*1000000000` would silently result in a compiler using a value of -294967296, than for a compiler to squawk at the latter expression and require it to be written as `unchecked(4*1000000000)` if the programmer actually wants the latter meaning. – supercat May 10 '17 at 15:36
  • 1
    I thought this question was going to be about Arithmetic Stack Overflow and why everyone posts on Mathematics Stack Overflow instead. – CJ Dennis May 11 '17 at 02:45
  • 1
    At the risk of starting a language flame war: my favourite language is Python, and Python has arbitrary-length integers. I think the OP's 'favourite language' statement needs some (maybe a lot of) caveats. – Tony Suffolk 66 May 11 '17 at 07:35
  • The whitepaper for the *self* programming language advocates strong primitives (with checking) and shows that the performance is good. That didn’t catch on, though. – JDługosz May 11 '17 at 09:19

14 Answers

86

There are 3 reasons for this:

  1. The cost of checking for overflows (for every single arithmetic operation) at run-time is excessive.

  2. The complexity of proving that an overflow check can be omitted at compile-time is excessive.

  3. In some cases (e.g. CRC calculations, big number libraries, etc) "wrap on overflow" is more convenient for programmers.
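A minimal C# sketch of the third point, using an FNV-1a-style hash (the method name is made up) where wrapping is exactly what the programmer wants; the `unchecked` block is also what allows the out-of-range constant cast to compile:

static int Fnv1aHash(byte[] data)
{
    unchecked
    {
        int hash = (int)2166136261;   // offset basis; the constant cast needs an unchecked context
        foreach (byte b in data)
        {
            hash ^= b;
            hash *= 16777619;         // FNV prime; intentional two's-complement wrapping
        }
        return hash;
    }
}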

Toby Speight
Brendan
  • 3
    The costs of run-time checking are high compared to the calculation proper, but I won't call that excessive. I'd prefer to use a `checked` statement (C# syntax, don't know for other languages) for special tasks like checksums. – Bernhard Hiller May 08 '17 at 12:50
  • 4
    there's no keyword for overflow checking in C and C++. It must be done manually by the programmer if needed, and sometimes can be done automatically with a compiler flag – phuclv May 08 '17 at 15:38
  • "wrap on overflow" functionality could still be accessible using appropriate type. `unsigned int` comes to mind. – Dmitry Grigoryev May 08 '17 at 16:56
  • 10
    @DmitryGrigoryev `unsigned int` shouldn't come to mind because a language with overflow checking should be checking *all* integer types by default. You should have to write `wrapping unsigned int`. – user253751 May 08 '17 at 21:07
  • 34
    I don't buy the cost argument. The CPU does check overflow on EVERY SINGLE integer calculation and sets the carry flag in the ALU. It's the programming language support that's missing. A simple `didOverflow()` inline function or even a global variable `__carry` that allows access to the carry flag would cost zero CPU time if you don't use it. – slebetman May 09 '17 at 02:24
  • 2
    @Bernhard It would be better to do the checks by default and have an `unchecked` block for times when the programmer wants to turn them off. – David Conrad May 09 '17 at 08:30
  • 39
    @slebetman: That's x86. ARM does not. E.g. `ADD` doesn't set the carry (you need `ADDS`). Itanium doesn't even _have_ a carry flag. And even on x86, AVX doesn't have carry flags. – MSalters May 09 '17 at 08:40
  • 3
    @slebetman MIPS doesn't have any kind of flags. The user has to calculate overflow or carry flag manually if he wants to use it. That makes bigint maths on MIPS a little painful – phuclv May 09 '17 at 10:05
  • 31
    @slebetman It sets the carry flag, yes (on x86, mind you). But then you have to read the carry flag and decide on the result - that's the expensive part. Since arithmetic operations are often used in loops (and tight loops at that), this can easily prevent many safe compiler optimizations that can have a very big impact on performance even if you only needed one extra instruction (and you need a lot more than that). Does it mean it should be the default? Maybe, especially in a language like C# where saying `unchecked` is easy enough; but you might be overestimating how often overflow matters. – Luaan May 09 '17 at 10:15
  • @immibis It doesn't really matter how the type is called, my point is that the wrap functionality can still be easily accessible with overflow checking enabled. I was only giving an example from `C`, not suggesting a good name. – Dmitry Grigoryev May 09 '17 at 10:53
  • 12
    ARM's `adds` is the same price as `add` (it's just a 1-bit flag in the instruction that selects whether the condition flags are updated). MIPS's `add` instruction traps on overflow - you have to *ask* not to trap on overflow by using `addu` instead! – user253751 May 09 '17 at 10:58
  • 2
    x86 sets the carry flag... and powerful modern compilers like GCC will recognize standard overflow-checking idioms and will *use* the carry flag on platforms where it exists to "do what you mean" regardless of whether C and friends expose it at a language level. Try `unsigned int add_checked(unsigned int x, unsigned int y) { unsigned int z = x + y; return z < x ? 0 : z; }` with `-Os`. – Alex Celeste May 09 '17 at 18:11
  • 6
    @slebetman Not quite right. Modern CPUs have deep self-optimizing instruction pipelines. If you don't read carry, the CPU might not even perform the check! Even if it does, reading the flag results in a data dependency between instructions, thwarting computation parallelism. A modern CPU has few full-featured cores, but each has lots of independent computation units that parallelize operations within single-threaded execution. Unlike bounds checks (optimized out via prediction), carry checks *really do hurt* – user1643723 May 09 '17 at 18:16
  • 3
    @user1643723 Which hardware adder are you thinking of where the overflow bit is not an automatic consequence of the implementation? That strikes me as incredibly.. unlikely to exist. And that overflow branch is trivially easy to predict (otherwise something's clearly wrong) so won't have any appreciable effect on performance. Can you show a simple realistic code sample (c or assembly) without vectorisation (for that other rules apply) where adding overflow checks has a large performance impact? I can't imagine what that would be. – Voo May 09 '17 at 20:42
  • @Voo: My answer gives examples of situations where overflow checking is expensive. – supercat May 09 '17 at 22:42
  • @Voo "Which hardware adder are you thinking of where the overflow bit is not an automatic consequence of the implementation?" — no idea, maybe Intel's? See this patent: https://www.google.com/patents/US6049864. Of course, it is not the adder itself doing the magic, but rather the complex CPU machinery, optimizing its operation to make use of the fact that the carry flag is not checked. And since the flag value is never observed, it might never be set in the first place… CPUs have limited ability to make such optimizations (less so than compilers), but it is clearly not impossible. – user1643723 May 10 '17 at 11:20
  • @user1643723 In your abstract "The first instruction, when executed, generates a result **and intermediate flag generation data**". On first glance that patent deals with how to avoid unnecessary data dependencies when using the overflow bit from the ALU to set the flags register (and yes as I said there are many possible optimisations to deal with possible data dependencies - this being one of them). – Voo May 10 '17 at 12:26
  • 2
    (It's quite easy to see why the overflow flag is a trivial byproduct of any adder algorithm: you need some way to generalize a n-bit adder to a (n+1) adder and the overflow bit of your n-bit adder is necessary information for your n+1 bit adder). – Voo May 10 '17 at 12:27
  • @Voo: There are a number of processor platforms, including MIPS, which don't have CPU flags, and where the process required to perform two-word arithmetic is not efficiently extensible. Essentially, the logic for a two-word `a+=b` is `a.l+=b.l; temp= (a.l < b.l); temp+=b.h; a.h+=temp;` If `a` is 0x80000000FFFFFFFF and `b` is 0x7FFFFFFF00000001, then adding `b.h` to `temp` will "overflow" when setting it to `0x80000000`, and adding that to `a.h` will cause another "overflow" even though the addition actually yielded a correct sum of 0. – supercat May 10 '17 at 15:10
65

Who says it's a bad tradeoff?!

I run all of my production apps with overflow checking enabled. This is a C# compiler option. I actually benchmarked this and I was not able to detect a difference. The cost of accessing the database to generate (non-toy) HTML overshadows the overflow checking costs.

I do appreciate the fact that I know that no operations overflow in production. Almost all code would behave erratically in the presence of overflows. The bugs would not be benign. Data corruption is likely, security issues a possibility.

In case I need the performance, which is sometimes the case, I disable overflow checking using unchecked {} on a granular basis. When I want to call out that I rely on an operation not overflowing I might redundantly add checked {} to the code to document that fact. I am mindful of overflows but I don't necessarily need to be thanks to the checking.
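A sketch of that style, assuming the project-wide checked setting is switched on (the variable names here are made up):

unchecked
{
    // wrapping is fine (even desirable) for a hash code
    int hashCode = (id * 397) ^ name.GetHashCode();
}

checked
{
    // redundant under project-wide checking, but documents the invariant
    // that this multiplication must stay within int range
    int totalBytes = blockCount * blockSize;
}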

I believe the C# team made the wrong choice when they chose not to check overflow by default, but that choice is now locked in due to strong compatibility concerns. Note that this choice was made around the year 2000. Hardware was less capable, and .NET did not have a lot of traction yet. Maybe .NET wanted to appeal to Java and C/C++ programmers in this way. .NET is also meant to be able to get close to the metal. That's why it has unsafe code, structs and great native-call abilities, all of which Java does not have.

The faster our hardware gets and the smarter our compilers get, the more attractive overflow checking by default becomes.

I also believe that overflow checking is often better than infinitely sized numbers. Infinitely sized numbers have a performance cost that is even higher, harder to optimize (I believe) and they open up the possibility of unbounded resource consumption.

JavaScript's way of dealing with overflow is even worse. JavaScript numbers are floating point doubles. An "overflow" manifests itself as leaving the fully precise set of integers. Slightly wrong results will occur (such as being off by one - this can turn finite loops into infinite ones).
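The effect is easy to reproduce from C#, where `double` is the same IEEE 754 type (a minimal sketch):

double d = 9007199254740992;    // 2^53, the last double at which integers are still contiguous
Console.WriteLine(d + 1 == d);  // True: the increment is silently rounded away
// hence a loop like `for (double x = d; x < d + 2; x++)` never terminates,
// because x++ no longer changes x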

For some languages such as C/C++, overflow checking by default is clearly inappropriate because the kinds of applications that are being written in these languages need bare-metal performance. Still, there are efforts to make C/C++ into safer languages by allowing programmers to opt in to a safer mode. This is commendable since 90-99% of code tends to be cold. An example is the `fwrapv` compiler option that forces 2's-complement wrapping. This is a "quality of implementation" feature of the compiler, not of the language.

Haskell has no logical call stack and no specified evaluation order. This makes exceptions occur at unpredictable points. In `a + b` it is unspecified whether `a` or `b` is evaluated first and whether those expressions terminate at all or not. Therefore, it makes sense for Haskell to use unbounded integers most of the time. This choice is suited to a purely functional language because exceptions are really inappropriate in most Haskell code. And division by zero is indeed a problematic point in Haskell's language design. Instead of unbounded integers they could have used fixed-width wrapping integers as well, but that does not fit with the "focus on correctness" theme that the language features.

An alternative to overflow exceptions are poison values that are created by undefined operations and propagate through operations (like the float NaN value). That seems far more expensive than overflow checking and makes all operations slower, not just the ones that can fail (barring hardware acceleration which floats commonly have and ints commonly do not have - although Itanium has NaT which is "Not a Thing"). I also do not quite see the point of making the program continue to limp along with bad data. It's like ON ERROR RESUME NEXT. It hides errors but does not help get correct results. supercat points out that it's sometimes a performance optimization to do this.
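For comparison, a minimal C# sketch of how the floating-point version of such poison values behaves:

double poison = 0.0 / 0.0;                   // NaN instead of an exception
double result = (poison * 2.0 + 1.0) / 3.0;  // the NaN propagates through every operation
Console.WriteLine(double.IsNaN(result));     // True
Console.WriteLine(poison == poison);         // False: NaN compares unequal even to itself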

usr
  • 2
    Excellent answer. So what's your theory about why they decided to do it that way? Just copying everyone else who copied C and ultimately assembly and binary? – jpmc26 May 08 '17 at 23:32
  • 20
    When 99% of your user base expects a behavior, you tend to give it to them. And as for "copying C," it actually isn't a copy of C, but an extension of it. C guarantees exception free behavior for `unsigned` integers only. The behavior of signed integer overflow is actually undefined behavior in C and C++. Yes, *undefined behavior*. It just so happens that nearly everyone implements it as 2's complement overflow. C# actually makes it official, rather than leaving it UB like C/C++ – Cort Ammon May 08 '17 at 23:51
  • 12
    @CortAmmon: The language Dennis Ritchie designed had defined wraparound behavior for signed integers, but wasn't really suitable for use on non-two's-complement platforms. While allowing certain deviations from precise two's-complement wraparound can greatly assist some optimizations (e.g. allowing a compiler to replace x*y/y with x could save a multiplication and a division), compiler writers have interpreted Undefined Behavior not as an opportunity to do what makes sense for a given target platform and application field, but rather as an opportunity to throw sense out the window. – supercat May 09 '17 at 00:21
  • @supercat I am aware of one compiler which "threw sense out the window" regarding pointer overflow for one version, before public outcry forced them to undo the optimization. – Cort Ammon May 09 '17 at 00:28
  • 3
    @CortAmmon - Check the code generated by `gcc -O2` for `x + 1 > x` (where `x` is an `int`). Also see https://gcc.gnu.org/onlinedocs/gcc-6.3.0/gcc/Optimize-Options.html#index-fstrict-overflow-873 . 2s-complement behavior on signed overflow in C is *optional*, even in real compilers, and `gcc` defaults to ignoring it in normal optimization levels. – Jonathan Cast May 09 '17 at 05:06
  • @jpmc26 This certainly isn't "copied from assembly and binary" - I've done my share of assembly programming (though not on any really big project), and we were always quite explicit about where overflow checking is necessary and where it's a bug. C doesn't guarantee this behaviour either, though given how many people take undefined behaviour for granted, it may be one source of this. Of course, the vast majority of arithmetic operations that are running on your CPU don't need overflow checks, but that's often because they are in loops - it would be easy to just use `unchecked { ... }` there. – Luaan May 09 '17 at 10:22
  • 1
    @jcast: I don't have any objection to a compiler given `int x` optimizing `x+1 > x` as `1`. To my mind it would be reasonable for quality compilers to treat the situation as analogous to floating-point math on implementations that define `FLT_EVAL_METHOD` as a negative number: temporary values of type `float` may be processed either as a `float`, or at any higher precision as the compiler sees fit, but casts to `float` must yield a value representable as that type. If code needs wrapping behavior, I would consider `(int)(x+1) > y` as more human-readable than `x+1 > y`... – supercat May 09 '17 at 14:33
  • ...and so I would not fault a compiler which arbitrarily evaluated `x+1` as `(long long)x+1LL` at its leisure, provided that `(int)(x+1) > y` would yield wrapping behavior in any case (note that `(int)(x+1LL)` would yield an Implementation-Defined value in the range of `int`, but on most systems the only practical definition would be wrapping). To my mind, a quality implementation should make it easy to get wrapping behavior, or easy to get behavior which isn't guaranteed to wrap but is guaranteed to stay on the rails (e.g. yield 0 or 1 in arbitrary fashion, with no side-effects). – supercat May 09 '17 at 14:38
  • @jcast: One thing compiler writers seem to miss is that a lot of code can get by with weak guarantees. Many applications have three requirements: "When given valid data, yield valid results; when given invalid data, yield arbitrary results; don't launch nuclear missiles in any case". In most cases the the third requirement should be the cheapest, but even in those cases "modern C" often makes it the most expensive one for a programmer to guarantee. – supercat May 09 '17 at 14:59
  • @jpmc26 I have added some notes on that. It is not a stupid mistake by them. – usr May 09 '17 at 15:19
  • 3
    @supercat Yeah, most C compiler writers are more interested in making sure some unrealistic benchmark runs 0.5% faster than trying to provide reasonable semantics to programmers (yes I understand why it's not an easy problem to solve and there are some reasonable optimisations that can cause unexpected results when combined, yada, yada but still it's just no focus and you notice it if you follow the conversations). [Luckily there are some people who try to do better](http://blog.regehr.org/archives/1496). – Voo May 09 '17 at 20:51
  • @Voo: I think you're misreading that. By my reading, the point is to allow more places where a compiler can exploit UB by allowing a compiler to hoist loop-invariant code that would invoke UB if a loop were executed at least once. To my mind, LLVM's treatment of undef should be replaced with a proper concept of non-deterministic values. If `x` is genuinely Indeterminate, then after `x &= 5` it should non-deterministically hold one of [0,1,4,5], and if that is followed by `x+=x` it should non-deterministically hold `[0,1,2,4,5,6,8,9,10]`. While that may sound like a lot of work... – supercat May 09 '17 at 22:17
  • ...for a compiler, the primary thing a compiler would gain from non-determinism is the ability to defer computations. Tweaking the example to use distinct variable names, `y=x&5; z=y+y;`, a compiler could evaluate `y` as 0,1, or 2, or as "x & 5". Likewise, `z` could be 0, 1, 2, 4, 5, 6, 8, 9, 10, (x&5), (x&5)+1, (x&5)+4, (x&5)+5, or (x&5)+(x&5). If each evaluation of `x` could yield an independent integer, each evaluation of `z` could yield a result independently chosen from the aforementioned set, but the set of possible values would be constrained. – supercat May 09 '17 at 22:27
  • @usr: I'm not familiar with Haskell, but I would think a functional language could accommodate an "integer NaN" type more efficiently than arbitrary-sized integers. – supercat May 09 '17 at 23:27
  • C/C++ signed integer overflow behaviour is implementation defined behaviour rather than undefined behaviour. It can be forced to 2's complement by passing the 'fwrapv' compiler flag. – James May 10 '17 at 09:45
  • @supercat that's a nice idea. I amended my answer. – usr May 10 '17 at 10:19
  • If you are running an app which access a database to generate HTML, your code isn't the case which skipping the overflow check was meant to optimize, and certainly isn't the type of code the designers of C or of the x86 architecture were thinking about. – jwg May 10 '17 at 10:42
  • @jwg do we not agree on that? – usr May 10 '17 at 13:04
  • @usr: Requiring that a loop exit as soon as bad data is detected will preclude parallelism. If one will not be concerned about performance in the overflow case, but merely concerned with ensuring that bogus results don't get mistaken for good ones, NaN will work very nicely. It's also a lot cheaper than arbitrary-precision arithmetic and won't overload the system. – supercat May 10 '17 at 14:11
  • @James: It is behavior upon which the Standard imposed no requirements (so from the Standard's point of view it is UB), but the authors of the Standard expected commonplace implementations to define it in at least some cases whether the Standard required them to or not (read the rationale for C89 3.2.1.1, starting with the words "...and in even more cases" in the fourth paragraph, which observes describes situations where implementations would not be required to behave consistently, but commonplace implementations did anyhow). – supercat May 10 '17 at 14:37
  • @supercat good point, I added something about that. – usr May 10 '17 at 16:25
  • Just a tangent, but this is incorrect: ".NET wanted to appeal to Java programmers (that's why arrays with reference type elements are variant which is a curse)." It has been commented by members of the C# team that array reference types work the way they do because .NET 1.0 didn't support generics, and without generic support you essentially have to have arrays of reference types work like that, because otherwise there are important data structures that cannot be implemented. Java compatibility wasn't considered; .NET made the same decision for the same reason Java did previously. – Jules May 11 '17 at 08:58
  • @Jules super interesting, I edited to match that. – usr May 11 '17 at 09:34
  • @Jules: Even with generics, the only way to allow things like array sorting without requiring array quasi-covariance is to have arrays implement a type-agnostic interface (e.g. `IPermutable`) with a method that can swap elements without requiring that they be given to or received back from the calling code. – supercat May 11 '17 at 19:56
  • @supercat - I think the theory is that if .NET 1 had generics, reference arrays would have been a special syntax for something like `System.Array<T>`, so array sorting would have a type that looks like `void Sort<T>(Array<T> array)`. In other words, arrays wouldn't be covariant, but methods operating on them would potentially have to be. – Jules May 12 '17 at 07:59
  • 1
    @Jules: Interfaces that can read out elements from a collection can be covariant, but not contravariant. Interfaces that can write elements into a collection can be contravariant but not covariant. Interfaces that can do both are invariant. I haven't seen any popular collection interfaces that can permute array items without reading or writing them, but such interfaces would be needed to facilitate generic routines that could sort, shuffle, or otherwise rearrange the elements in a collection. Note that a permutation interface could have a method `swap(int x, int y)` without the caller... – supercat May 12 '17 at 14:24
  • 1
    ...having to know or care about the type of the elements involved. so the fact that an interface allowed such operations would not prevent it from being covariant or contravariant. – supercat May 12 '17 at 14:25
  • @supercat .... I guess I'm going to have to go and try it, but I don't get why either covariance or covariance in the array interface is required here, but just plain old subtyping. – Jules May 12 '17 at 15:45
  • 1
    @Jules: The basic issue is that it's helpful to have a function that can e.g. sort a list of `Animal` based on some criterion be likewise capable of sorting a list of `Cat` based on the same criterion. The notion of covariance says that code needing something from which it can read out `Animal`s should be able to accept something that can read out `Cat`s. Contravariance says that code needing something to which it can write `Cat` should be satisfied with something that can accept all objects of type `Animal`. – supercat May 12 '17 at 18:41
31

Because it's a bad trade-off to make all calculations a lot more expensive in order to automatically catch the rare case that an overflow does occur. It's much better to burden the programmer with recognizing the rare cases where this is an issue and adding special precautions than to make all programmers pay the price for functionality that they don't use.

Kilian Foth
  • 28
    That's somewhat like saying that checks for buffer overflow should be omitted because they hardly ever occur... – Bernhard Hiller May 08 '17 at 12:48
  • 74
    @BernhardHiller: and that's exactly what C and C++ do. – Michael Borgwardt May 08 '17 at 13:08
  • 6
    @MichaelBorgwardt C# (and Java, I guess), checks buffer overflows, but not integer overflows. Why? What makes one worth checking for, but not the other? – svick May 08 '17 at 17:13
  • 3
    @svick because buffer overflows lead to security vulnerabilities – David Brown May 08 '17 at 18:07
  • 14
    @DavidBrown: As do arithmetic overflows. The former do not compromise the VM though. – Deduplicator May 08 '17 at 18:41
  • 36
    @Deduplicator makes an excellent point. The CLR was carefully designed so that verifiable programs cannot violate *the invariants of the runtime* even when bad stuff happens. Safe programs can of course violate *their own* invariants when bad stuff happens. – Eric Lippert May 08 '17 at 19:37
  • 7
    @svick Arithmetic operations are probably far more common than array indexing operations. And most integer sizes are large enough that it's very rare to perform arithmetic that overflows. So the cost-benefit ratios are very different. – Barmar May 09 '17 at 00:07
  • 1
    @Barmar Also a good point. Especially given how common "off-by-one" errors are - very dangerous with indexing, much less so with arithmetic overflow. – Luaan May 09 '17 at 10:17
  • @Barmar: Use of larger data types will make overflows with valid data uncommon, but overflows can still easily be triggered by invalid data. When the only consequence of overflows caused by invalid data is the production of meaningless results, that's not really a problem. When "smart" compilers use UB as an excuse to bypass safety checks, that becomes a big problem. Given the requirements "1. When given valid data, give valid results; 2. When given invalid data, produce arbitrary results; 3. Don't launch nuclear missiles in any case", the third requirement shouldn't be the most expensive. – supercat May 09 '17 at 23:33
  • @supercat I like to hope that the programmers of nuclear launch controllers are more careful than cash register programmers -- either using languages that do more automatic checks, or more stringent quality control of the software. Probably the latter, since the real-time nature of the application and the ancient H/W they're probably using may preclude automatic checks of every operation. – Barmar May 10 '17 at 14:24
  • @Barmar: A major requirement for such systems is the avoidance of "optimization". Even if the only way `x` could equal any value other than 23 would be if something like ionizing radiation flipped a bit in memory when it wasn't supposed to, code like `if (x != 23) abort_launch();` should not be removed as redundant. I would regard techniques like UB-based dead-branch elimination as unsuitable for use in any kind of safety-related or security code, since in many cases such code exists to ensure that things that "can't" happen by any defined means, don't happen via some *other* means. – supercat May 10 '17 at 15:17
  • @supercat There are other techniques used in "failure is not an option" systems, such as the redundant, independently-implemented systems with voting used in space shuttles. – Barmar May 10 '17 at 15:20
  • @Barmar: For some such systems, that is true. There are many other systems, however, which rely upon software checks to guard against things which can't happen *by any defined means* but may happen anyway. In many cases, there will be ways for people attacking a system to make things happen that shouldn't be possible, and code that guards against such things will often be the difference between a thwarted attack and a successful breach. – supercat May 10 '17 at 15:27
  • @supercat Can you really depend on software checks to detect hardware failures? What if the failure occurs in the checking code? I think the ultimate answer is that there's only so much you can automate, and proper software engineering means understanding the tradeoffs. – Barmar May 10 '17 at 15:32
  • @Barmar: Software checks can detect failure modes that could not be practically detected in hardware. Whether or not they are sufficient in a given situation, a system that incorporates both hardware and software checks can be both cheaper and more robust than one which relies upon hardware alone, but only if software checks are included in the actual machine code. – supercat May 10 '17 at 15:40
  • @supercat If you're worried about stray cosmic rays changing register/memory spontaneously, aren't most bets off? Isn't that why we have ECC? – Barmar May 10 '17 at 15:43
  • @Barmar: The point is to protect against a variety of things that can't happen via any defined means. Even if hardware guards against some particular unexpected events, software can protect against some others. If external hardware will release a clamp that's holding a heavy object if it receives a certain exact sequence of bytes, having software perform a multi-step computation to determine what bytes need to be sent, and perform various tests that everything else is okay while performing such computations, will render it extremely improbable that any combination of random memory... – supercat May 10 '17 at 17:36
  • ...glitches would result in the device sending out the sequence of bytes needed to release the clamp (especially if the required sequence appears nowhere within the actual code). A lot of software isn't so safety-critical as e.g. a NASA rocket controller, but still needs to be able to reasonably assure fail-safe behavior. Redundancy in such software is a *good* thing--not something compilers should try to eliminate. – supercat May 10 '17 at 17:41
20

what are the design decisions behind such a dangerous behavior?

"Don't force users to pay a performance penalty for a feature they may not need."

It's one of the most basic tenets in the design of C and C++, and stems from a different time when you had to go through ridiculous contortions to get barely adequate performance for tasks that are today considered trivial.

Newer languages break with this attitude for many other features, such as array bounds checking. I'm not sure why they didn't do it for overflow checking; it could be simply an oversight.

Michael Borgwardt
  • 18
    It's definitely not an oversight in the design of C#. The designers of C# deliberately created two modes: `checked` and `unchecked`, added syntax for switching between them locally and also command line switches (and project settings in VS) to change it globally. You might disagree with making `unchecked` the default (I do), but all this is clearly very deliberate. – svick May 08 '17 at 17:05
  • But the CPU hardware checks overflow by default and sets the carry flag in the ALU regardless of whether you want to check overflow or not. Performance is not the issue here. – slebetman May 09 '17 at 02:27
  • 8
    @slebetman - just for the record: the cost here is not the cost of checking for the overflow (which is trivial), but the cost of running different code depending on whether the overflow happened (which is very expensive). CPUs do not like conditional branch statements. – Jonathan Cast May 09 '17 at 05:07
  • 5
    @jcast Wouldn't branch prediction on modern processors almost eliminate that conditional branch statement penalty? After all the normal case should be no overflow, so it's very predictable branching behavior. – CodeMonkey May 09 '17 at 08:01
  • 4
    Agree with @CodeMonkey. A compiler would put in a conditional jump in case of overflow, to a page that's normally not loaded/cold. The default prediction for that is "not taken", and it probably will not change. Total overhead is one instruction in the pipeline. But that is one instruction overhead per arithmetic instruction. – MSalters May 09 '17 at 08:47
  • 2
    @MSalters yes, there is an additional instruction overhead. And the impact might be large if you have exclusively CPU bound problems. In most applications with a mix of IO and CPU heavy code I'd assume the impact is minimal. I like the Rust way, of adding the overhead only in Debug builds, but removing it in Release builds. – CodeMonkey May 09 '17 at 08:56
  • @CodeMonkey: On the other hand, those neural networks with a million weights might suffer a bit more. (And they want saturating adds, not wrap-around) – MSalters May 09 '17 at 09:02
  • 1
    Presumably if there actually had been demand for this in the languages of the time, then (for example) amd64 hardware would have found a way for the code to request a hardware exception on overflow, instead of cluttering the instruction pipeline with a conditional branch after every arithmetic op? But hardware and programming language design sit in a sort of feedback loop. – Steve Jessop May 09 '17 at 09:44
  • @SteveJessop The same functionality does exist for floating point numbers, though, doesn't it? So it certainly isn't that nobody considered this - they probably just thought it wasn't useful enough. Most arithmetic overflows are either desired or harmless (think of how many e.g. checksum calculations are there for every "important" calculation on your computer). This obviously isn't true for *your* code, but then neither is the performance requirement - you can afford to have checking by default even with the penalty, and only disabling it when necessary in a tight loop somewhere. – Luaan May 09 '17 at 10:29
  • @Luaan: exactly. By "actually been demand", I should have said *sufficient* demand. FWIW, there are approximately zero checksum calculations for each "important" calculation, since any time I have a loop counter I would, if it was free, like it to throw an exception on overflow. That's all the importance needed, so there's loads of "important" arithmetic going on all the time. Since it's not free I can live without. – Steve Jessop May 09 '17 at 13:51
  • 1
    @CodeMonkey Don't forget that branch prediction is a relatively recent feature of CPUs. Most of the languages we use today were designed decades ago, when they couldn't depend on this, so the cost of overflow checks for every arithmetic operation would have been significant -- possibly 10-25% overhead in many applications. – Barmar May 09 '17 at 15:22
  • There's at least one family of languages that automatically does multi-precision integers and rationals (as well as array bounds checking), so overflow is not a problem. But Lisp never caught on with the mainstream programming community, possibly because there was an assumption that all this safety cost too much in performance. – Barmar May 09 '17 at 15:25
19

Legacy

I would say that the issue is likely rooted in legacy. In C:

  • signed overflow is undefined behavior (compilers support flags to make it wrap),
  • unsigned overflow is defined behavior (it wraps).

This was done to get the best possible performance, following the principle that the programmer knows what they're doing.

Leads to Status Quo

The fact that C (and by extension C++) does not require the detection of overflow in turn means that overflow checking is sluggish.

Hardware mostly caters to C/C++ (seriously, x86 has a strcmp instruction (aka PCMPISTRI as of SSE 4.2)!), and since C doesn't care, common CPUs do not offer efficient ways of detecting overflows. In x86, you have to check a per-core flag after each potentially overflowing operation; when what you'd really want is a "tainted" flag on the result (much like NaN propagates). And vector operations may be even more problematic. Some new players may appear on the market with efficient overflow handling; but for now x86 and ARM do not care.

Compiler optimizers are not good at optimizing overflow checks, or even at optimizing in the presence of overflows. Some academics, such as John Regehr, complain about this status quo, but the fact that simply making overflows "failures" prevents optimizations even before the assembly hits the CPU can be crippling. Especially when it prevents auto-vectorization...

With cascading effects

So, in the absence of efficient optimization strategies and efficient CPU support, overflow-checking is costly. Much more costly than wrapping.

Add in some annoying behavior, such as the fact that `x + y - 1` may overflow when `x - 1 + y` doesn't (which may legitimately annoy users), and overflow-checking is generally discarded in favor of wrapping (which handles this example and many others gracefully).
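A concrete instance of that annoyance, expressed in C# checked arithmetic (a sketch):

int x = int.MaxValue, y = 1;
int a = checked(x - 1 + y);   // fine: evaluates to int.MaxValue
int b = checked(x + y - 1);   // throws OverflowException at the x + y step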

Still, not all hope is lost

There has been an effort in the clang and gcc compilers to implement "sanitizers": ways to instrument binaries to detect cases of Undefined Behavior. When using -fsanitize=undefined, signed overflow is detected and the program aborts; very useful during testing.

The Rust programming language has overflow-checking enabled by default in Debug mode (it uses wrapping arithmetic in Release mode for performance reasons).

So, there is growing concern about overflow-checking and the dangers of bogus results going undetected, and hopefully this will in turn spark interest in the research community, compiler community and hardware community.

Matthieu M.
  • Interestingly, the `strcmp` function is rarely implemented using the `cmpsb` instruction on x86. It typically uses 32-bit variables to compare 4 characters at once in a loop, until a mismatch is found. – Dmitry Grigoryev May 08 '17 at 17:01
  • More to the point, x86 architecture always had an effective way of detecting overflows: the conveniently named `o` flag which is updated after every arithmetic instruction, and `jo` jump instruction to go with it. – Dmitry Grigoryev May 08 '17 at 17:04
  • 6
    @DmitryGrigoryev that's the opposite of an effective way to check for overflows, for example on Haswell it reduces the throughput from 4 normal additions per cycle to only 1 checked addition, and that's before considering the impact of branch mispredictions of the `jo`'s, and the more global effects of the pollution they add to the branch predictor state and the increased code size. If that flag was sticky it would offer some real potential.. and then you still can't do it properly in vectorized code. –  May 08 '17 at 17:10
  • @DmitryGrigoryev: I was thinking more of [PCMPESTRI](http://www.felixcloutier.com/x86/PCMPESTRI.html). – Matthieu M. May 08 '17 at 17:35
  • 3
    Since you're linking to [a blog post](http://blog.regehr.org/archives/1401) written by John Regehr, I thought it would be appropriate to also link to [another of his article](https://blog.regehr.org/archives/1384), written a few months before the one you linked. These articles talk about different philosophies: In the earlier article, integers are fixed size; integer arithmetic are checked (i.e. the code can't continue its execution); there's either an exception or a trap. The newer article talks about ditching fixed-size integers altogether, which eliminates overflows. – rwong May 08 '17 at 20:05
  • I think x86 had a trap-if-overflow-bit-is-set instruction that was removed with x64. – user253751 May 08 '17 at 21:08
  • @rwong: Oh, I should have re-read the article, I actually linked the wrong one. For posterity, I initially linked http://blog.regehr.org/archives/1401. – Matthieu M. May 09 '17 at 06:33
  • 2
    @rwong Infinite-sized integers have their problems as well. If your overflow is the result of a bug (which it often is), it may turn a quick crash into a prolonged agony that consumes all server resources until everything fails horribly. I'm mostly a fan of the "fail early" approach - less chance of poisoning the whole environment. I'd prefer the Pascal-ish `1..100` types instead - be explicit about the expected ranges, rather than being "forced" into 2^31 etc. Some languages offer this, of course, and they tend to do overflow checking by default (sometimes at compile-time, even). – Luaan May 09 '17 at 10:34
  • 1
    @Luaan: What is interesting is that often times intermediate computations may temporarily overflow, but the result does not. For example, on your 1..100 range, `x * 2 - 2` may overflow when `x` is 51 even though the result fits, forcing you to rearrange your computation (sometimes in unnatural way). In my experience, I've found that I generally prefer to run the computation in a larger type, and then check whether the result fits or not. – Matthieu M. May 09 '17 at 11:32
  • 1
    @MatthieuM. Yeah, that's where you get into the "sufficiently smart compiler" territory. Ideally, a value of 103 should be valid for a 1..100 type as long as it is never used in a context where a true 1..100 is expected (e.g. `x = x * 2 - 2` should work for all `x` where the assignment results in a valid 1..100 number). That is, operations on the numeric type may have a higher precision than the type itself as long as the assignment fits. This would be quite useful in cases like `(a + b) / 2` where ignoring (unsigned) overflows *may* be the correct option. – Luaan May 09 '17 at 11:48
  • If code were required to cast the result of a variable-sized left-shift, a compiler would be able to determine the maximum possible value for any intermediate computations on integer-type or subrange-type operands, and could either generate code that would trap when and only when an attempt was made to store a value into a container that couldn't hold it, or else demand that a programmer explicitly cast an intermediate computation to a subrange type (e.g. if code computes `x= (a*b*c*d*e*f) / (g*h*i*j*k)` and all variables have type `0..1000000000000000000`, the largest intermediate result... – supercat May 10 '17 at 14:52
  • ...an implementation would have to process would be an integer exactly equal to 1.0E108; while it might be possible for an implementation to automatically use a fixed-sized integer type large enough for such values, it might be more practical to require that a programmer explicitly limit intermediate results to something smaller. – supercat May 10 '17 at 14:55
  • The x86 (strcmp and all) predates the invention of the C programming language. You’re seeing “C as a portable assembly language”, not the other way around. – JDługosz May 11 '17 at 09:26
  • @JDługosz: [PCMPISTRI](http://www.felixcloutier.com/x86/PCMPISTRI.html) (and co) are part of SSE4.2. They allow comparing 8-bits or 16-bits NUL-terminated character strings, which is pretty specific to C (most languages have explicit lengths, so would use the PCMPESTRI instruction instead). – Matthieu M. May 11 '17 at 09:39
  • Ok, I thought you meant `rep cmpsb`, the 8086 instruction. – JDługosz May 11 '17 at 11:10
  • 1
    @JDługosz: Yes, you're the second person to assume so, which clearly indicates that my answer was insufficient. I've edited the answer to explicitly state which instruction I had in mind to avoid further confusion :) – Matthieu M. May 11 '17 at 11:12
  • Hmm, the SSE instruction does not perform strcmp. [Here](https://en.wikibooks.org/wiki/X86_Assembly/SSE) is asm code that implements strcmp by invoking pcmpistri in a loop and doing more stuff. The strings must be aligned on 16-byte boundaries. – JDługosz May 11 '17 at 14:19
  • @JDługosz: Yes of course; it only works on fixed-size vectors so it couldn't. However it's THE core instruction used to implement strcmp, the others are trivial (loop setup and exact formatting of result). – Matthieu M. May 11 '17 at 16:55
  • It finds the first char that differs. It doesn't tell you which is larger — a normal instruction does a subtract of the character codes, and *that* is an essential instruction for the definition of strcmp. That is, finding the position where they differ is not the same semantic as determining which argument sorts greater! – JDługosz May 11 '17 at 17:16
10

Languages which attempt to detect overflows have historically defined the associated semantics in ways that severely restricted what would otherwise have been useful optimizations. Among other things, while it will often be useful to perform computations in a different sequence from what is specified in code, most languages that trap overflows guarantee that given code like:

for (int i=0; i<100; i++)
{
  Operation1();
  x+=i;
  Operation2();
}

if the starting value of x would cause an overflow to occur on the 47th pass through the loop, Operation1 will execute 47 times and Operation2 will execute 46. In the absence of such a guarantee, if nothing else within the loop uses x, and nothing will use the value of x following a thrown exception by Operation1 or Operation2, code could be replaced with:

x+=4950;
for (int i=0; i<100; i++)
{
  Operation1();
  Operation2();
}

Unfortunately, performing such optimizations while guaranteeing correct semantics in cases where an overflow would have occurred within the loop is difficult--essentially requiring something like:

if (x < INT_MAX-4950)
{
  x+=4950;
  for (int i=0; i<100; i++)
  {
    Operation1();
    Operation2();
  }
}
else
{
  for (int i=0; i<100; i++)
  {
    Operation1();
    x+=i;
    Operation2();
  }
}

If one considers that a lot of real-world code uses loops that are more involved, it will be obvious that optimizing code while preserving overflow semantics is difficult. Further, because of caching issues, it's entirely possible that the increase in code size would make the overall program run more slowly even though there are fewer operations on the commonly-executed path.

What would be needed to make overflow detection inexpensive would be a defined set of looser overflow-detection semantics which would make it easy for code to report whether a computation was performed without any overflows that might have affected the results(*), but without burdening the compiler with details beyond that. If a language spec were focused on reducing the cost of overflow detection to the bare minimum necessary to achieve the above, it could be made much less costly than it is in existing languages. I'm unaware of any efforts to facilitate efficient overflow detection, however.

(*) If a language promises that all overflows will be reported, then an expression like x*y/y cannot be simplified to x unless x*y can be guaranteed not to overflow. Likewise, even if the result of a computation would be ignored, a language that promises to report all overflows will need to perform it anyway so it can perform the overflow check. Since overflow in such cases cannot result in arithmetically-incorrect behavior, a program would not need to perform such checks to guarantee that no overflows have caused potentially-inaccurate results.

Incidentally, overflows in C are especially bad. Although almost every hardware platform that supports C99 uses two's-complement silent-wraparound semantics, it is fashionable for modern compilers to generate code which may cause arbitrary side-effects in case of overflow. For example, given something like:

#include <stdint.h>

/* x and y promote to (signed) int, so x*y can overflow: undefined behavior */
uint32_t test(uint16_t x, uint16_t y) { return x*y & 65535u; }

uint32_t test2(uint16_t q, int *p)
{
  uint32_t total=0;
  q|=32768;                /* q is now at least 32768 */
  for (int i = 32768; i<=q; i++)
  {
    total+=test(i,65535);  /* the multiplication in test overflows for any i > 32768 */
    *p+=1;
  }
  return total;
}

GCC will generate code for test2 which unconditionally increments (*p) once and returns 32768 regardless of the value passed into q. By its reasoning, the computation of (32769*65535) & 65535u would cause an overflow and there is thus no need for the compiler to consider any cases where (q | 32768) would yield a value larger than 32768. Even though there is no reason that the computation of (32769*65535) & 65535u should care about the upper bits of the result, gcc will use signed overflow as justification for ignoring the loop.

supercat
  • 2
    "it is fashionable for modern compilers..." -- similarly, it was briefly fashionable for the developers of certain well-known kernels to choose not to read the documentation regarding the optimisation flags they used, and then act angry all over the internet because they were forced to add even more compiler flags to get the behaviour they wanted ;-). In this case, `-fwrapv` results in defined behaviour, albeit not the behaviour the questioner wants. Granted, gcc optimisation does turn any kind of C development into a thorough exam on the standard and the compiler behaviour. – Steve Jessop May 09 '17 at 09:59
  • 1
    @SteveJessop: C would be a much healthier language if compiler writers recognized a low-level dialect where "undefined behavior" meant "do whatever would make sense on the underlying platform", and then added ways for programmers to waive unnecessary guarantees implied thereby, rather than presuming that the phrase "non-portable or erroneous" in the Standard simply means "erroneous". In many cases, the optimal code that can be obtained in a language with weak behavioral guarantees will be much better than can be obtained with stronger guarantees or no guarantees. For example... – supercat May 09 '17 at 15:18
  • 1
    ...if a programmer needs to evaluate `x+y > z` in a fashion that will never do anything other than yield 0 or yield 1, but either result would be equally acceptable in case of overflow, a compiler which offers that guarantee could often generate better code for the expression `x+y > z` than any compiler would be able to generate for a defensively-written version of the expression. Realistically speaking, what fraction of *useful* overflow-related optimizations would be precluded by a guarantee that integer calculations other than division/remainder will execute with no side-effects? – supercat May 09 '17 at 15:24
  • I confess that I'm not fully into the details, but the fact that your grudge is with "compiler writers" in general, and not specifically "someone on gcc who won't accept my `-fwhatever-makes-sense` patch", strongly suggests to me that there's more to it than whimsy on their part. The usual arguments I've heard are that code inlining (and even macro expansion) benefit from deducing as much as possible about the specific use of a code construct, since either thing commonly results in inserted code that deals with cases it doesn't need to, that the surrounding code "proves" impossible. – Steve Jessop May 09 '17 at 19:16
  • So for a simplified example, if I write `foo(i + INT_MAX + 1)`, compiler-writers are keen to apply optimisations to the inlined `foo()` code which rely for correctness on its argument being non-negative (fiendish divmod tricks, perhaps). Under your additional restrictions, they could only apply optimisations whose behaviour for negative inputs makes sense for the platform. Of course, personally I'd be happy for that to be a `-f` option that switches on `-fwrapv` etc, and likely must disable some optimisations there's no flag for. But it's not like I can be bothered to do all that work myself. – Steve Jessop May 09 '17 at 19:31
  • @SteveJessop: My beef is not just with gcc; clang is in some regards just as bad. With regard to the example, you're correct about what I would regard as a medium-strong level of integer behavioral guarantees [temporary expressions may behave as though they exceed the range of "int", but other objects of type "int" would always have their values wrapped]. What I'd suggest would be far more useful than the kind of UB-based inference you describe would be a set of "__checked_assume" and "__unchecked_assume" directives. The latter would allow a compiler to... – supercat May 09 '17 at 21:59
  • ...treat as UB any circumstance where evaluation of the argument would not have behavior defined as yielding a non-zero value; the former would invite a compiler to either treat it as an assert or a no-op, deciding in Unspecified fashion between those choices, but would require a compiler to do one or the other. If `x` would be in the range 10..40 for all cases of interest and code that only had to handle such values could be smaller than code that had to uphold the Standard for all values, a `__checked_assume(x>=10 && x <= 40);` would let a compiler generate the simplified code... – supercat May 09 '17 at 22:06
  • ...in exchange for checking that `x` was within the range. It wouldn't have to use UB to justify downstream assumptions that `x` was in range--it would know that `x` was within range because it would have actually tested it. Add in an `__fail_point` directive which would invite a compiler to trap if it could tell that either a downstream `__checked_assume` would inevitably fail or the condition associated with an upstream `__checked_assume` had failed, and compilers could have even more optimization opportunities without any need for UB. – supercat May 09 '17 at 22:09
  • But this all assumes that the programmer wants to hand-write a lot of assertions that may or may not actually help the optimizer. The current situation is that compiler-writers are keen to optimize code even without manual assistance, using whatever information happens to be present in the code. – Steve Jessop May 09 '17 at 22:38
  • @SteveJessop: Indicating expected conditions is useful for documentation whether or not the compiler happens to receive benefit from it. And if what a programmer needs is "yield `x*y>z` if in cases where there's no overflow, or yield an arbitrary 0 or 1 in case of overflow, but don't launch nuclear missiles regardless", it doesn't make sense that the last condition should be the most expensive one to meet. – supercat May 09 '17 at 22:52
  • Submit that patch to gcc then ;-p Or just use `-fwrapv`, which meets your requirements with no *explicit* overhead on most normal architectures, but accept that you're disabling somewhat more optimisation than you strictly need for your stated requirements. Since your requirements are different from those anticipated by compiler-writers, that's not a big surprise. – Steve Jessop May 09 '17 at 23:39
  • @SteveJessop: Judging from mailing-list archives, the authors of gcc are more interested in "optimization" and in proclaiming that code which was designed for use on sane compilers is "broken", than in processing a dialect of C that's really suitable for most of the purposes for which C is used, or even honoring the C89 standard when used in C89 mode. – supercat May 09 '17 at 23:47
  • don't know about C89 mode, but if you have in mind an idea of what a "sane compiler" does, which matches neither the standard nor the compiler you're using, then you're going to lose. If you don't want the compiler-writers to optimize your code for you, at least in this respect, use `-fwrapv`. That's what it's for. Is it crippling the performance of your actual code? – Steve Jessop May 09 '17 at 23:55
  • @SteveJessop: The commercial compilers I've used seem pretty sane. As for C89, it had a useful guarantee about Common Initial Sequences which made it possible to have a function accept a pointer to any structure that had a known CIS and act upon members of that CIS. Nothing in C89 would justify a requirement that CIS members could only be accessed using an lvalue of union type (which would make the CIS guarantee rather useless), but the authors of gcc pretend that it does so. – supercat May 10 '17 at 00:00
  • @SteveJessop: BTW, do you think the authors of the Standard were trying to mandate everything necessary to make an implementation usable for any particular purpose, or do you think they expected that usability for many purposes would require things beyond what the Standard required for conformance (read the C89 rationale, 2.2.4.1, starting with "While a deficient implementation...")? Do you think they had any expectations with how commonplace implementations would process e.g. `unsigned x = (unsigned char)2 * INT_MAX;` (see rationale, 3.2.1 pa. 4 "...in even more cases")? – supercat May 10 '17 at 15:25
  • I certainly don't think the authors of the standard intended to forbid implementations from providing optimization options that create counter-intuitive UB. Whether or not implementations should provide a "sane", unoptimized or partly-optimized, mode I think is a bit more open, since I think the authors probably did *anticipate* that implementations would provide that, but (wisely IMO) chose not to mandate it. But, still, what percentage performance loss are you seeing with `-fwrapv` that motivates your choosing not to use it and demanding help from the gcc devs with that choice? – Steve Jessop May 22 '17 at 12:30
  • That last question partly asked in mischief, but actually I am interested to know what practical benefits there are of an arithmetic mode that falls somewhere between `-fwrapv` and "all speed to the engines". – Steve Jessop May 22 '17 at 12:35
  • @SteveJessop: Primarily, it's a feeling that unless a compiler is targeted exclusively toward application fields that process only trusted data, default settings should be compatible with the principles of Annex L Analyzability. Unfortunately, the way Annex L is written makes it almost useless in practice, but the principle is sound. I would suggest that in most application fields, programs have two fundamental requirements: 1. When given valid data, give valid output; 2. When given invalid data, do not launch nuclear missiles. A quality compiler suitable for use in such fields... – supercat May 22 '17 at 14:32
  • ...should allow #2 to be met easily. If moderately-aggressively optimized code which satisfied #1 would "naturally" meet #2, but more aggressive optimizations would make it necessary for a programmer to either disable optimizations entirely or write extra code to meet #2, I would suggest that except in application fields where #2 isn't a requirement the more aggressive optimizations would make the compiler more dangerous and less useful. IMHO, Annex L is over-complicated and the fundamental principle could have been expressed much more simply: treat Non-Critical UB as IDB, and replace... – supercat May 22 '17 at 14:42
  • ...everything that talks about the proposed signalling implementation with a list of macros that would indicate what an implementation can or cannot guarantee in each case, making clear that allowable IDB would include choosing in Unspecified fashion from a list of possible behaviors (e.g. an implementation might guarantee that given e.g. `long long LL1=i1*i2;`, it might arbitrarily behave as `LL1=(long long)i1*i2`, `LL1=(int)(1u*i1*i2);`, or if `i1` and `i2` are positive, `LL1=1u*i1*i2;`). Code that wouldn't care which result was used wouldn't have to force a compiler's choice. – supercat May 22 '17 at 14:49
9

Not all programming languages ignore integer overflows. Some languages provide safe integer operations for all numbers out of the box (most Lisp dialects, Ruby, Smalltalk, ...), and others via libraries - for instance, there are various BigInt classes for C++.

Whether a language makes integers safe from overflow by default depends on its purpose: systems languages like C and C++ need to provide zero-cost abstractions, and "big integer" is not one. Productivity languages, such as Ruby, can and do provide big integers out of the box. Languages such as Java and C# that are somewhere in between should IMHO provide safe integers out of the box, but they don't.
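
To make the "via libraries" point concrete, here's a sketch of the question's sum in C using the GNU GMP library (assuming GMP is installed; link with -lgmp):

#include <stdio.h>
#include <gmp.h>   /* GNU multiple-precision arithmetic */

int main(void)
{
    mpz_t sum;
    mpz_init_set_ui(sum, 0);            /* arbitrary-precision integer */
    for (unsigned long i = 1; i <= 2000000UL; i++)
        mpz_add_ui(sum, sum, i);        /* grows as needed, never overflows */
    gmp_printf("%Zd\n", sum);           /* prints 2000001000000 */
    mpz_clear(sum);
    return 0;
}

Languages like Ruby effectively do this for every integer operation; that cost is exactly what the zero-cost-abstraction languages decline to pay by default.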

Nemanja Trifunovic
  • 6,815
  • 1
  • 26
  • 34
  • Note that there is a difference between detecting overflow (and then raising a signal, panic, exception, ...) and switching to big nums. The former should be doable much more cheaply than the latter. – Matthieu M. May 09 '17 at 06:34
  • @MatthieuM. Absolutely - and I realize I am not clear about that in my answer. – Nemanja Trifunovic May 09 '17 at 12:19
7

As you have shown, C# would be 3 times slower if it had overflow checks enabled by default (assuming your example is representative of typical applications in that language). I agree that performance is not always the most important feature, but languages / compilers are typically compared on their performance in typical tasks. This is in part because the quality of language features is somewhat subjective, while a performance test is objective.

If you were to introduce a new language which is similar to C# in most respects but 3 times slower, gaining market share wouldn't be easy, even if in the end most of your end users would benefit from overflow checks more than from higher performance.

Dmitry Grigoryev
  • 494
  • 2
  • 13
  • 10
    This was particularly the case for C#, which was in its early days compared to Java and C++ not on developer productivity metrics, which are hard to measure, or on cash-saved-from-not-dealing-with-security-breaches metrics, which are hard to measure, but on trivial performance benchmarks. – Eric Lippert May 08 '17 at 19:41
  • 1
    And likely, CPU performance is benchmarked with simple number crunching. Thus optimizations for overflow detection may yield "bad" results on those tests. Catch-22. – Bernhard Hiller May 11 '17 at 07:39
5

Beyond the many answers that justify lack of overflow checking based on performance, there are two different kinds of arithmetic to consider:

  1. indexing calculations (array indexing and/or pointer arithmetic)

  2. other arithmetic

If the language uses an integer size that is the same as the pointer size, then a well-constructed program will not overflow doing indexing calculations, because it would necessarily run out of memory before the indexing calculations could cause an overflow.

Thus, checking memory allocations is sufficient when working with pointer arithmetic and indexing expressions involving allocated data structures. For example, if you have a 32-bit address space, use 32-bit integers, and allow a maximum of 2GB of heap to be allocated (about half the address space), indexing/pointer calculations (basically) will not overflow.

Further, you might be surprised how much addition/subtraction/multiplication involves array indexing or pointer calculation, thus falling into the first category. Object pointer dereferences, field accesses, and array manipulations are all indexing operations, and many programs do no more arithmetic computation than these! Essentially, this is the primary reason that programs work as well as they do without integer overflow checking.

All non-indexing and non-pointer computations should be classified as either those that want/expect overflow (e.g. hashing computations) or those that don't (e.g. your summation example).

In the latter case, programmers will often use alternative data types, such as double or some BigInt. Many calculations require a decimal data type rather than double, e.g. financial calculations. If they instead stick with integer types, then they need to take care to check for integer overflow -- or else, yes, the program can reach an undetected error condition as you're pointing out.
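
For the integer route, a minimal sketch of such a manual check in C (the helper name checked_add is made up); the pre-tests avoid ever evaluating the overflowing addition, which for signed types would be undefined behaviour:

#include <limits.h>

/* Hypothetical helper: stores a+b and returns 1 if the addition is
   safe; returns 0 if it would overflow. */
int checked_add(int a, int b, int *result)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return 0;           /* would overflow */
    *result = a + b;
    return 1;
}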

As programmers, we need to be sensitive to our choices of numeric data types and their consequences in terms of the possibilities for overflow, not to mention precision. In general (and especially when working with the C family of languages and wanting to use the fast integer types), we need to be aware of the differences between indexing calculations and the others.

Erik Eidt
  • 33,282
  • 5
  • 57
  • 91
4

In Swift, any integer overflow is detected by default and instantly stops the program. In cases where you need wraparound behaviour, there are separate operators &+, &- and &* that provide it. And there are functions that perform an operation and report whether an overflow occurred.

It's fun to watch beginners try to evaluate the Collatz sequence and have their code crash :-)

Now the designers of Swift are also the designers of LLVM and Clang, so they know a thing or two about optimisation and are quite capable of avoiding unnecessary overflow checks. With all optimisations enabled, overflow checking doesn't add much to code size or execution time. And since most overflows lead to absolutely incorrect results, that code size and execution time is well spent.

PS. In C, C++, and Objective-C, signed integer arithmetic overflow is undefined behaviour. That means whatever the compiler does in the case of signed integer overflow is correct, by definition. Typical ways to cope with signed integer overflow are: ignoring it, taking whatever result the CPU gives you; building assumptions into the compiler that such overflow will never happen (and concluding, for example, that n+1 > n is always true, since overflow is assumed never to happen); and - a possibility that is rarely used - checking and crashing if overflow happens, like Swift does.
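
A minimal sketch of that middle option in C (the function name is invented, and the exact code generated depends on compiler and flags):

/* Under gcc -O2, this comparison is typically folded to "return 1":
   signed overflow is undefined, so the compiler may assume n + 1
   can never wrap around to a value below n. */
int always_true(int n)
{
    return n + 1 > n;
}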

gnasher729
  • 42,090
  • 4
  • 59
  • 119
  • 2
    I've sometimes wondered if the people who are pushing UB-driven insanity in C were secretly trying to undermine it in favor of some other language. That would make sense. – supercat May 09 '17 at 23:37
  • Treating `x+1>x` as unconditionally true would not require a compiler to make any "assumptions" about x if a compiler is allowed to evaluate integer expressions using arbitrary larger types as convenient (or behave as though it's doing so). A nastier example of overflow-based "assumptions" would be deciding that given `uint32_t mul(uint16_t x, uint16_t y) { return x*y & 65535u; }` a compiler can use `sum += mul(65535, x)` to decide that `x` cannot be greater than 32768 [behavior that would likely shock the people who wrote the C89 Rationale, which suggests that one of the deciding factors... – supercat May 11 '17 at 19:47
  • ...in making `unsigned short` promote to `signed int` was the fact that two's-complement silent-wraparound implementations (i.e. the majority of C implementations then in use) would treat code like the above the same way whether `unsigned short` promoted to `int` or `unsigned`. The Standard didn't *require* implementations on silent-wraparound two's-complement hardware to treat code like the above sanely, but the authors of the Standard seem to have expected that they'd do so anyhow. – supercat May 11 '17 at 19:50
3

The language Rust provides an interesting compromise between checking for overflow and not: the checks are enabled in debug builds and removed in optimized release builds. This lets you find the bugs during testing while still getting full performance in the final version.

Because the overflow wraparound is sometimes wanted behaviour, there are also versions of the operators, such as `wrapping_add`, that never check for overflow.

You can read more about the reasoning behind the choice in the RFC for the change. There is also plenty of interesting information in this blog post, including a list of bugs that this feature has helped catch.

Hjulle
  • 139
  • 4
  • 2
    Rust also provides methods like `checked_mul`, that checks whether overflow has occurred and returns `None` if so, `Some` otherwise. This can be used in production as well as debug mode: https://doc.rust-lang.org/std/primitive.i32.html#examples-15 – Akavall May 09 '17 at 00:39
2

Actually, the real cause for this is purely technical/historical: CPUs ignore sign for the most part. There is generally only a single instruction to add two integers in registers, and the CPU does not care a bit whether you interpret these two integers as signed or unsigned. The same goes for subtraction, and even for multiplication. The only arithmetic operation that needs to be sign-aware is division.

The reason this works is the two's-complement representation of signed integers used by virtually all CPUs. For instance, in 4-bit two's complement, the addition of 5 and -3 looks like this:

  0101   (5)
  1101   (-3)
(11010)  (carry)
  ----
  0010   (2)

Observe how the wrap-around behavior of throwing away the carry-out bit yields the correct signed result. Likewise, CPUs usually implement the subtraction x - y as x + ~y + 1:

  0101   (5)
  1100   (~3, binary negation!)
(11011)  (carry, we carry in a 1 bit!)
  ----
  0010   (2)

This implements subtraction as an addition in hardware, tweaking only the inputs to the arithmetical-logical-unit (ALU) in trivial ways. What could be simpler?
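
The identity is easy to verify in C, say with 8-bit values (a sketch; the casts merely undo C's integer promotions so everything stays in 8 bits):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t x = 5, y = 3;
    uint8_t direct = (uint8_t)(x - y);
    uint8_t viaAdd = (uint8_t)(x + (uint8_t)~y + 1);  /* x + ~y + 1 */
    printf("%u %u\n", direct, viaAdd);   /* prints: 2 2 */
    return 0;
}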

Since multiplication is nothing more than a sequence of additions, it behaves in a similarly nice way. The result of using two's-complement representation and ignoring the carry-out of arithmetic operations is simplified circuitry and a simplified instruction set.

Obviously, since C was designed to work close to the metal, it adopted this exact behavior as the standardized behavior of unsigned arithmetic, while allowing only signed arithmetic to yield undefined behavior. And that choice carried over to other languages like Java and, obviously, C#.

  • I came here to give this answer as well. – Mr Lister May 11 '17 at 20:39
  • Unfortunately, some people seem to regard as grossly unreasonable the notion that people writing low-level C code on a platform should have the audacity to expect that a C compiler suitable for such purpose would behave in constrained fashion in case of overflow. Personally, I think it reasonable for a compiler to behave as though computations are performed using arbitrarily-extended precision at the compiler's convenience (so on a 32-bit system, if `x==INT_MAX`, then `x+1` might arbitrarily behave as either +2147483648 or -2147483648), but... – supercat May 11 '17 at 21:56
  • some people seem to think that if `x` and `y` are `uint16_t` and code on a 32-bit system computes `x*y & 65535u` when `y` is 65535, a compiler should assume that code will never be reached when `x` is greater than 32768. – supercat May 11 '17 at 21:59
2

Some answers have discussed the cost of checking, and you've edited your question to dispute that this is a reasonable justification. I'll try to address those points.

In C and C++ (as examples), one of the language design principles is not to provide functionality that wasn't asked for. This is commonly summed up by the phrase "don't pay for what you don't use". If the programmer wants overflow checking then s/he can ask for it (and pay the penalty). This makes the language more dangerous to use, but you choose to work with the language knowing that, so you accept the risk. If you don't want that risk, or if you are writing code where safety is of paramount importance, then you can select a more appropriate language where the performance/risk tradeoff is different.
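
As a sketch of "asking for it" with a GCC or Clang toolchain, the undefined-behaviour sanitizer can turn silent wraparound into a run-time diagnostic; the tiny program below is made up for illustration:

/* Build with:  cc -fsanitize=signed-integer-overflow demo.c
   and the runtime reports the overflow instead of wrapping silently. */
#include <limits.h>

int main(void)
{
    volatile int x = INT_MAX;  /* volatile prevents constant folding */
    volatile int y = x + 1;    /* signed overflow: flagged by the sanitizer */
    return y > 0;
}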

But with the 10,000,000,000 repetitions, the time taken by a check is still less than 1 nanosecond.

There are a few things wrong with this reasoning:

  1. This is environment-specific. It generally makes very little sense to quote specific figures like this, because code is written for all sorts of environments that vary by orders of magnitude in terms of their performance. Your 1 nanosecond on (I assume) a desktop machine might seem amazingly fast to someone coding for an embedded environment, and unbearably slow to someone coding for a supercomputer cluster.

  2. 1 nanosecond might seem like nothing for a segment of code that runs infrequently. On the other hand, if it is in an inner loop of some calculation that is the main function of the code, then every single fraction of time you can shave off can make a big difference. If you're running a simulation on a cluster then those saved fractions of a nanosecond in your inner loop can translate directly to money spent on hardware and electricity.

  3. For some algorithms and contexts, 10,000,000,000 iterations could be insignificant. Again, it doesn't generally make sense to talk about specific scenarios that only apply in certain contexts.

There may be situation where that is important, but for most applications, that won't matter.

Perhaps you are right. But again, this is a matter of what the goals of a particular language are. Many languages are in fact designed to accommodate the needs of "most", or to favour safety over other concerns. Others, like C and C++, prioritise efficiency. In that context, making everyone pay a performance penalty simply because most people won't be bothered by it goes against what the language is trying to achieve.

Jon Bentley
  • 335
  • 3
  • 10
-1

There are good answers, but I think there's a missed point here: the effects of an integer overflow aren't necessarily a bad thing, and after the fact it's difficult to know whether i went from MAX_INT to MIN_INT because of an overflow problem or because it was intentionally multiplied by -1.

For example, if I want to add together all the representable integers greater than 0, I'd just use a for(i=0;i>=0;++i){...} addition loop - when it overflows, the addition stops, which is the goal behavior (throwing an error would mean I have to circumvent an arbitrary protection because it's interfering with standard arithmetic). It's bad practice to restrict primitive arithmetic, because:

  • They're used in everything - a slowdown in primitive maths is a slowdown in every functioning program
  • If a programmer needs checks, they can always add them
  • If you have them and the programmer doesn't need them (but does need faster runtimes), they can't easily remove them for optimization
  • If you have them and the programmer needs them to not be there (like in the example above), the programmer both takes the run-time hit (which may or may not be relevant) and still needs to invest time removing or working around the 'protection'.
Delioth
  • 433
  • 4
  • 8
  • 3
    It's not really possible for a programmer to add efficient overflow checking if a language doesn't provide for it. If a function computes a value that is ignored, a compiler can optimize out the computation. If a function computes a value which is overflow-checked but *otherwise* ignored, a compiler must perform the computation and trap if it overflows, even if an overflow would otherwise not affect the program's output and could be safely ignored. – supercat May 09 '17 at 00:23
  • 1
    You can't go from `INT_MAX` to `INT_MIN` by multiplying by -1. – David Conrad May 09 '17 at 08:34
  • The solution is obviously to provide a way for the programmer to turn the checks off in a given block of code or compilation unit. – David Conrad May 09 '17 at 08:34
  • `for(i=0;i>=0;++i){...}` is the style of code I try to discourage in my team: it relies on special effects / side effects and does not express clearly what it is meant to do. But still I appreciate your answer as it shows a different programming paradigm. – Bernhard Hiller May 10 '17 at 08:03
  • @BernhardHiller Oh, definitely this wouldn't be a good way of doing the addition loop, it's just an implementation that's guaranteed to find the highest int, no matter what kind it is (it will loop until `i` overflows, rather than a preset bound). – Delioth May 10 '17 at 14:24
  • 1
    @Delioth: If `i` is a 64-bit type, even on an implementation with consistent silent-wraparound two's-complement behavior, running a billion iterations per second, such a loop could only be guaranteed to find the largest `int` value if it's allowed to run for hundreds of years. On systems which don't promise consistent silent-wraparound behavior, such behaviors would not be guaranteed no matter how long code is given. – supercat May 10 '17 at 15:02