65

I recently started working at a place with much older developers (around 50+ years old). They have worked on critical applications dealing with aviation, where the system could not go down. As a result, one of the older programmers tends to code this way.

He tends to put a boolean in the objects to indicate if an exception should be thrown or not.

Example

public class AreaCalculator
{
    private readonly bool shouldThrowExceptions;

    public AreaCalculator(bool shouldThrowExceptions) { ... }

    public int CalculateArea(int x, int y)
    {
        if (x < 0 || y < 0)
        {
            if (shouldThrowExceptions)
                throw new ArgumentException("x and y must be non-negative");
            else
                return 0;
        }
        return x * y;
    }
}

(In our project the method can fail because we are trying to use a network device that may not be present at the time. The area example is just an illustration of the exception flag.)

To me this seems like a code smell. Writing unit tests becomes slightly more complex since you have to test for the exception flag each time. Also, if something goes wrong, wouldn't you want to know right away? Shouldn't it be the caller's responsibility to determine how to continue?

His logic/reasoning is that our program needs to do one thing: show data to the user. Any other exception that doesn't stop us from doing so should be ignored. I agree the program should keep going, but the exceptions shouldn't be ignored; they should bubble up and be handled at the appropriate level, without flags controlling that.

Is this a good way of handling exceptions?

Edit: Just to give more context on the design decision: I suspect it is because if this component fails, the program can still operate and do its main task. Thus we wouldn't want to throw an exception (and not handle it?) and have it take down the program when, from the user's perspective, it is working fine.

Edit 2: To give even more context: in our case the method is called to reset a network card. The issue arises when the network card is disconnected and reconnected: it is assigned a different IP address, so Reset will throw an exception because we would be trying to reset the hardware with the old IP.

Nicolas
  • 707
  • 1
  • 5
  • 12
  • 22
C# has a convention for this: the Try-Parse pattern. More info: https://docs.microsoft.com/en-us/dotnet/standard/design-guidelines/exceptions-and-performance#try-parse-pattern The flag does not match this pattern. – Peter Feb 05 '19 at 22:44
  • 18
This is basically a control parameter, and it changes how the method's internals execute. This is bad, regardless of the scenario. https://www.martinfowler.com/bliki/FlagArgument.html, https://softwareengineering.stackexchange.com/questions/147977/is-it-wrong-to-use-a-boolean-parameter-to-determine-behavior, https://medium.com/@amlcurran/clean-code-the-curse-of-a-boolean-parameter-c237a830b7a3 – bic Feb 06 '19 at 10:56
  • There are some examples from the .net framework that use this pattern, specifically `Type.GetType` has an overload with this kind of flag. See: https://docs.microsoft.com/en-us/dotnet/api/system.type.gettype?view=netframework-4.7.2#System_Type_GetType_System_String_System_Boolean_ – Bradley Uffner Feb 06 '19 at 13:17
  • 1
    In addition to the Try-Parse comment from Peter, here is a nice article about Vexing exceptions: https://blogs.msdn.microsoft.com/ericlippert/2008/09/10/vexing-exceptions/ – Linaith Feb 06 '19 at 15:17
  • 2
    "Thus we wouldn't want to throw an exception (and not handle it?) and have it take down the program when for the user its working fine" - you know you can catch exceptions right? – user253751 Feb 06 '19 at 22:57
  • 1
I'm pretty sure this has been covered before somewhere else, but given the simple Area example I'd be more inclined to wonder where those negative numbers would be coming from and whether you might be able to handle that error condition somewhere else (e.g., whatever was reading the file containing length and width); however, the "we are trying to use a network device that can not be present at the time" point could merit a totally different answer. Is this a third-party API or something industry standard like TCP/UDP? – jrh Feb 07 '19 at 00:44
  • 1
    "I suspect that it is because if this component fails, the program can still operate and do its main task." It's an admirable idea in theory, occasionally it can help (a couple of times I got bitten by buggy framework classes that threw undocumented exceptions that amounted to nothing more than "resource temporarily unavailable"). It's not a sure thing though; e.g., if your program responds to (crucial) requests then updates a (non-crucial) UI, misbehaving UI objects may just end up throwing an `AccessViolationException` (legitimate or not) and end up crashing your program anyway. – jrh Feb 07 '19 at 00:51
  • 1
@jrh This is a Microsoft API used to reset a network card – Nicolas Feb 07 '19 at 16:08
  • In my opinion, while sentinel variables might seem like a good design pattern for embedded systems, they're ill suited to an OOP language that includes exception handling natively. Wouldn't it just make more sense for the caller to catch these exceptions and handle them appropriately? This approach is just circumventing this process, especially since you're probably checking for these error conditions on the caller-side as well. – ajxs Feb 08 '19 at 03:53
  • I'm really confused, how does a failed network card affect how the area of something is calculated? – JacquesB Feb 08 '19 at 16:03
  • 1
    @JacquesB I agree; unfortunately the OP's example is, in my opinion, far too simple, which makes it very hard for me to give advice (which is why I didn't answer this). Interfacing with hardware from a high level OS like Windows can indeed lead to some.... well..... interesting designs to say the least. It would not surprise me at all if code that attempted to work with very low level components (either using vendor specific code or the mess that is WMI) would require some compromises and ugly guesswork that make the `Area` example almost completely unrelated to the task at hand. – jrh Feb 08 '19 at 16:44
  • ... for example even attempting to *retrieve information* that you think would be standard, like identifying what slot a card is plugged into, or something like a serial number of a card can be an exercise in morphing standards, navigating through placeholder / dummy vendor information, filtering out blank or false information, or just having function calls outright fail unpredictably because you asked for a `SerialNumber` when the vendor only lists a `Serial Number`, or because it's now listed as `Tim's Real Cool Network Card (TM)` instead of `Tim's Network Card` – jrh Feb 08 '19 at 16:46
  • 1
    @Nicolas: If I understand correctly, x or y is negative if the network card fails to reset? I assume negative values will indicate an error code? In that case the issue is not in the code at hand, but in the code which passes the error codes as if it was legitimate values. – JacquesB Feb 08 '19 at 17:28
  • 1
@JacquesB Sorry, the example I gave was perhaps too simple. I meant that as an example of having the flag to indicate if it should throw exceptions. In my case it returns a boolean: True if it was able to reset the card, False if it couldn't. – Nicolas Feb 08 '19 at 18:48
@Nicolas Ah... well... APIs that return true or false on failure aren't really all that uncommon (that's similar in design to APIs that return error codes, and pretty much the whole computing world is still built on APIs that return error codes; e.g., winapi and a lot of unix family functions). I think some of the, uh, strong responses to this post were because returning 0 area kinda sounds like returning a default value for something on calculation failure, which can be bad. Returning status info is a lot different than doing `int divide(x,y){ if(x==0) return -1 else return y/x}` – jrh Feb 08 '19 at 19:17
  • @Nicolas: So I assume the exception will be thrown instead of returning 'false' if the `shouldThrowExceptions` flag is set? Under what circumstances will the flag be set, then? – JacquesB Feb 09 '19 at 17:39
  • 1
    Note that fault tolerance is _far more complex_ than just "does not throw an exception". In particular, the big concern is whether a resource is still in a valid state; throwing exceptions is usually a form of early return, usually via some different flows, so the novice/uncareful programmer can leave programs in an unstable state much easier. It's not cleaning up after yourself - which can happen even in code that throws no exceptions - that's the primary problem, because potentially any referenceable, mutable object might be compromised, which may comprise the entire program. – Clockwork-Muse Feb 09 '19 at 18:22
  • 1
My two cents: look into Option and Result types (F# ones, but available or easily implementable also in C#) to build an exception-free API (with Result, you can throw the unthrown exception if you really want). – Astrinus Feb 11 '19 at 07:53
  • @Astrinus sounds like a better solution. I'll look into it. Thank you. – Nicolas Feb 11 '19 at 15:15
  • @Nicolas: You still have not indicated under what circumstances the flag would be set. This is critical for understanding you question. At best people are just guessing at what the intent of the code is. – JacquesB Feb 13 '19 at 12:39
  • 1
@JacquesB I am not sure under what circumstances the flag would be set. The flag is for a new portion of code which isn't being called from anywhere yet, and the developer has yet to answer my questions for the code review – Nicolas Feb 13 '19 at 17:58
  • 1
@Nicolas: Fair enough, but without understanding the intention and purpose of the code, it is impossible to answer. E.g. since the flag is for new code, it might be a backwards-compatible migration of an existing exception-less system into an exception-based system. This *might* be a totally sensible approach depending on the constraints of the existing system. If your own organization cannot explain the context of the code, certainly random people on the internet will not be able to either. (Most of the answers here just ignore the flag, since they don't understand the purpose either.) – JacquesB Feb 14 '19 at 13:23

12 Answers

73

The problem with this approach is that while exceptions never get thrown (and thus, the application never crashes due to uncaught exceptions), the results returned are not necessarily correct, and the user may never know that there is a problem with the data (or what that problem is and how to correct it).

In order for the results to be correct and meaningful, the calling method has to check the result for special numbers - i.e., specific return values used to denote problems that came up while executing the method. Negative (or zero) numbers being returned for positive-definite quantities (like area) are a prime example of this in older code. If the calling method doesn't know (or forgets!) to check for these special numbers, though, processing can continue without ever realizing a mistake. Data then gets displayed to the user showing an area of 0, which the user knows is incorrect, but they have no indication of what went wrong, where, or why. They then wonder if any of the other values are wrong...
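To make that concrete, here is a minimal sketch (my illustration, not code from the question) of how a sentinel value slips through unnoticed:

using System;

class SentinelDemo
{
    // Sentinel style: 0 doubles as both "error" and a legitimate area.
    static int CalculateArea(int x, int y) => (x < 0 || y < 0) ? 0 : x * y;

    static void Main()
    {
        // The caller forgets (or never knew) to treat 0 as an error...
        int total = CalculateArea(3, 4) + CalculateArea(-2, 5);
        Console.WriteLine(total); // prints 12: silently wrong, with no hint of a problem
    }
}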

If the exception was thrown, processing would stop, the error would (ideally) be logged and the user may be notified in some way. The user can then fix whatever is wrong and try again. Proper exception handling (and testing!) will ensure that critical applications do not crash or otherwise wind up in an invalid state.

mmathis
  • 5,398
  • 23
  • 33
  • 1
@Quirk It's impressive how Chen managed to violate the Single Responsibility Principle in only 3 or 4 lines. That's the real problem. Plus, the problem he's talking about (the programmer failing to think about the consequences of errors on each line) is *always* a possibility with unchecked exceptions and only *sometimes* a possibility with checked exceptions. I think I've seen all the arguments ever made against checked exceptions, and not a single one of them is valid. – StackOverthrow Feb 08 '19 at 20:06
@TKK personally there are some cases I've run into where I would have really liked checked exceptions in .NET. It would be nice if there were some high-end static analysis tools that could make sure what an API documents as its thrown exceptions is accurate, though that would likely be pretty much impossible, especially when accessing native resources. – jrh Feb 11 '19 at 19:00
  • 1
    @jrh Yes, it would be nice if something kludged some exception safety into .NET, similar to how TypeScript kludges type safety into JS. – StackOverthrow Feb 11 '19 at 19:05
47

Is this a good way of handling exceptions?

No, I think this is pretty bad practice.  Throwing an exception vs. returning a value is a fundamental change in the API, changing the method's signature, and making the method behave quite differently from an interface perspective.

In general, when we design classes and their APIs, we should consider that

  1. there may be multiple instances of the class with different configurations floating around in the same program at the same time, and,

  2. due to dependency injection and any number of other programming practices, one consuming client may create the objects and hand them to another to use — so often we have a separation between object creators and object users.

Consider now what the method caller has to do to make use of an instance s/he's been handed, e.g. for calling the calculating method: the caller would have to both check for the area being zero and catch exceptions — ouch!  Testing considerations apply not just to the class itself, but to the callers' error handling as well...

We should always be making things as easy as possible for the consuming client; this boolean configuration in the constructor that changes the API of an instance method is the opposite of making the consuming client programmer (maybe you or your colleague) fall into the pit of success.

To offer both APIs, it is much better (and more conventional) to either provide two different classes (one that always throws on error, one that always returns 0 on error), or to provide two different methods on a single class.  This way the consuming client can easily know exactly how to check for and handle errors.
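As a rough sketch of the two-method variant (the Try-prefixed name follows the .NET Try-pattern convention; the exact signatures are my assumption, not code from the question):

using System;

public class AreaCalculator
{
    // Throwing variant: for callers who treat bad input as a bug.
    public int CalculateArea(int x, int y)
    {
        if (x < 0 || y < 0)
            throw new ArgumentOutOfRangeException(nameof(x), "side lengths must be non-negative");
        return x * y;
    }

    // Non-throwing variant: for callers who expect bad input and handle it inline.
    public bool TryCalculateArea(int x, int y, out int area)
    {
        area = 0;
        if (x < 0 || y < 0)
            return false;
        area = x * y;
        return true;
    }
}

Each caller now picks its error contract explicitly at the call site, rather than depending on how the instance happened to be constructed.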

Using two different classes or two different methods, you can use the IDE's find-usages and refactoring features much more easily, since the two use cases are no longer conflated.  Code reading, writing, maintenance, reviews, and testing are simpler as well.


On another note, I personally feel that we should not take boolean configuration parameters, where the actual callers all simply pass a constant.  Such configuration parameterization conflates two separate use cases for no real benefit.

Take a look at your code base and see if a variable (or non-constant expression) is ever used for the boolean configuration parameter in the constructor!  I doubt it.


A further consideration is to ask why computing the area can fail.  It might be best to throw in the constructor, if the calculation cannot be made.  However, if you don't know whether the calculation can be made until the object is further initialized, then perhaps consider using different classes to differentiate those states (not ready to compute area vs. ready to compute area).

I read that your failure situation is oriented toward remoting, so may not apply; just some food for thought.


Shouldn't it be the caller's responsibility to determine how to continue?

Yes, I agree.  It seems premature for the callee to decide that an area of 0 is the right answer under error conditions (especially since 0 is a valid area, so there is no way to tell the difference between an error and an actual 0; though that may not apply to your app).

Erik Eidt
  • 33,282
  • 5
  • 57
  • 91
  • You don't really have to check for the exception at all because you have to check the arguments *before* calling the method. Checking the result against zero doesn't distinguish between the legal arguments 0, 0 and illegal negative ones. The API is really horrible IMHO. – BlackJack Feb 06 '19 at 15:45
The Annex K that MS pushed for C11, and C++ iostreams, are examples of APIs where a hook or flag radically changes the reaction to failures. – Deduplicator Feb 06 '19 at 18:20
37

They have worked on critical applications dealing with aviation where the system could not go down. As a result ...

That is an interesting introduction, which gives me the impression that the motivation behind this design is to avoid throwing exceptions in some contexts "because the system could go down" otherwise. But if the system "can go down because of an exception", that is a clear indication that

  • exceptions are not handled properly, or at least not rigorously.

So if the program which uses the AreaCalculator is buggy, your colleague prefers not to have the program "crash early", but to return some wrong value (hoping no one notices it, or hoping no one does something important with it). That is actually masking an error, and in my experience it will sooner or later lead to follow-up bugs for which it becomes difficult to find the root cause.

IMHO writing a program that does not crash under any circumstances but shows wrong data or calculation results is usually in no way better than letting the program crash. The only right approach is to give the caller a chance to notice the error, deal with it, and let him decide whether the user has to be informed about the wrong behaviour, whether it is safe to continue any processing, or whether it is safer to stop the program completely. Thus, I would recommend one of the following:

  • make it hard to overlook the fact that a function can throw an exception. Documentation and coding standards are your friends here, and regular code reviews should support the correct usage of components and proper exception handling.

  • train the team to expect and deal with exceptions when they use "black box" components, and have the global behaviour of the program in mind.

  • if for some reason you think you cannot get the calling code (or the devs who write it) to use exception handling properly, then, as a last resort, you could design an API with explicit error output variables and no exceptions at all, like

    CalculateArea(int x, int y, out ErrorCode err)
    

    so it gets really hard for the caller to overlook that the function could fail; a minimal sketch follows below. But this is IMHO extremely ugly in C#; it is an old defensive programming technique from C, where there are no exceptions, and it should normally not be necessary to work that way nowadays.
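    A minimal sketch of how that could look from both sides (ErrorCode and its members are my assumptions for illustration):

    public enum ErrorCode { None, NegativeDimensions }

    public static class AreaApi
    {
        public static int CalculateArea(int x, int y, out ErrorCode err)
        {
            if (x < 0 || y < 0)
            {
                err = ErrorCode.NegativeDimensions;
                return 0; // the value is only meaningful when err == ErrorCode.None
            }
            err = ErrorCode.None;
            return x * y;
        }

        public static void Example()
        {
            // The out parameter makes the possibility of failure hard to overlook.
            int area = CalculateArea(3, -4, out ErrorCode err);
            if (err != ErrorCode.None)
            {
                // decide how to continue: log, substitute a value, or stop
            }
        }
    }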

Doc Brown
  • 199,015
  • 33
  • 367
  • 565
  • 3
    "writing a program that does not crash under any circumstances but shows wrong data or calculation results is usually in no way better than letting the program crash" I fully agree in general although I could imagine that in aviation I would probably prefer to have the plane still going with the instruments showing wrong values in comparison to have a shutdown of the airplane computer. For all less critical applications it's definitely better to not mask errors. – NoDataDumpNoContribution Feb 06 '19 at 11:35
  • 18
@Trilarion: if a program for a flight computer does not contain proper exception handling, "fixing this" by making components not throw exceptions is a very misguided approach. If the program crashes, there should be some redundant backup system which can take over. If the program does not crash and shows a wrong height, for example, the pilots may think "everything is fine" whilst the airplane rushes into the next mountain. – Doc Brown Feb 06 '19 at 11:56
  • "...making components not throw exceptions is a very misguided approach" I guess one could kind of empirically investigate the misguidedness by analyzing historic air plane crashes and the software which was used. It may give complex results though. If I'm in an airplane that's about to crash it may not help me much to say that there should be some redundant backup system if there isn't. – NoDataDumpNoContribution Feb 06 '19 at 12:34
  • 7
@Trilarion: if the flight computer shows a wrong height and the airplane will crash because of this, it won't help you either (especially when the backup system is there and does not get informed it needs to take over). Backup systems for airplane computers are not a new idea; google for "airplane computer backup systems". I am pretty sure engineers all over the world have always built redundant systems into any real life-critical system (even if just to avoid losing insurance). – Doc Brown Feb 06 '19 at 13:40
  • 4
    **This.** If you can't afford for the program to crash, you can't afford to have it silently give wrong answers either. The correct answer is to have appropriate exception handling in *all* cases. For a website, that means a global handler that converts unexpected errors into 500. You might also have additional handlers for more specific situations, like having a `try`/`catch` inside a loop if you need processing to continue if one element fails. – jpmc26 Feb 07 '19 at 18:03
  • 2
    Getting the wrong result is always the worst kind of fail; that reminds me of the rule about optimisation: "get it correct before you optimize, because getting the wrong answer faster still benefits nobody." – Toby Speight Feb 08 '19 at 14:55
  • @DocBrown I'd call a redundant backup system on a plane "a pilot". Although, it depends on what the system is doing. A malfunctioning autopilot that merrily leads the entire plane to its doom is *bad* and handing over control to the pilots is to be expected, for example. – VLAZ Feb 09 '19 at 14:03
  • @VLAZ: it is common for modern airliners today to have [fly-by-wire](https://en.wikipedia.org/wiki/Fly-by-wire) systems. The Wikipedia article also mentions their safety / redundancy measures. A pilot is just another "redundant system", if you like, but no full replacement for other such systems. – Doc Brown Feb 09 '19 at 14:35
  • 1
I have a bit of a disagreement here. The problem comes down to: what is the cost of the program going down? The first launch of the Ariane 5 comes to mind. The cost of going down was the destruction of the booster. The cost of returning a wrong answer would have been zero. (The program that crashed had actually become moot the instant the engines were lit.) You only throw the exception if the situation permits dealing with the result. If you're in a non-abortable situation and have no human in a position to deal with the crash, don't crash no matter what. – Loren Pechtel Feb 10 '19 at 03:04
@LorenPechtel: reading a little bit about [the history of this bug](https://blog.bugsnag.com/bug-day-ariane-5-disaster/), it seems this was just coincidental, and the behaviour could also have been the opposite. Note the article mentions a backup system, which was also faulty. – Doc Brown Feb 10 '19 at 06:53
.. moreover, in that system, the article mentions this "BH" variable which *"was mistakenly interpreted as actual flight data"* - so instead of signalling this failure immediately at the place where it occurred, the system took a wrong calculation result for granted, which finally led to the destruction of the launcher. – Doc Brown Feb 10 '19 at 07:02
  • 1
    @DocBrown The backup system ran the same code as the original and thus suffered the same crash when faced with an out of range sensor value causing a division by zero. – Loren Pechtel Feb 10 '19 at 15:46
  • @LorenPechtel: not sure about the "division by zero" error (there is nothing mentioned like this in the article I linked above, so if you have a better reference, why don't you share it?), but a backup system with duplicate software can fail by using a wrong calculation result or by not catching an exception - avoiding exceptions does in no way improve the situation. – Doc Brown Feb 10 '19 at 15:57
There is potentially a slightly different set of rules here than for your standard x86 8GB RAM affair. If exception handling allocates memory etc., that may be fine in most cases, but not always. I don't think you are wrong that a system can raise exceptions and be safe, but we are assuming a lot here. I would argue there _might_ be good reason for this sort of thing. – drjpizzle Feb 20 '19 at 23:55
To weigh in on the Ariane 5 launch failure: if I understand correctly, the wrapping of the relevant value was expected. I don't think it's fair to say it wasn't used, or that there would have been no implications had the wrap not triggered an exception; the system was expected to cope with this, as it was not the 'main' calculation of this quantity. – drjpizzle Feb 21 '19 at 00:09
13

Writing unit tests becomes slightly more complex since you have to test for the exception flag each time.

Any function with n parameters is going to be harder to test than one with n-1 parameters. Extend that out to the absurd and the argument becomes that functions shouldn't have parameters at all because that makes them easiest to test.

While it's a great idea to write code that's easy to test, it's a terrible idea to put test simplicity above writing code useful to the people who have to call it. If the example in the question has a switch that determines whether or not an exception is thrown, it's possible that the number of callers wanting that behavior merited adding it to the function. Where the line between complex and too complex lies is a judgment call; anyone trying to tell you there's a bright line that applies in all situations should be eyed with suspicion.

Also, if something goes wrong, wouldn't you want to know right away?

That depends on your definition of wrong. The example in the question defines wrong as "given a dimension of less than zero and shouldThrowExceptions is true." Being given a dimension of less than zero isn't wrong when shouldThrowExceptions is false, because the switch induces different behavior. That is, quite simply, not an exceptional situation.

The real problem here is that the switch was poorly named because it isn't descriptive of what it makes the function do. Had it been given a better name like treatInvalidDimensionsAsZero, would you have asked this question?

Shouldn't it be the caller's responsibility to determine how to continue?

The caller does determine how to continue. In this case, it does so ahead of time by setting or clearing shouldThrowExceptions and the function behaves according to its state.

The example is a pathologically-simple one because it does a single calculation and returns. If you make it slightly more complex, such as calculating the sum of the square roots of a list of numbers, throwing exceptions can give callers problems they can't resolve. If I pass in a list of [5, 6, -1, 8, 12] and the function throws an exception over the -1, I have no way to tell the function to keep going because it will have already aborted and thrown away the sum. If the list is an enormous data set, generating a copy without any negative numbers before calling the function might be impractical, so I'm forced to say in advance how invalid numbers should be treated, either in the form of a "just ignore them" switch or maybe provide a lambda that gets called to make that decision.
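A sketch of that last idea, with the caller supplying the decision as a lambda (the names are mine, not from the question):

using System;
using System.Collections.Generic;

static class RootSummer
{
    // The caller decides what an invalid element means: skip it, substitute, or throw.
    public static double SumOfRoots(IEnumerable<int> values, Func<int, double> onInvalid)
    {
        double sum = 0;
        foreach (int v in values)
            sum += v < 0 ? onInvalid(v) : Math.Sqrt(v);
        return sum;
    }

    static void Main()
    {
        // "Just ignore them": invalid entries contribute nothing.
        double lenient = SumOfRoots(new[] { 5, 6, -1, 8, 12 }, _ => 0.0);

        // Or fail fast instead: this call throws when it reaches -1.
        double strict = SumOfRoots(new[] { 5, 6, -1, 8, 12 },
            v => throw new ArgumentException($"negative value {v}"));
    }
}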

His logic/reasoning is that our program needs to do 1 thing, show data to user. Any other exception that doesn't stop us from doing so should be ignored. I agree they shouldn't be ignored, but should bubble up and be handled by the appropriate person, and not have to deal with flags for that.

Again, there is no one-size-fits-all solution. In the example, the function was, presumably, written to a spec that says how to deal with negative dimensions. The last thing you want to be doing is lowering the signal-to-noise ratio of your logs by filling them with messages that say "normally, an exception would be thrown here, but the caller said not to bother."

And as one of those much-older programmers, I would ask that you kindly depart from my lawn. ;-)

Blrfl
  • 20,235
  • 2
  • 49
  • 75
I agree that naming and intent are extremely important, and in this case a proper parameter name can really turn the tables, so +1, but 1. the `our program needs to do 1 thing, show data to user. Any other exception that doesn't stop us from doing so should be ignored` mindset may lead to the user making decisions based on wrong data (because actually the program needs to do one thing: help the user make informed decisions) 2. similar cases like `bool ExecuteJob(bool throwOnError = false)` are usually extremely error-prone and lead to code that is difficult to reason about just by reading it. – Eugene Podskal Feb 07 '19 at 20:35
@EugenePodskal I think the assumption is that "show data" means "show correct data." The questioner doesn't say the finished product doesn't work, just that it might be written "wrong." I'd have to see some hard data on the second point. I have a handful of heavily-used functions in my current project that have a throw/no-throw switch and are no harder to reason about than any other function, but that's one data point. – Blrfl Feb 10 '19 at 11:07
Good answer, I think this logic applies to a much wider range of circumstances than just the OP's. BTW, C++'s `new` has throw and no-throw versions for exactly these reasons. I know there are some small differences but... – drjpizzle Feb 21 '19 at 00:18
8

Safety-critical and 'normal' code can lead to very different ideas of what 'good practice' looks like. There's a lot of overlap - some stuff is risky and should be avoided in both - but there are still significant differences. If you add a requirement to be guaranteed responsive, these deviations get quite substantial.

These often relate to things you'd expect:

  • For git, the wrong answer could be very bad relative to taking too long, aborting, hanging, or even crashing (which are effectively non-issues relative to, say, altering checked-in code accidentally).

    However: for an instrument panel, having a g-force calculation stall and prevent an air-speed calculation from being made may be unacceptable.

Some are less obvious:

  • If you have tested a lot, first order results (like right answers) are not as big a worry relatively speaking. You know your testing will have covered this. However if there was hidden state or control flow, you don't know this won't be the cause of something a lot more subtle. This is hard to rule out with testing.

  • Being demonstrably safe is relatively important. Not many customers will sit down to reason about whether the source they are buying is safe or not. If you are in the aviation market on the other-hand...

How does this apply to your example:

I don't know. There are a number of thought processes that might have led to rules like "no one throws in production code" being adopted in safety-critical code that would be pretty silly in more usual situations.

Some will relate to being embedded, some to safety, and maybe others... Some are good (tight performance/memory bounds were needed), some are bad (we don't handle exceptions properly, so best not to risk it). Most of the time, even knowing why they did it won't really answer the question. For example, if it has more to do with the ease of auditing the code than actually making it better, is it good practice? You really can't tell. They are different animals, and need treating differently.

All of that said, it looks a tad suspect to me BUT:

Safety-critical software and software design decisions probably shouldn't be made by strangers on Software Engineering Stack Exchange. There may well be a good reason to do this even if it's part of a bad system. Don't read too much into any of this, other than as "food for thought".

drjpizzle
  • 264
  • 2
  • 6
7

Sometimes throwing an exception is not the best method, not least due to stack unwinding, but sometimes because catching an exception is problematic, particularly along language or interface seams.

A good way to handle this is to return an enriched data-type. This data type has enough state to describe all of the happy paths, and all of the unhappy paths. The point is, if you interact with this function (member/global/otherwise) you will be forced to handle the outcome.

That being said, this enriched data type should not force action. Imagine in your area example something like var area_calc = new AreaCalculator(); var volume = area_calc.CalculateArea(x, y) * z;. It seems useful: volume should contain the area multiplied by depth - that could be a cube, cylinder, etc...

But what if the area_calc service was down? Then area_calc.CalculateArea(x, y) returns a rich datatype containing an error. Is it legal to multiply that by z? It's a good question. You could force users to handle the checking immediately. This does, however, break up the logic with error handling.

var area_calc = new AreaCalculator();
var area_result = area_calc.CalculateArea(x, y);
if (area_result.bad())
{
    //handle unhappy path
}
var volume = area_result.value() * z;

vs

var area_calc = new AreaCalculator();
var volume = area_calc.CalculateArea(x, y) * z;
if (volume.bad())
{
    //handle unhappy path
}

The essential logic is spread over two lines and divided by error handling in the first case, while the second case has all the relevant logic on one line, followed by error handling.

In the second case volume is a rich data type. It's not just a number. This makes storage larger, and volume will still need to be inspected for an error condition. Additionally, volume might feed other calculations before the user chooses to handle the error, allowing it to manifest in several disparate locations. This might be good or bad, depending on the specifics of the situation.

Alternately, volume could be just a plain data type: just a number. But then what happens to the error condition? It could be that the value implicitly converts if it is in a happy condition. Should it be in an unhappy condition, it might return a default/error value (for area, 0 or -1 might seem reasonable). Alternately, it could throw an exception on this side of the interface/language boundary.

... foo() {
   var area_calc = new AreaCalculator();
   return area_calc.CalculateArea(x, y) * z;
}
var volume = foo();
if (volume <= 0)
{
    //handle error
}

vs.

... foo() {
   var area_calc = new AreaCalculator();
   return area_calc.CalculateArea(x, y) * z;
}

try { var volume = foo(); }
catch(...)
{
    //handle error
}

By passing out a bad, or possibly bad value, it places a lot of onus on the user to validate the data. This is a source of bugs, because as far as the compiler is concerned the return value is a legitimate integer. If something was not checked you'll discover it when things go wrong. The second case mixes the best of both worlds by allowing exceptions to handle unhappy paths, while happy paths follow normal processing. Unfortunately it does force the user to handle exceptions wisely, which is hard.

Just to be clear: an unhappy path is a case unknown to the business logic (the domain of exceptions); failing to validate is a happy path, because you know how to handle that by the business rules (the domain of rules).

The ultimate solution would be one which allows all scenarios (within reason).

  • The user should be able to query for a bad condition, and handle it immediately
  • The user should be able to operate on the enriched type as if the happy path had been followed and propagate the error details.
  • The user should be able to extract the happy path value through casting (implicit/explicit as is reasonable), generating an exception for unhappy paths.
  • The user should be able to extract the happy path value, or use a default (supplied or not)

Something like:

Rich::value_type value_or_default(Rich&, Rich::value_type default_value = ...);
bool bad(Rich&);
...unhappy path report... bad_state(Rich&);
Rich& assert_not_bad(Rich&);
class Rich
{
public:
   typedef ... value_type;

   operator value_type() { assert_not_bad(*this); return ...value...; }
   operator X(...) { if (bad(*this)) return ...propagate badness to new value...; /*operate and generate new value*/; }
}

//check
if (bad(x))
{
    var report = bad_state(x);
    //handle error
}

//rethrow
assert_not_bad(x);
var result = (assert_not_bad(x) + 23) / 45;

//propagate
var y = x * 23;

//implicit throw
Rich::value_type val = x;
var val = ((Rich::value_type)x) + 34;
var val2 = static_cast<Rich::value_type>(x) % 3;

//default value
var defaulted = value_or_default(x);
var defaulted_to = value_or_default(x, 55);
Kain0_0
  • 15,888
  • 16
  • 37
  • @TobySpeight Fair enough, these things are context sensitive and have their range of scope. – Kain0_0 Feb 10 '19 at 10:42
I think the problem here is the 'assert_not_bad' blocks. I think these will end up in the same place as what the original code was trying to solve. In testing these need to be noticed; however, if they really are asserts, they should be stripped before production on a real aircraft. Otherwise, some great points. – drjpizzle Feb 20 '19 at 23:49
  • @drjpizzle I would argue that if it was important enough to add a guard for testing, it is important enough to leave the guard in place when running in production. The presence of the guard itself implies doubt. If you doubt the code enough to guard it during testing, you are doubting it for a technical reason. i.e. The condition could/does realistically occur. Executing tests do not prove that the condition will never be reached in production. That means that there is a known condition, that might occur, that needs to be handled somewhere somehow. I think how it is handled, is the problem. – Kain0_0 Feb 21 '19 at 05:40
3

I'll answer from the point of view of C++. I'm pretty sure all the core concepts are transferable to C#.

It sounds like your preferred style is "always throw exceptions":

int CalculateArea(int x, int y) {
    if (x < 0 || y < 0) {
        throw Exception("negative side lengths");
    }
    return x * y;
}

This can be a problem for C++ code because exception-handling is heavy — it makes the failure case run slowly, and makes the failure case allocate memory (which sometimes isn't even available), and generally makes things less predictable. The heavyweightness of EH is one reason you hear people saying things like "Don't use exceptions for control flow."

So some libraries (such as <filesystem>) use what C++ calls a "dual API," or what C# calls the Try-Parse pattern (thanks Peter for the tip!)

int CalculateArea(int x, int y) {
    if (x < 0 || y < 0) {
        throw Exception("negative side lengths");
    }
    return x * y;
}

bool TryCalculateArea(int x, int y, int& result) {
    if (x < 0 || y < 0) {
        return false;
    }
    result = x * y;
    return true;
}

int a1 = CalculateArea(x, y);
int a2;
if (TryCalculateArea(x, y, a2)) {
    // use a2
}

You can see the problem with "dual APIs" right away: lots of code duplication, no guidance for users as to which API is the "right" one to use, and the user must make a hard choice between useful error messages (CalculateArea) and speed (TryCalculateArea) because the faster version takes our useful "negative side lengths" exception and flattens it down into a useless false — "something went wrong, don't ask me what or where." (Some dual APIs use a more expressive error type, such as int errno or C++'s std::error_code, but that still doesn't tell you where the error occurred — just that it did occur somewhere.)

If you can't decide how your code should behave, you can always kick the decision up to the caller!

template<class F>
int CalculateArea(int x, int y, F errorCallback) {
    if (x < 0 || y < 0) {
        return errorCallback(x, y, "negative side lengths");
    }
    return x * y;
}

int a1 = CalculateArea(x, y, [](auto...) { return 0; });
int a2 = CalculateArea(x, y, [](int, int, auto msg) -> int { throw Exception(msg); });
int a3 = CalculateArea(x, y, [](int w, int h, auto) { return w * h; });

This is essentially what your coworker is doing; except that he's factoring out the "error handler" into a global variable:

std::function<int(const char *)> g_errorCallback;

int CalculateArea(int x, int y) {
    if (x < 0 || y < 0) {
        return g_errorCallback("negative side lengths");
    }
    return x * y;
}

g_errorCallback = [](auto) { return 0; };
int a1 = CalculateArea(x, y);
g_errorCallback = [](const char *msg) -> int { throw Exception(msg); };
int a2 = CalculateArea(x, y);

Moving important parameters from explicit function parameters into global state is almost always a bad idea. I do not recommend it. (The fact that it's not global state in your case but simply instance-wide member state mitigates the badness a little bit, but not much.)

Furthermore, your coworker is unnecessarily limiting the number of possible error handling behaviors. Rather than permit any error-handling lambda, he's decided on just two:

bool g_errorViaException;

int CalculateArea(int x, int y) {
    if (x < 0 || y < 0) {
        return g_errorViaException ? throw Exception("negative side lengths") : 0;
    }
    return x * y;
}

g_errorViaException = false;
int a1 = CalculateArea(x, y);
g_errorViaException = true;
int a2 = CalculateArea(x, y);

This is probably the "sour spot" out of any of these possible strategies. You've taken all of the flexibility away from the end-user by forcing them to use one of your exactly two error-handling callbacks; and you've got all the problems of shared global state; and you're still paying for that conditional branch everywhere.

Finally, a common solution in C++ (or any language with conditional compilation) would be to force the user to make the decision for their entire program, globally, at compile time, so that the un-taken codepath can be optimized out entirely:

int CalculateArea(int x, int y) {
    if (x < 0 || y < 0) {
#ifdef NEXCEPTIONS
        return 0;
#else
        throw Exception("negative side lengths");
#endif
    }
    return x * y;
}

// Now these two function calls *must* have the same behavior,
// which is a nice property for a program to have.
// Improves understandability.
//
int a1 = CalculateArea(x, y);
int a2 = CalculateArea(x, y);

An example of something that works this way is the assert macro in C and C++, which conditions its behavior on the preprocessor macro NDEBUG.

Quuxplusone
  • 457
  • 3
  • 8
  • If one returns a `std::optional` from `TryCalculateArea()` instead, it is simple to unify the implementation of both parts of the dual interface in a single function-template with a compile-time-flag. – Deduplicator Feb 08 '19 at 15:43
  • @Deduplicator: Maybe with a `std::expected`. With just `std::optional`, unless I misunderstand your proposed solution, it would still suffer from what I said: _the user must make a hard choice between useful error messages and speed, because the faster version takes our useful `"negative side lengths"` exception and flattens it down into a useless `false` — "something went wrong, don't ask me what or where."_ – Quuxplusone Feb 08 '19 at 15:55
  • That's why [libc++ ](https://github.com/llvm-mirror/libcxx/blob/master/src/filesystem/operations.cpp) actually does something very close to OP's coworker's pattern: it pipes `std::error_code *ec` all the way down through every level of the API, and then at the bottom does the moral equivalent of `if (ec == nullptr) throw something; else *ec = some error code`. (It abstracts the actual `if` into something called `ErrorHandler`, but it's the same basic idea.) – Quuxplusone Feb 08 '19 at 15:57
  • Well that would be an option to keep the extended error-info without throwing. May be appropriate, or not worth the potential additional cost. – Deduplicator Feb 08 '19 at 16:01
  • 1
    So many good thoughts contained in this answer... Definitely needs more upvotes :-) – cmaster - reinstate monica Feb 10 '19 at 14:38
1

I feel it should be mentioned where your colleague got their pattern from.

Nowadays, C# has the TryGet pattern: public bool TryThing(out TResult result). This lets you get your result while still letting you know if that result is even a valid value. (For example, all int values are valid results for a Math.Sum(int, int), but if the value were to overflow, this particular result could be garbage.) This is a relatively new pattern though.
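The built-in int.TryParse shows the shape of the pattern:

using System;

class TryPatternDemo
{
    static void Main()
    {
        // The bool reports success; the out parameter carries the parsed value.
        if (int.TryParse("42", out int value))
            Console.WriteLine($"parsed {value}");
        else
            Console.WriteLine("not a valid int"); // no exception thrown; the caller decides
    }
}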

Before the out keyword, you either had to throw an exception (expensive, and the caller has to catch it or kill the whole program), create a special struct or class (before generics were really a thing) for each result, to represent the value and possible errors (time-consuming to make, and it bloats the software), or return a default "error" value (that might not actually have been an error).

The approach your colleague uses gives them the fail-early perk of exceptions while testing/debugging new features, while giving them the runtime safety and performance (performance was a critical issue ~30 years ago) of just returning a default error value. Now this is the pattern the software was written in, and the expected pattern moving forward, so it's natural to keep doing it this way even though there are better ways now. Most likely this pattern was inherited from the age of the software, or it is a pattern your colleagues just never grew out of (old habits are hard to break).

The other answers already cover why this is considered bad practice, so I will just end by recommending you read up on the TryGet pattern (and maybe also on encapsulation, regarding what promises an object should make to its caller).

david
  • 103
  • 2
Tezra
  • 238
  • 1
  • 10
  • Before the `out` keyword, you'd write a `bool` function that takes a pointer to the result, i.e. a `ref` parameter. You could do that in VB6 in 1998. The `out` keyword merely buys you compile-time certainty that the parameter is assigned when the function returns, that's all there is to it. It *is* a nice & useful pattern though. – Mathieu Guindon Feb 08 '19 at 00:34
@MathieuGuindon Yeah, but TryGet wasn't yet a well-known/established pattern, and even if it was, I'm not entirely sure it would have been used. After all, part of the lead-up to Y2K was that storing anything larger than 0-99 was unacceptable. – Tezra Feb 08 '19 at 13:16
0

This specific example has an interesting feature that may affect the rules...

public int CalculateArea(int x, int y)
{
    if (x < 0 || y < 0)
    {
        if (shouldThrowExceptions)
            throw new ArgumentException("x and y must be non-negative");
        else
            return 0;
    }
    // ...
}

What I see here is a precondition check. A failing precondition check implies a bug higher up in the call stack. Thus, the question becomes: is this code responsible for reporting bugs located elsewhere?

Some of the tension here is attributable to the fact that this interface exhibits primitive obsession -- x and y are presumably supposed to represent real measurements of length. In a programming context where domain-specific types are a reasonable choice, we would in effect move the precondition check closer to the source of the data -- in other words, kick the responsibility for data integrity further up the call stack, where we have a better sense of the context.
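A minimal sketch of that idea (the Length type is my illustration, not code from the question):

using System;

public readonly struct Length
{
    public int Value { get; }

    public Length(int value)
    {
        // The integrity check lives where the raw data enters the system.
        if (value < 0)
            throw new ArgumentOutOfRangeException(nameof(value), "a length cannot be negative");
        Value = value;
    }
}

public static class Geometry
{
    // No precondition check needed: a Length is non-negative by construction.
    public static int CalculateArea(Length x, Length y) => x.Value * y.Value;
}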

That said, I don't see anything fundamentally wrong with having two different strategies for managing a failed check. My preference would be to use composition to determine which strategy is in use; the feature flag would be used in the composition root, rather than in the implementation of the library method.

// Configurable dependencies
AreaCalculator(PreconditionFailureStrategy strategy)

CalculateArea(int x, int y)
{
    if (x < 0 || y < 0) {
        return this.strategy.fail(0);
    }
    // ...
}

They have worked on critical applications dealing with aviation where the system could not go down.

The National Transportation Safety Board is really good; I might suggest alternative implementation techniques to the graybeards, but I'm not inclined to argue with them about designing bulkheads into the error-reporting subsystem.

More broadly: what's the cost to the business? It's a lot cheaper to crash a web site than it is a life-critical system.

VoiceOfUnreason
  • 32,131
  • 2
  • 42
  • 79
0

There are times when you want to do his approach, but I would not consider them to be the "normal" situation. The key to determining which case you are in is:

His logic/reasoning is that our program needs to do 1 thing, show data to user. Any other exception that doesn't stop us from doing so should be ignored.

Check the requirements. If your requirements actually say that you have one job, which is to show data to the user, then he is right. However, in my experience, the majority of times the user also cares what data is shown. They want the correct data. Some systems do just want to fail quietly and let a user figure out that something went wrong, but I'd consider them the exception to the rule.

The key question I would ask after a failure is "Is the system in a state where the user's expectations and the software's invariants are valid?" If so, then by all means just return and keep going. Practically speaking this is not what happens in most programs.

As for the flag itself, an exceptions flag is typically considered a code smell, because a user needs to somehow know which mode the module is in in order to understand how the function operates. If it's in !shouldThrowExceptions mode, the user needs to know that they are responsible for detecting errors and maintaining expectations and invariants when they occur. They are also responsible right then and there, at the line where the function gets called. A flag like this is typically highly confusing.

However, it does happen. Consider that many processors permit changing the floating-point behavior within the program. A program that wishes to have more relaxed standards may do so simply by changing a register (which is effectively a flag). The trick is that you should be very cautious to avoid accidentally stepping on others' toes. Code will often check the current flag, set it to the desired setting, do its operations, then set it back. That way nobody gets surprised by the change.
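In code, that check-set-restore discipline looks roughly like this (FloatingPointMode is a made-up stand-in for a real control register, purely to show the shape):

// Hypothetical flag holder, standing in for a processor control register.
static class FloatingPointMode
{
    public static bool FlushDenormalsToZero { get; set; }
}

static class RelaxedMath
{
    public static double Compute(double input)
    {
        bool saved = FloatingPointMode.FlushDenormalsToZero; // remember the current mode
        FloatingPointMode.FlushDenormalsToZero = true;       // switch to the mode we need
        try
        {
            return input * 2.0; // ...operations that rely on the relaxed mode...
        }
        finally
        {
            FloatingPointMode.FlushDenormalsToZero = saved;  // restore, so nobody downstream is surprised
        }
    }
}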

Cort Ammon
  • 10,840
  • 3
  • 23
  • 32
-1

Methods either handle exceptions or they don't; there is no need for flags in languages like C#.

public int Method1()
{
    ...code

    return 0;
}

If something goes bad in ...code, then that exception will need to be handled by the caller. If no one handles the error, the program will terminate.

public int Method1()
{
    try
    {
        ...code
    }
    catch
    {
        ...handle error
    }
    return 0;
}

In this case, if something bad happens in ...code, Method1 is handling the problem and the program should proceed.

Where you handle exceptions is up to you. Certainly you can ignore them by catching them and doing nothing. But I would make sure you are only ignoring certain specific types of exceptions that you can expect to occur. Ignoring everything with catch (Exception ex) is dangerous, because there are some exceptions you do not want to ignore, like system exceptions regarding out of memory and such.
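For instance, a sketch that ignores only expected failure types (ResetNetworkCard is a stand-in; TimeoutException and SocketException are merely examples of exceptions one might expect here):

using System;
using System.Net.Sockets;

class SelectiveCatch
{
    // Stand-in for the real reset call; here it just simulates a transient failure.
    static void ResetNetworkCard() => throw new TimeoutException("device busy");

    static void Main()
    {
        try
        {
            ResetNetworkCard();
        }
        catch (TimeoutException)
        {
            // Expected now and then; safe to retry or continue.
        }
        catch (SocketException)
        {
            // Also expected when the device is absent; log and continue.
        }
        // Everything else (OutOfMemoryException, genuine bugs, ...) still bubbles up.
    }
}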

Jon Raynor
  • 10,905
  • 29
  • 47
  • 3
    The current setup that OP posted pertains to deciding whether to willingly throw an exception. OP's code does not lead to unwanted swallowing of things like out of memory exceptions. If anything, the assertion that exceptions crash the system implies that the code base **does not catch** exceptions and thus will not swallow any exceptions; both those that were and weren't thrown by OP's business logic intentionally. – Flater Feb 06 '19 at 09:14
-1

This approach breaks the "fail fast, fail hard" philosophy.

Why you want to fail fast:

  • The faster you fail, the nearer the visible symptom of the failure is to its actual cause. This makes debugging much easier - in the best case, you have the offending line right at the top of your stack trace.
  • The faster you fail (and catch the error appropriately), the less probable it is that you confuse the rest of your program.
  • The harder you fail (i.e., throw an exception instead of just returning a "-1" code or something like that), the more likely it is that the caller actually cares about the error, and does not just keep working with wrong values.

Drawbacks of not failing fast and hard:

  • If you avoid visible failures, i.e. pretend that everything is fine, you tend to make it incredibly hard to find the actual error. Imagine that the return value of your example is part of some routine which calculates the sum of 100 areas; i.e., calling that function 100 times and summing the return values. If you silently suppress the error, there is no way at all to find where the actual error occurs; and all following calculations will be silently wrong.
  • If you delay failing (by returning an impossible return value like "-1" for an area), you increase the likelihood that the caller of your function just does not bother about it and forgets to handle the error; even though they have the information about the failure at hand.

Finally, actual exception-based error handling has the benefit that you can provide an "out of band" error message or object, and you can easily hook in logging of errors, alerting, etc. without writing a single extra line in your domain code.
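For example, a single top-level handler is enough to hook logging and user notification into every failure in the call tree (a sketch; RefreshDisplay stands in for the program's real work):

using System;

class TopLevelHandler
{
    // Stand-in for the program's real work; any failure below surfaces here.
    static void RefreshDisplay() => throw new InvalidOperationException("stale IP address");

    static void Main()
    {
        try
        {
            RefreshDisplay();
        }
        catch (Exception ex)
        {
            // One place to log, alert, and show "error" instead of silently wrong data.
            Console.Error.WriteLine($"refresh failed: {ex.Message}");
        }
    }
}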

So, there are not only simple technical reasons, but also "system" reasons which make it very useful to fail fast.

At the end of the day, not failing hard and fast in our current times, where exception handling is lightweight and very stable, is just halfway criminal. I understand perfectly where the thinking that it's good to suppress exceptions is coming from, but it's just not applicable anymore.

Especially in your particular case, where you even give an option about whether to throw exceptions or not: this means that the caller has to decide anyways. So there is no drawback at all to have the caller catch the exception and handle it appropriately.

One point coming up in a comment:

It has been pointed out in other answers that failing fast and hard is not desirable in a critical application when the hardware you're running on is an airplane.

Failing fast and hard does not mean that your whole application crashes. It means that the code fails locally, at the point where the error occurs. In the OP's example, the low-level method calculating some area should not silently replace an error with a wrong value. It should fail clearly.

Some caller up the chain obviously has to catch that error/exception and handle it appropriately. If this method was used in an airplane, this should probably lead to some error LED lighting up, or at least to display "error calculating area" instead of a wrong area.

AnoE
  • 5,614
  • 1
  • 13
  • 17
  • 3
    It has been pointed out in other answers that failing fast and hard is not desirable in a critical application when the hardware you're running on is an airplane. – HAEM Feb 08 '19 at 14:57
  • 1
    @HAEM, then that is a misunderstanding on what failing fast and hard means. I have added a paragraph about this to the answer. – AnoE Feb 08 '19 at 17:14
  • Even if the intention is not to fail 'that hard', it's still seen as risky to be playing with that kind of fire. – drjpizzle Feb 18 '19 at 15:58
That's kind of the point, @drjpizzle. If you are used to failing hard and fast, it is not "risky" or "playing with that kind of fire" in my experience. Au contraire. It means you get used to thinking about "what would happen if I got an exception here", where "here" means *everywhere*, and you tend to be aware whether the spot where you are currently programming is going to have a major problem (plane crash, whatever) in that case. Playing with fire would be to expect everything to mostly go OK, and every component, in doubt, pretending that everything is fine... – AnoE Feb 18 '19 at 18:39