118

Rightly or wrongly, I'm currently of the belief that I should always try to make my code as robust as possible, even if this means adding redundant code / checks that I know won't be of any use right now, but might be some years down the line.

For example, I'm currently working on a mobile application that has this piece of code:

public static CalendarRow AssignAppointmentToRow(Appointment app, List<CalendarRow> rows)
{
    //1. Is rows equal to null? - This will be the case if this is the first appointment.
    if (rows == null) {
        rows = new List<CalendarRow> ();
    }

    //2. Is rows empty? - This will be the case if this is the first appointment / some other unknown reason.
    if(rows.Count == 0)
    {
        rows.Add (new CalendarRow (0));
        rows [0].Appointments.Add (app);
    }

    //blah...
}

Looking specifically at section two, I know that if section one is true, then section two will also be true. I can't think of any reason as to why section one would be false and section two evaluate true, which makes the second if statement redundant.

However, there may be a case in the future where this second if statement is actually needed, and for a known reason.

Some people may look at this initially and think I'm programming with the future in mind, which is obviously a good thing. But I know of a few instances where this kind of code has "hidden" bugs from me, meaning it's taken me even longer to figure out why function xyz is doing abc when it should actually be doing def.

On the other hand, there's also been numerous instances where this kind of code has made it much, much easier to enhance the code with new behaviours, because I don't have to go back and ensure that all the relevant checks are in place.

Are there any general rule-of-thumb guidelines for this kind of code? (I'd also be interested to hear whether this would be considered good or bad practice.)

N.B: This could be considered similar to this question, however unlike that question, I'd like an answer assuming there are no deadlines.

TLDR: Should I go so far as to add redundant code in order to make it potentially more robust in the future?

KidCode
  • 2,123
  • 4
  • 17
  • 13
  • 6
    Possible duplicate of [Design for future changes or solve the problem at hand](http://programmers.stackexchange.com/questions/59810/design-for-future-changes-or-solve-the-problem-at-hand) – gnat Mar 27 '16 at 17:55
  • 97
    In software development things that should never happen happen all the time. – Roman Reiner Mar 27 '16 at 17:56
  • @gnat - It's very similar, but I'm kind of targeting `checks` only here, not asking whether I should add in / change actual functionality just in case it might be needed in the future. – KidCode Mar 27 '16 at 17:58
  • 35
    If you know `if(rows.Count == 0)` will never happen, then you could raise an exception when it happens - and check then why your assumption became wrong. – knut Mar 27 '16 at 18:01
  • 9
    Irrelevant to the question, but I suspect your code has a bug. When rows is null a new list is created and (I am guessing) thrown away. But when rows is not null, an existing list is changed. A better design might be to insist that the client pass in a list that may or may not be empty. – Theodore Norvell Mar 27 '16 at 19:26
  • 2
    Sure, "... if section one is true, then section two will also be true." but what if section one is false? Then section two might be either true or false. – Daniel T. Mar 27 '16 at 19:52
  • 2
    I'd say that there's quite a difference between 'defensive code' (which is what it looks like you're doing here) versus 'future code'. But in any case, you might find this blog post interesting on the subject of future/YAGNI code: http://www.sebastiansylvan.com/post/the-perils-of-future-coding/ – Ben Cottrell Mar 27 '16 at 20:50
  • No. Please read about Clean Code. – Martin Schröder Mar 27 '16 at 22:04
  • You do realise that EVERY TIME you hit the first `if`, you'll also hit the second. Which means that you actually need the second one as well if your code depends on it. – Robert Koritnik Mar 28 '16 at 05:57
  • 9
    Why would `rows` ever be null? There's never a good reason, at least in .NET, for a collection to be null. *Empty*, sure, but not *null*. I would throw an exception if `rows` is null, because it means there's a lapse in logic by the caller. – Kyralessa Mar 28 '16 at 16:30
  • 2
    @Kyralessa The reasons for other variables to be null still apply to collections; it's "don't use null to represent an empty collection" not "never let collection references be null." – user253751 Mar 28 '16 at 20:19
  • That said, the code in question appears to be using null to represent an empty collection. – user253751 Mar 28 '16 at 20:19
  • A quick note: you always have deadlines. Even when there's no customer-set deadline, you have to account for opportunity costs - whatever you do, you could have been doing something else in the meantime. If that something else has more value, you're losing value by not doing it. And if you never ship... :) – Luaan Mar 29 '16 at 09:07
  • @Luaan that's the usual horribly wrong excuse for doing no tests. Time not spent now could result in more time required in the future. – Pierre Arlaud Mar 29 '16 at 09:14
  • 1
    Related talk in a PyCon: [Stop writing classes](https://www.youtube.com/watch?v=o9pEzgHorH0) – Davidmh Mar 29 '16 at 09:21
  • @PierreArlaud Yes, that's also a cost you have to account for - if you don't, that's just bad economy and your own fault. Plenty of products and companies died of either of these causes. But in the end, I'm pretty sure that most products are on the "horrible" side - if you go too far in ignoring good code, testing etc., you're going to have grumpy programmers, but it rarely puts you out of business, compared to the other "extreme". You **have** to ship. It's always a bet, you're just picking the stakes and hoping for the best. It's still a business, and you're still resource limited. – Luaan Mar 29 '16 at 11:54
  • @Luaan agreed, which is why TDD exists: build tests first, then the feature. You ensure that the feature is there, but that tests aren't omitted because of "lack of time". – Pierre Arlaud Mar 29 '16 at 11:58
  • 1
    @PierreArlaud Yup. It's more like a trick played on the management, though - it's very easy to do TDD with no benefit at all ("test count/coverage as a metric"), while producing more code that needs to be maintained. In the end, you still need great programmers that care, and you still need to keep making sure that your tests are correct and relevant. I'm not yet at the point where I'd consider 100% TDD worthwhile, though I keep watching its evangelists periodically; but I can certainly see the appeal if you're on a team where management doesn't communicate with developers (QA, SE, UX, ...). – Luaan Mar 29 '16 at 12:08
  • I find the logic in your function to be dubious. After section 2: if rows was null or empty, you now have rows with App already added before you proceed with the rest of the code. If rows was non-empty, you do not have App added, and you proceed with the same code. – Taemyr Mar 29 '16 at 13:59
  • @Kyralessa Imagine you have several persons you are tracking, and any one *can* have a calendar, but only a few of them will have a calendar. In such a scenario it can make sense to only initialize the calendar when it's needed. – Taemyr Mar 29 '16 at 14:04
  • 1
  • @KidCode: You actually answer the question yourself, by observing that this kind of code tends to hide bugs. Hiding bugs does not make the code more robust! Hiding a bug in the present because of a hypothetical future feature is like driving the car into a river because there are plans to build a bridge. – JacquesB Mar 29 '16 at 22:16
  • 3
    There's a balance between YAGNI and defensive coding. The artistry is in finding that balance. – CodeGnome Mar 30 '16 at 22:36
  • I don't like the word "redundant" in this context. I'd use (potentially) unnecessary. – Alex Mar 31 '16 at 09:38
  • 1
    Much to learn you have, young Padawan. *much, much easier to enhance the code with new behaviours, because I don't have to go back and ensure that all the relevant checks are in place.* ... less one thing. – radarbob Apr 01 '16 at 00:47
  • It's null when it's not initialized. Only initialized objects have properties like .Count() – BlueWizard Apr 04 '16 at 05:14
  • Unless and until the requirements change, it is redundant and useless noise. After all, currently it will never be empty - you can't predict future changes, so how could you possibly know why it would be empty? And if you don't know why it would be empty, how can you write code to handle it? – Kevin Apr 23 '16 at 16:06
  • @Taemyr, sure, but we're talking about a collection. Even if you were collecting the calendars and then passing them, you wouldn't pass a null to say "None of these people have calendars." If you passed anything, it'd be an empty collection. You also wouldn't pass a collection containing either a null or a calendar for each person. You'd just pass the calendars that exist. I've never seen a single good reason in .NET to pass a null instead of a collection, or to pass a collection that contains any nulls. – Kyralessa Jun 03 '16 at 19:47
  • @Kyralessa We are passing a List of CalendarRows. Not a list of Calendars. I agree that a class Calendar would be cleaner code, but that is not what OP is using. – Taemyr Jun 04 '16 at 15:32
  • @Taemyr, but the same principle holds true. Though looking more closely at the code, it looks problematic from other points of view. Since `rows` isn't passed with `ref`, initializing it if it's null won't have any effect on that argument when the method returns. The method should probably act on the thing containing the `List` rather than on that list itself. – Kyralessa Jun 04 '16 at 23:24

16 Answers

180

As an exercise, first let's verify your logic. Though as we'll see, you have bigger problems than any logical problem.

Call the first condition A and the second condition B.

You first say:

Looking specifically at section two, I know that if section one is true, then section two will also be true.

That is: A implies B, or, in more basic terms (NOT A) OR B

And then:

I can't think of any reason as to why section one would be false and section two evaluate true, which makes the second if statement redundant.

That is: NOT((NOT A) AND B). Apply De Morgan's law to get (NOT B) OR A, which is B implies A.

Therefore, if both your statements are true then A implies B and B implies A, which means they must be equal.

Therefore yes, the checks are redundant. You appear to have four code paths through the program but in fact you only have two.

So now the question is: how to write the code? The real question is: what is the stated contract of the method? The question of whether the conditions are redundant is a red herring. The real question is "have I designed a sensible contract, and does my method clearly implement that contract?"

Let's look at the declaration:

public static CalendarRow AssignAppointmentToRow(
    Appointment app,    
    List<CalendarRow> rows)

It's public, so it has to be robust to bad data from arbitrary callers.

It returns a value, so it should be useful for its return value, not for its side effect.

And yet the name of the method is a verb, suggesting that it is useful for its side effect.

The contract of the list parameter is:

  • A null list is OK
  • A list with one or more elements in it is OK
  • A list with no elements in it is wrong and should not be possible.

This contract is insane. Imagine writing the documentation for this! Imagine writing test cases!

My advice: start over. This API has candy machine interface written all over it. (The expression is from an old story about the candy machines at Microsoft, where both the prices and the selections are two-digit numbers, and it is super easy to type in "85" which is the price of item 75, and you get the wrong item. Fun fact: yes, I have actually done that by mistake when I was trying to get gum out of a vending machine at Microsoft!)

Here's how to design a sensible contract:

Make your method useful for either its side effect or its return value, not both.

Do not accept mutable types as inputs, like lists; if you need a sequence of information, take an IEnumerable. Only read the sequence; do not write to a passed-in collection unless it is very clear that this is the contract of the method. By taking IEnumerable you send a message to the caller that you are not going to mutate their collection.

Never accept nulls; a null sequence is an abomination. Require the caller to pass an empty sequence if that makes sense, never ever null.

Crash immediately if the caller violates your contract, to teach them that you mean business, and so that they catch their bugs in testing, not production.

Design the contract first to be as sensible as possible, and then clearly implement the contract. That is the way to future-proof a design.
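
As a sketch only (the right contract depends on requirements not shown here, and CanHoldAppointment is a placeholder for the real layout rule), a method following those rules might look something like this:

// Sketch: enforce the contract up front, treat the input as read-only,
// and be useful for the return value rather than a side effect.
public static CalendarRow FindRowFor(Appointment app, IEnumerable<CalendarRow> rows)
{
    if (app == null)
        throw new ArgumentNullException(nameof(app));
    if (rows == null)
        throw new ArgumentNullException(nameof(rows));   // never accept null; an empty sequence is fine

    foreach (CalendarRow row in rows)                    // read-only: the caller's collection is never mutated
    {
        if (row.CanHoldAppointment(app))                 // placeholder for the real layout rule
            return row;
    }

    return new CalendarRow(0);                           // nothing fits: hand back a fresh row (index is illustrative)
}

A caller that violates the contract finds out immediately, during testing, which is exactly the behaviour described above.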

Now, I've talked only about your specific case, and you asked a general question. So here is some additional general advice:

  • If there is a fact that you as a developer can deduce but the compiler cannot, then use an assertion to document that fact. If another developer, like future you or one of your coworkers, violates that assumption then the assertion will tell you.

  • Get a test coverage tool. Make sure your tests cover every line of code. If there is uncovered code then either you have a missing test, or you have dead code. Dead code is surprisingly dangerous because usually it is not intended to be dead! The incredibly awful Apple "goto fail" security flaw of a couple years back comes immediately to mind.

  • Get a static analysis tool. Heck, get several; every tool has its own particular specialty and none is a superset of the others. Pay attention to when it is telling you that there is unreachable or redundant code. Again, those are likely bugs.

If it sounds like I'm saying: first, design the code well, and second, test the heck out of it to make sure it is correct today, well, that's what I'm saying. Doing those things will make dealing with the future much easier; the hardest part about the future is dealing with all the buggy candy machine code people wrote in the past. Get it right today and the costs will be lower in the future.

Eric Lippert
  • 45,799
  • 22
  • 87
  • 126
  • 26
    I've never really thought about methods like this before, thinking about it now I always seem to try and cover every eventuality, when in reality if the person / method calling my method doesn't pass me what I need, then I shouldn't attempt to fix their mistakes (when I don't actually know what they intended), I should just crash out. Thanks for this, a valuable lesson has been learnt! – KidCode Mar 28 '16 at 10:34
  • 4
    The fact that a method returns a value doesn't imply it shouldn't also be useful for its side-effect. In concurrent code, the ability of functions to return both is often vital (imagine `CompareExchange` without such an ability!), and even in non-concurrent scenarios something like "add a record if it doesn't exist, and return either the passed-in record or the one that existed" can be more convenient than any approach that doesn't employ both a side-effect and a return value. – supercat Mar 28 '16 at 16:04
  • 5
    @KidCode Yeah, Eric's really good at explaining things clearly, even some really complicated topics. :) – Mason Wheeler Mar 28 '16 at 21:29
  • @KidCode This kind of thing is why exceptions exist. Unexpected exceptions crash a program and say very clearly: "Hey, developer, you tried to do something that was nonsense. You have a bug. Please fix." – doppelgreener Mar 29 '16 at 06:15
  • 3
    @supercat Sure, but it's also harder to reason about. First, you'll probably want to look at something that doesn't modify a global state, thus avoiding both concurrency issues and state corruption. When this isn't reasonable, you'll still want to isolate the two - that gives you clear places where concurrency is an issue (and thus be considered extra dangerous), and places where it's handled. This was one of the core ideas in the original OOP paper, and is at the core of the usefulness of actors. No religious rule - just prefer separating the two when it makes sense. It usually does. – Luaan Mar 29 '16 at 09:02
  • 5
    This is a very useful post for just about everyone! – Adrian Buzea Mar 30 '16 at 07:29
  • 3
    @Eric Thank you for sharing your experience with us. TBH, I was ready to upvote it when I reached Demorgan.. Little did I know that the best was yet to come :) This answer is just great and very educational! – Dev-iL Mar 30 '16 at 19:35
  • Throwing exceptions for inappropriate input **is** covering every eventuality, but it's covering some of those eventualities with "bad caller, no biscuit!". – Jon Hanna Mar 31 '16 at 12:38
  • The conditionals may be redundant logically, but the code blocks are not redundant. If `rows == null` then you should execute all the code shown. – radarbob Apr 01 '16 at 01:08
89

What you are doing in the code you're showing above is not so much future proofing as much as it is defensive coding.

Both if statements test for different things. Both are appropriate tests depending on your needs.

Section 1 tests for and corrects a null object. Side note: Creating the list doesn't create any child items (e.g., CalendarRow).

Section 2 tests for and corrects user and/or implementation errors. Just because you have a List<CalendarRow> doesn't mean that you have any items in the list. Users and implementors will do things that you can't imagine just because they are allowed to do it, whether it makes sense to you or not.

5gon12eder
  • 6,956
  • 2
  • 23
  • 29
Adam Zuckerman
  • 3,715
  • 1
  • 19
  • 27
  • I disagree on the part about "stupid user tricks": If the example routine is a part of a class whose specific purpose is to manage that `List`, it is perfectly valid for that class to define consistency requirements that forbid the existence of an empty list. There is no real user code involved in such a setting, only code that belongs to the class itself. Of course, that should be properly documented (for example via `assert()`). – cmaster - reinstate monica Mar 27 '16 at 19:23
  • 1
    Actually I'm taking 'stupid user tricks' to mean input. YES you should never trust input. Even from just outside your class. Validate! If you think this is only a future concern I'd be happy to hack you today. – candied_orange Mar 27 '16 at 20:03
  • 1
    @CandiedOrange, that was the intent, but the wording didn't convey the attempted humor. I have changed the wording. – Adam Zuckerman Mar 27 '16 at 20:09
  • 3
    Just a quick question here, if it's an implementation error / bad data, shouldn't I just crash out, instead of trying to fix their errors? – KidCode Mar 28 '16 at 10:38
  • 4
    @KidCode Whenever you attempt to recover, "fix their errors", you need to do two things, return to a known good state and not lose valuable input quietly. Following that rule in this case begs the question: is a zero row list valuable input? – candied_orange Mar 28 '16 at 11:56
  • 8
    Disagree strongly with this answer. If a function receives invalid input it means by definition there is a bug in the program. The correct approach is to throw an exception on invalid input, so you discover the issue and fix the bug. The approach you describe is just going to hide bugs which makes them more insidious and harder to track down. Defensive coding means you don't automatically trust input but validate it, but it does not mean you should make random guesses in order to "fix" invalid or unexpected input. – JacquesB Mar 30 '16 at 09:02
  • @JacquesB, while it is probable that receiving a wrong input **indicates** a bug, it could also indicate bad user data or a failure of other prior methods that didn't code defensively. Based on the fact that the code example clearly indicates this is a very small portion of a larger method, we cannot **know** if more appropriate action has been taken to cover the input. **Nowhere in my answer am I telling the OP what to do.** I am offering a better description of what type of coding is being used. – Adam Zuckerman Mar 30 '16 at 13:15
  • @AdamZuckerman: If the unexpected/invalid input is due to some other method failing to validate input and passing on invalid data, then it is *still* an indication of a bug, and the appropriate reaction is to throw an exception, so the bug can be discovered and tracked down. Hiding the bug further helps absolutely no one. – JacquesB Mar 30 '16 at 15:45
  • 1
    We also don't know what the OP's requirements are. They might include "don't throw errors". While this is short sighted by the business, it has been my experience that business types don't want an application to die just because inappropriate or non-validated data was entered. – Adam Zuckerman Mar 30 '16 at 15:49
  • 1
    @AdamZuckerman: It is totally sensible to require that an application should gracefully handle any kind of user/external input, even malformed or invalid, without crashing. But in this case the input in question is not external but created by the program itself (since it is a complex type), so if it is invalid it is due to a bug in the program itself. – JacquesB Mar 30 '16 at 20:47
35

I guess this question basically comes down to taste. Yes, it is a good idea to write robust code, yet the code in your example is a slight violation of the KISS principle (as a lot of such "future proof" code will be).

I personally would likely not bother making code bullet-proof for the future. I don't know the future, and as such, any such "future bullet-proof" code is doomed to fail miserably when the future arrives anyway.

Instead, I would prefer a different approach: make the assumptions you make explicit using the assert() macro or similar facilities. That way, when the future comes around, it will tell you precisely where your assumptions no longer hold.
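
In C#, for example, that can be as small as this (a sketch using the question's method; Debug.Assert lives in System.Diagnostics and is compiled out of release builds, so prefer an exception if the check must survive into production):

// Make the assumption explicit; if the future ever violates it, this fires during testing.
Debug.Assert(rows == null || rows.Count > 0,
    "rows is expected to be either null or non-empty here");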

  • 4
    I like your point about not knowing what the future holds. Right now all I'm really doing is guessing what could be a problem, then guessing again for the solution. – KidCode Mar 27 '16 at 18:12
  • 3
    @KidCode: Good observation. Your own thinking here is actually a lot smarter than many of the answers here, including the one you have accepted. – JacquesB Mar 30 '16 at 15:49
  • 1
    I like this answer. Keep the code minimal so it's easy for a future reader to understand why any checks are required. If a future reader sees checks for things that look unnecessary, they may waste time trying to understand why the check is there. A future human may be modifying this class, rather than others that use it. Also, **don't write code you can't debug**, which will be the case if you try to handle cases that can't currently happen. (Unless you write unit tests that exercise code paths that the main program doesn't.) – Peter Cordes Mar 31 '16 at 02:39
24

Another principle you might want to think about is the idea of failing fast. The idea is that when something goes wrong in your program, you want to stop the program altogether immediately, at least while you're developing it, before releasing it. Under this principle, you want to write lots of checks to make sure your assumptions hold, but seriously consider having your program stop dead in its tracks whenever the assumptions are violated.

To put it boldly, if there's even a small error in your program, you want it to completely crash while you watch!

This might sound counterintuitive, but it lets you discover bugs as quickly as possible during routine development. If you're writing a piece of code, and you think it's finished, but it crashes when you test it, there's no question that you're not finished yet. Furthermore, most programming languages give you excellent debugging tools that are easiest to use when your program crashes completely instead of trying to do its best after an error. The biggest, most common example is that if you crash your program by throwing an unhandled exception, the exception message will tell you an incredible amount of information about the bug, including which line of code failed and the path through your code the program took on its way to that line of code (the stack trace).

For more thoughts, read this short essay: Don't Nail Your Program in the Upright Position


This is relevant to you because it's possible that sometimes, the checks you're writing are there because you want your program to try to keep running even after something's gone wrong. For example, consider this brief implementation of the Fibonacci sequence:

// Calculates the nth Fibonacci number
int fibonacci(int n) {
    int a = 0;
    int b = 1;

    for(int i = 0; i < n; i++) {
        int temp = b;
        b = a + b;
        a = temp;
    }

    return b;
}

This works, but what if someone passes a negative number to your function? It won't work then! So you'll want to add a check to make sure that the function is called with a nonnegative input.

It might be tempting to write the function like this:

// Calculates the nth Fibonacci number
int fibonacci(int n) {
    int a = 0;
    int b = 1;

    // Make sure the input is nonnegative
    if(n < 0) {
        n = 1; // Replace the negative input with an input that will work
    }

    for(int i = 0; i < n; i++) {
        int temp = b;
        b = a + b;
        a = temp;
    }

    return b;
}

However, if you do this, then later accidentally call your Fibonacci function with a negative input, you'll never realize it! Worse, your program will probably keep on running but start producing nonsensical results without giving you any clues as to where something went wrong. These are the hardest types of bugs to fix.

Instead, it's better to write a check like this:

// Calculates the nth Fibonacci number
int fibonacci(int n) {
    int a = 0;
    int b = 1;

    // Make sure the input is nonnegative
    if(n < 0) {
        throw new ArgumentException("Can't have negative inputs to Fibonacci");
    }

    for(int i = 0; i < n; i++) {
        int temp = b;
        b = a + b;
        a = temp;
    }

    return b;
}

Now, if you ever accidentally call the Fibonacci function with a negative input, your program will immediately stop and let you know that something's wrong. Furthermore, by giving you a stack trace, the program will let you know which part of your program tried to execute the Fibonacci function incorrectly, giving you an excellent starting point for debugging what's wrong.

Kevin
  • 731
  • 1
  • 4
  • 15
  • 1
    Doesn't c# have a particular kind of exception to indicate invalid arguments or out-of-range arguments? – JDługosz Mar 28 '16 at 08:06
  • @JDługosz Yep! C# has an [ArgumentException](https://msdn.microsoft.com/en-us/library/system.argumentexception.aspx), and Java has an [IllegalArgumentException](https://docs.oracle.com/javase/8/docs/api/java/lang/IllegalArgumentException.html). – Kevin Mar 28 '16 at 10:30
  • The question was using c#. Here's [C++ (link to summary)](http://en.cppreference.com/w/cpp/error/exception) for completeness. – JDługosz Mar 28 '16 at 13:59
  • Hmmm, Could make the function valid over a wider domain [Fibonacci numbers: Extension to negative integers](https://en.wikipedia.org/wiki/Generalizations_of_Fibonacci_numbers#Extension_to_negative_integers) – chux - Reinstate Monica Mar 28 '16 at 20:07
  • 4
    "Fail fast to the nearest safe state" is more reasonable. If you make your application so that it crashes when something unexpected happens, you risk losing the user's data. Places where the user's data is in jeopardy are great points for "penultimate" exception handling (there's always cases where you can do nothing but throw your hands up - the ultimate crash). Only making it crash in debug is just opening another can of worms - you need to test what you're deploying to the user anyway, and now you're spending half of your testing time on a version the user is never going to see. – Luaan Mar 29 '16 at 08:52
  • 2
    @Luaan: that's not the responsibility of the example function, though. – whatsisname Mar 29 '16 at 23:26
  • @whatsisname: Why not? I mean, whose responsibility it is to handle an error depends on how you design your application.. You might do it in the example function or in the main function, or somewhere else. – Giorgio Mar 30 '16 at 20:02
  • a slight optimization improvement in this answer's **3rd** code snippet: by moving the `if(n < 0)` check to the very start of the function, two integer declarations can be saved. – user1451111 Jul 02 '18 at 15:33
11

Should you add redundant code? No.

But, what you describe is not redundant code.

What you describe is programming defensively against calling code violating the preconditions of your function. Whether you do this or simply leave it up to users to read the documentation and avoid those violations themselves is entirely subjective.

Personally, I am a big believer in this methodology, but as with everything you have to be cautious. Take, for example, C++'s std::vector::operator[]. Putting aside VS's debug-mode implementation for a moment, this function does not perform bounds checking. If you request an element that doesn't exist, the result is undefined. It is up to the user to provide a valid vector index. This is quite deliberate: you can "opt in" to bounds checking by adding it at the callsite, but if the operator[] implementation were to perform it then you would not be able to "opt out". As a fairly low-level function, this makes sense.

But if you were writing an AddEmployee(string name) function for some higher-level interface, I'd fully expect this function to at least throw an exception if you provided an empty name, as well as this precondition being documented immediately above the function declaration. You may not be providing unsanitised user input to this function today, but making it "safe" in this manner means any precondition violations that pop up in the future can be easily diagnosed, rather than potentially triggering a chain of dominoes of hard-to-detect bugs. This is not redundancy: it's diligence.
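
A minimal sketch of that kind of check (illustrative only; the employees field and Employee type are stand-ins for whatever the real class uses):

public void AddEmployee(string name)
{
    // Precondition: name must be non-empty. Violations surface immediately at the call site.
    if (string.IsNullOrWhiteSpace(name))
        throw new ArgumentException("Employee name must not be empty.", nameof(name));

    employees.Add(new Employee(name));   // 'employees' and 'Employee' are illustrative
}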

If I had to come up with a general rule (although, as a general rule, I try to avoid those), I'd say that a function which satisfies any of the following:

  • lives in a super-high-level language (say, JavaScript rather than C)
  • sits at an interface boundary
  • is not performance-critical
  • directly accepts user input

… can benefit from defensive programming. In other cases, you can still write assertions that fire during testing but are disabled in release builds, to further increase your ability to find bugs.

This topic is explored further on Wikipedia (https://en.wikipedia.org/wiki/Defensive_programming).

Lightness Races in Orbit
  • 8,755
  • 3
  • 41
  • 45
9

Two of the ten programming commandments are relevant here:

  • Thou shall not assume that input is correct

  • Thou shall not make code for future use

Here, checking for null is not "making code for future use". Making code for future use is things like adding interfaces because you think they might be useful "some day". In other words, the commandment is to not add layers of abstraction unless they are needed right now.

Checking for null has nothing to do with future use. It has to do with Commandment #1: do not assume input will be correct. Never assume your function will receive some subset of input. A function should respond in a logical way no matter how bogus and messed up the inputs are.

Tyler Durden
  • 533
  • 3
  • 6
  • 5
    Where are these programming commandments. Do you have a link? I'm curious because I've worked on several programs that subscribe to the second of those commandments, and several who did not. Invariably, those who subscribed to the commandment `Thou shall not make code for future use` ran into maintainability issues *sooner*, despite the intuitive logic of the commandment. I have found, in real life coding, that commandment is only effective in code where you control the feature list and deadlines, and make sure you don't need future looking code to reach them... which is to say, never. – Cort Ammon Mar 28 '16 at 14:50
  • 1
    Trivially provable: if one can estimate the probability of future use, and the projected value of the "future use" code, and the product of those two is greater than the cost of adding the "future use" code, it is statistically optimal to add the code. I think that commandment shows up in situations where developers (or managers) are forced to admit that their estimation skills are not as reliable as they would like, so as a defensive measure they just choose not to estimate future task at all. – Cort Ammon Mar 28 '16 at 14:52
  • 2
    @CortAmmon There's no place for religious commandments in programming - "best practices" only make sense in context, and learning a "best practice" without the reasoning makes you unable to adapt. I've found YAGNI very useful, but that doesn't mean that I don't think about places where adding extension points later will be expensive - it just means that I don't have to think about the simple cases ahead of time. Of course, this also changes over time, as more and more assumptions get bolted on your code, effectively increasing your code's interface - it's a balancing act. – Luaan Mar 29 '16 at 08:44
  • 2
    @CortAmmon Your "trivially provable" case ignores two very important costs - the cost of the estimation itself, and the maintenance costs of the (possibly unnecessary) extension point. That's where people get extremely unreliable estimates - by underestimating the estimates. For a very simple feature, it might be enough to think for a few seconds, but it's quite likely that you'll find a whole can of worms that follows from the initially "simple" feature. Communication is the key - as things get big, you need to talk with your leader/customer. – Luaan Mar 29 '16 at 08:49
  • 1
    @Luaan I was trying to argue for your point, there's no place for religious commandments in programming. As long as there exists one business case where the costs of estimation and maintenance of the extension point are sufficiently bounded, there is a case where said "commandment" is questionable. From my experience with code, the question of whether to leave such extension points or not has never fit nicely into a single line commandment one way or the other. – Cort Ammon Mar 29 '16 at 14:28
7

The definitions of 'redundant code' and 'YAGNI' often depend, I find, on how far you look ahead.

If you experience a problem, then you tend to write future code in such a way as to avoid that problem. Another programmer who hadn't experienced that particular issue might consider your code redundant overkill.

My suggestion is to keep track of how much time you spend on 'stuff that hasn't erred yet'; if it's loads, and your peers are banging out features faster than you, then decrease it.

However, if you are like me, I expect you will have just typed it all out 'by default' and it hasn't really taken you any longer.

JDługosz
  • 568
  • 2
  • 9
Ewan
  • 70,664
  • 5
  • 76
  • 161
6

It is a good idea to document any assumptions about parameters. And it is a good idea to check that your client code does not violate those assumptions. I would do this:

/** ...
*   Precondition: rows is null or nonempty
*/
public static CalendarRow AssignAppointmentToRow(Appointment app, List<CalendarRow> rows)
{
    Debug.Assert(rows == null || rows.Count > 0);
    //1. Is rows equal to null? - This will be the case if this is the first appointment.
    if (rows == null) {
        rows = new List<CalendarRow> ();
        rows.Add (new CalendarRow (0));
        rows [0].Appointments.Add (app);
    }

    //blah...
}

[Assuming this is C#, Debug.Assert might not be the best tool for the job, as it is not compiled into release code. But that is a debate for another day.]

Why is this better than what you wrote? Your code makes sense only if, in some future where your client code has changed and passes in an empty list, the right thing to do will be to add a first row and add app to its appointments. But how do you know that will be the case? It is better to make fewer assumptions about the future now.

Theodore Norvell
  • 858
  • 6
  • 10
5

Estimate the cost of adding that code now. It will be relatively cheap because it's all fresh in your mind, so you will be able to do this quickly. Adding unit tests is necessary - there's nothing worse than using some method a year later, finding it doesn't work, and figuring out it was broken from the start and never actually worked.

Estimate the cost of adding that code when it is needed. It will be more expensive, because you have to come back to the code, remember it all, and it is a lot harder.

Estimate the probability that the additional code will actually be needed. Then do the maths.

On the other hand, code full of assumptions that "X will never happen" is awful for debugging. If something doesn't work as intended, it means either a stupid mistake or a wrong assumption. Your "X will never happen" is an assumption, and in the presence of a bug it is suspicious, which forces the next developer to waste time on it. It's usually better not to rely on any such assumptions.

gnasher729
  • 42,090
  • 4
  • 59
  • 119
  • 4
    In your first paragraph, you forgot to mention the cost of maintaining that code over time, when it turns out that the feature that's *actually* needed is mutually exclusive with the feature that was added unnecessarily . . . – ruakh Mar 28 '16 at 05:51
  • You also need to estimate the cost of the bugs which might creep into the program because you don't fail on invalid input. But you cannot estimate the cost of bugs since by definition they are unexpected. So the whole "do the math" falls apart. – JacquesB Mar 30 '16 at 08:18
3

The primary question here is "what will happen if you do/don't"?

As others have pointed out, this kind of defensive programming is good, but it's also occasionally dangerous.

For example, if you supply a default value, you keep the program up, but it may no longer be doing what you want it to do. If it writes that blank list to a file, you may have turned your bug from "crashes because I supplied null by accident" into "clears out the calendar rows because I supplied null by accident" (for example, if the part that reads //blah starts removing items that don't appear in the list).
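
As a sketch of how that can play out (invented names; Serialize stands in for whatever persistence the app really uses):

public static void SaveCalendar(List<CalendarRow> rows, string path)
{
    rows = rows ?? new List<CalendarRow>();        // "helpful" defensive default...

    // ...so a caller that accidentally passed null no longer crashes here;
    // instead the saved calendar is silently overwritten with an empty one.
    File.WriteAllText(path, Serialize(rows));      // Serialize() is hypothetical
}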

The key for me is Never corrupt data. Let me repeat that: NEVER. CORRUPT. DATA. If your program exceptions out, you get a bug report you can patch; if it writes bad data to a file that it will read back later, you have to sow the ground with salt.

All your "unnecessary" decisions should be made with that premise in mind.

deworde
  • 1,892
  • 14
  • 21
2

What you're dealing with here is basically an interface. By adding the behaviour "when input is null, initialize input", you've effectively extended the method interface - now instead of always operating on a valid list, you've made it "fix up" the input. Whether this is the official or unofficial part of the interface, you can bet somebody (most likely including you) is going to use this behaviour.

Interfaces should be kept simple, and they should be relatively stable - especially in something like a public static method. You get a bit of leeway in private methods, especially private instance methods. By implicitly extending the interface, you've made your code more complex in practice. Now, imagine that you don't actually want to use that code path - so you avoid it. Now you've got a bit of untested code that's pretending like it's a part of the method's behaviour. And I can tell you right now that it probably has a bug: when you pass a list, that list is mutated by the method. However, if you don't, you create a local list, and throw it away later. This is the kind of inconsistent behaviour that will make you cry in half a year, as you try to track down an obscure bug.

In general, defensive programming is a pretty useful thing. But the code paths for the defensive checks must be tested, just like any other code. In a case like this, they are complicating your code for no reason, and I'd opt for an alternative like this instead:

if (rows == null) throw new ArgumentNullException(nameof(rows));

You don't want an input where rows are null, and you want to make the error obvious to all your callers as soon as possible.

There are many values you need to juggle when developing software. Even robustness itself is a very complex quality - for example, I wouldn't consider your defensive check more robust than throwing an exception. Exceptions are pretty handy for giving you a safe place to try again from - data corruption issues are usually much harder to track down than a problem recognized early on and handled safely. In the end, such checks only tend to give you an illusion of robustness, and then a month later you notice that a tenth of your appointments are gone, because you never noticed that a different list got updated. Ouch.

Make sure to distinguish between the two. Defensive programming is a useful technique to catch errors at a place where they are the most relevant, significantly helping your debugging efforts, and with good exception handling, preventing "sneaky corruption". Fail early, fail fast. On the other hand, what you're doing is more like "error hiding" - you're juggling inputs and making assumptions about what the caller meant. This is very important for user-facing code (e.g. spell-checking), but you should be wary when seeing this with developer-facing code.

The main problem is that whatever abstraction you make, it's going to leak ("I wanted to type ofre, not fore! Stupid spell-checker!"), and the code to handle all the special cases and fix-ups is still code you need to maintain and understand, and code you need to test. Compare the effort of making sure a non-null list is passed with fixing the bug you have a year later in production - it's not a good trade-off. In an ideal world, you'd want every method to work exclusively with its own input, returning a result and modifying no global state. Of course, in the real world, you'll find plenty of cases where that's not the simplest and clearest solution (e.g. when saving a file), but I find that keeping methods "pure" when there's no reason for them to read or manipulate global state makes code much easier to reason about. It also tends to give you more natural points for splitting your methods :)

This doesn't mean that everything unexpected should make your application crash, on the contrary. If you use exceptions well, they naturally form safe error handling points, where you can restore stable application state and allow the user to continue what they're doing (ideally while avoiding any data loss for the user). At those handling points, you'll see opportunities for fixing the issues ("No order number 2212 found. Did you mean 2212b?") or giving the user control ("Error connecting to database. Try again?"). Even when no such option is available, at least it will give you a chance that nothing got corrupted - I've started appreciating code that uses using and try...finally a lot more than try...catch, it gives you a lot of opportunities to maintain invariants even under exceptional conditions.
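
For example, a sketch like this (the Document type and its WriteTo call are made up) maintains the invariant "a half-written temp file never outlives a failed save" without catching anything:

public void SaveDocument(Document doc, string path)
{
    string tempPath = path + ".tmp";
    try
    {
        using (var stream = File.Create(tempPath))    // disposed even if WriteTo throws
        {
            doc.WriteTo(stream);                      // hypothetical serialization call
        }
        File.Copy(tempPath, path, true);              // only replace the real file after a full write
    }
    finally
    {
        if (File.Exists(tempPath))
            File.Delete(tempPath);                    // never leave the temp file behind
    }
}

Nothing is caught; the exception still reaches whoever can actually handle it, but the finally block keeps the file system consistent on the way out.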

Users shouldn't lose their data and work. This still has to be balanced with development costs etc., but it's a pretty good general guideline (if the users decide whether to buy your software or not - internal software usually doesn't have that luxury). Even the whole application crashing becomes a lot less of a problem if the user can just restart and go back to what they were doing. This is real robustness - Word saving your work all the time without corrupting your document on disk, and giving you an option to restore those changes after restarting Word after a crash. Is it better than no bug in the first place? Probably not - though don't forget that work spent catching a rare bug might be better spent elsewhere. But it's much better than the alternatives - e.g. a corrupted document on disk, all the work since the last save lost, or the document auto-replaced with changes before the crash which just happened to be Ctrl+A and Delete.

Luaan
  • 1,850
  • 1
  • 13
  • 10
1

I'm going to answer this based on your assumption that robust code will benefit you "years" from now. If long-term benefits are your goal, I would prioritize design and maintainability over robustness.

The trade-off between design and robustness is time and focus. Most developers would rather have well-designed code even if it means going through some trouble spots and doing some additional condition or error handling. After a few years of use, the places where you really need it have probably been identified by users.

Assuming the design is about the same quality, less code is easier to maintain. This doesn't mean we're better off letting known problems go for a few years, but adding things you know you don't need makes maintenance more difficult. We've all looked at legacy code and found unnecessary parts. You have to have a high level of confidence to change code that has worked for years.

So if you feel your app is designed as well as can be, is easy to maintain and doesn't have any bugs, find something better to do than add code you don't need. It's the least you can do out of respect for all the other developers working long hours on pointless features.

JeffO
  • 36,816
  • 2
  • 57
  • 124
1

No you should not. And you are actually answering your own question when you state that this way of coding may hide bugs. This will not make the code more robust - rather it will make it more prone to bugs and make debugging harder.

You state your current expectations about the rows argument: Either it is null or otherwise it has at least one item. So the question is: Is it a good idea to write the code to additionally handle the unexpected third case where rows has zero items?

The answer is no. You should always throw an exception in the case of unexpected input. Consider this: if some other part of the code breaks the expectation (i.e. the contract) of your method, it means there is a bug. If there is a bug, you want to know it as early as possible so you can fix it, and an exception will help you do that.

What the code currently does is guess how to recover from a bug which may or may not exist in the code. But even if there is a bug, you can't know how to fully recover from it. Bugs by definition have unknown consequences. Maybe some initialization code did not run as expected, and that may have many other consequences than just the missing row.

So you code should look like this:

public static CalendarRow AssignAppointmentToRow(Appointment app, List<CalendarRow> rows)
{
    if (rows != null && rows.Count == 0) throw new ArgumentException("Rows is empty."); 

    //1. Is rows equal to null? - This will be the case if this is the first appointment.
    if (rows == null) {
        rows = new List<CalendarRow> ();
        rows.Add (new CalendarRow (0));
        rows [0].Appointments.Add (app);
    }

    //blah...
}

Note: There are some specific cases where it makes sense to "guess" how to handle invalid input rather than just throwing an exception. For example if you handle external input you have no control over. Web browsers are an infamous example because they try to gracefully handle any kind of malformed and invalid input. But this only makes sense with external input, not with calls from other parts of the program.


Edit: Some other answers state that you are doing defensive programming. I disagree. Defensive programming means you don't automatically trust input to be valid. So validation of parameters (as above) is a defensive programming technique, but it does not mean you should alter unexpected or invalid input by guessing. The robust defensive approach is to validate input and then throw an exception in case of unexpected or invalid input.

JacquesB
  • 57,310
  • 21
  • 127
  • 176
1

Should I add redundant code now just in case it may be needed in the future?

You should not add redundant code at any time.

You should not add code that is only needed in the future.

You should make sure that your code behaves well no matter what happens.

The definition of "behave well" is up to your needs. One technique I like to use is "paranoia" exceptions. If I am 100% sure that a certain case can never happen, I still program an exception, but I do it in a way that a) clearly tells everyone that I never expect this to happen, and b) is clearly displayed and logged and thus does not lead to creeping corruption later on.

Example pseudo-code:

file = File.open(">", "bla")  or raise "Paranoia: cannot open file 'bla'"

file.write("xytz") or raise "Paranoia: disk full?"

file.close()  or raise "Paranoia: huh?!?!?"

This clearly communicates that I am 100% sure that I can always open, write or close the file, i.e., I do not go to the extent of creating elaborate error handling. But if (no: when) I cannot open the file, my program will still fail in a controlled way.

The user interface will of course not display such messages to the user; they will be logged internally together with the stack trace. Again, these are internal "Paranoia" exceptions which just make sure that the code "stops" when something unexpected happens. This example is a bit contrived; in practice I would of course implement real error handling for errors while opening files, as they happen regularly (wrong file name, USB stick mounted read-only, whatever).

A very important related search term would be "fail fast", as noted in other answers; it is a very useful principle for creating robust software.

AnoE
  • 5,614
  • 1
  • 13
  • 17
-1

There are lots of over-complicated answers here. You probably asked this question because you didn't feel right about that piece of code but weren't sure why or how to fix it. So my answer is that the problem is very likely in the code structure (as always).

First, the method header:

public static CalendarRow AssignAppointmentToRow(Appointment app, List<CalendarRow> rows)

Assign appointment to what row? That should be clear immediately from the parameter list. Without any further knowledge, I would expect the method params to look like this: (Appointment app, CalendarRow row).

Next, the "input checks":

//1. Is rows equal to null? - This will be the case if this is the first appointment.
if (rows == null) {
    rows = new List<CalendarRow> ();
}

//2. Is rows empty? - This will be the case if this is the first appointment / some other unknown reason.
if(rows.Count == 0)
{
    rows.Add (new CalendarRow (0));
    rows [0].Appointments.Add (app);
}

This is bullshit.

  1. check) The method caller should just make sure it does not pass uninitialized values into the method. It's a programmer's responsibility (to try) not to be stupid.
  2. check) If I do not take into account that passing rows into the method is probably wrong (see the comment above), then it should not be the responsibility of a method called AssignAppointmentToRow to manipulate rows in any other way than assigning the appointment somewhere.

But the whole concept of assigning appointments somewhere is strange (unless this is a GUI part of the code). Your code seems to contain (or at least attempts to contain) an explicit data structure representing a calendar (that is, List<CalendarRow> - that should be defined as Calendar somewhere if you want to go this way; then you would be passing Calendar calendar to your method). If you go this way, I would expect the calendar to be prefilled with slots where you place (assign) the appointments afterwards (e.g. calendar[month][day] = appointment would be the appropriate code). But then you can also choose to ditch the calendar structure from the main logic altogether and just have List<Appointment>, where the Appointment objects contain a date attribute. And then, if you need to render a calendar somewhere in the GUI, you can create this 'explicit calendar' structure just before rendering.
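
A rough sketch of that second option (assuming Appointment exposes a DateTime Date property, as suggested above; needs System.Linq):

// Core logic keeps a flat list; only the GUI shapes it into a calendar, right before rendering.
static Dictionary<DateTime, List<Appointment>> BuildCalendarView(IEnumerable<Appointment> appointments)
{
    return appointments
        .GroupBy(a => a.Date.Date)                     // group by day
        .ToDictionary(g => g.Key, g => g.ToList());
}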

I don't know the details of your app so some of this is perhaps not applicable to you but both of these checks (mainly the second one) tell me that something somewhere is wrong with separation of concerns in your code.

clime
  • 1,564
  • 13
  • 15
-2

For simplicity, let's assume that either you will eventually need this piece of code in N days (not later or earlier) or not need it at all.

Pseudocode:

let C_now   = cost of implementing the piece of code now
let C_later = ... later
let C_maint = cost of maintaining the piece of code one day
              (note that it can be negative)
let P_need  = probability that you will need the code after N days

if C_now + C_maint * N < P_need*C_later then implement the code else omit it.

Factors for C_maint:

  • Does it improve the code in general, making it more self-documenting, easier to test? If yes, negative C_maint expected
  • Does it make code bigger (hence harder to read, longer to compile, implement tests, etc)?
  • Are any refactorings/redesigns pending? If yes, bump C_maint. This case requires more complex formula, with more times variables like N.

So a big thing that just weighs down the code and has only a low probability of being needed in two years should be left out, but a little thing that also produces useful assertions and has a 50% chance of being needed in three months should be implemented.
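
To make the formula concrete with made-up numbers: say C_now = 2 hours, C_later = 8 hours, C_maint = 0.01 hours per day, N = 180 days and P_need = 0.3. Then C_now + C_maint * N = 2 + 1.8 = 3.8, while P_need * C_later = 2.4; since 3.8 is not less than 2.4, the formula says to omit the code for now.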

Vi0
  • 374
  • 3
  • 9
  • You also need to factor in the cost of bugs which might creep into the program because you are not rejecting invalid input. So how do you estimate the cost of potential hard-to-find bugs? – JacquesB Mar 30 '16 at 08:23