How defensive should we be?

Question

We've been running Pex over some code, and it has been showing some good things (well bad things, but showing them before it gets to production!).

However, one of the nice things about Pex is that it doesn't necessarily stop trying to find issues.

One area we found is that when passing in a string, we were not checking for empty strings.

So we changed:

if (inputString == null)

to

if (string.IsNullOrEmpty(inputString)) // ***

That fixed the initial issues. But then, when we ran Pex again, it decided that:

inputString = "\0";

was causing problems. And then

inputString = "\u0001";

What we've decided is that defaults can be used if we encounter // *** and that we are happy seeing the exception caused by any other odd input (and dealing with it).

Is that enough?

Have you checked to see if you can turn off some of the warnings? I know JTest would do that too, but it was also an option to turn off some of its recommendations. It took some time with tweaking the settings to get a code test profile that we liked. — FrustratedWithFormsDesigner, Jun 17 '11 at 14:59
@Frustrated: Yes, there are certainly tweaks to the profile we can do and are doing. Pex is a great tool, by the way. The results it showed just prompted the question. — Peter K., Jun 17 '11 at 15:12
I don't have an answer for you, but I wanted to say thanks for that link to this Pex thing; it looks pretty neat if it's what I think it is and will automatically (?) create unit tests for existing code. The code I'm working on is huge and has no tests and very tightly-coupled code so it's impossible to refactor anything. — Wayne Molina, Jun 17 '11 at 15:25
@WayneM : Yes, it's pretty good, though I needed to go through a few examples to understand what it was doing. For legacy code, it's great. It's even better for new code! — Peter K., Jun 17 '11 at 17:51

score 9 · Accepted Answer · answered Jun 17 '11 at 15:04

Three questions should help you determine how defensive to be in your coding.

First - What are the consequences of bad input getting through? If it is an error message on one of your developer's PC's, maybe not so critical to be defensive. Could it result in financial disruptions at clients, IE accounting information disruption? Is it a real time system where lives are at risk? In the life / death scenario, there should probably be more validation and error handling code than actual feature code.

Secondly, how many people will re-use this function or piece of code? Just you? Your department? Your company? Your customers? The wider the use of the code the more defensive.

Third - what is the source of the input I am validating? If it is input from the user or from a public web site, I would be super defensive. If the input is always coming from your code, be somewhat defensive, but don't spend undue time putting in checks.

It will always be possible to add in more error checking and validation in a system. The point is will the cost of writing and maintaining this code outweigh the cost of the problems caused by errors in the code.

score 6 · Answer 2 · answered Jun 17 '11 at 15:00

Users are evil, and anything they input should be checked with the utmost rigor.

Anything generated without the benefit of user input, or from pre-sanitized data, shouldn't need to be checked at the same level. The problem here is when you forget and use these methods on bad data, because you've forgotten that the code wasn't hardened.

The one thing you should always check is anything that might cause an overflow or a crash. Doesn't matter how deeply that method is buried, and how sure you are that that condition could never occur. You need to program for it anyway, just to placate Murphy.

score 2 · Answer 3 · answered Jun 17 '11 at 15:39

I would be as defensive as you need to be. A bit ambiguous, I guess so but I will try to explain.

When you right a method, if that method has input parameters you have to make the decision on what you expect those parameters to be. In situations and places within an application this will differ. For example, if a method or piece of code is accepting data from a user input then you would want to cover all code basis and handle any input accordingly whether via an error message or some nice way of displaying unacceptable data.

If the method is an exposed API ditto. You can't control what is coming in so you should expect to try and cover all aspects and program accordingly.

For methods that are produced within the core engine of your project, here you have a decision to make. Do I assume the data that is arriving has been pre-screened and validated before it arrives or should I put in the necessary checks. I guess this depends on the conceptual level of the method and if this is an acceptable place to check. So things I might consider is:

1) Is this the only place I will need to do this check? Will this variable need to be checked in lots of different places for this condition. If so can I do the check once higher up the chain and then assume validity afterwards

2) Are other components of the system expected to work with the methods and interfaces I write. If so can you control through debug assert statements, debugging exceptions, method commenting and general system architecture the outcome you require, or will the data need checks put in place.

3) What are the outcomes of failure at this point in the code. Will it cause the entire thing to fail? Will any error be caught elsewhere and will that error at least be caught.

4) Does it make sense to put a check here? Sometimes putting a check on a piece of possible corrupt data although helping to solve the issue at that point and not error out might help cover it up. At which point you might spend hours chasing down a different issue only to find the actual issue was because of a valid check on data whay back in the chain of events that cascaded to the one the user/developer was reported.

In general I am a defensive programmer however I also believe that with thorough TDD and appropiate unit testing that you can put in the checks in code at the required levels and be confident that it is working as it should once it gets to any lower level sections.

score 1 · Answer 4 · answered Jun 17 '11 at 16:21

I've spent weeks before on bugs that could have been detected with 5 minutes of extra work like this up front. In my mind this up front work is always worth it. You will be fixing the bug eventually. The only question is how long it will take you.

One thing these sorts of analysis tools often uncover is things that are not necessarily bugs, but are bad programming habits that make bugs more likely to occur. One such common habit is variable initialization. Sometimes an empty string is a perfectly valid value for a variable. In that case, you want to configure your tool to not consider that an error in that particular instance. However, often an empty string is not a valid value for a variable, but people are setting it anyway, because the compiler complains if there isn't something in there, or your language automatically initializes to the empty string whether that's valid or not.

This frustrates people because it seems like a catch 22 with no valid solution, but the solution is to refactor your code so it doesn't need to initialize the variable until there is a valid value to put in there. That requires some extra thought up front, but makes your code much more robust, and is actually easier to write and maintain in the long term.

How defensive should we be?

4 Answers4

Linked