70

So I don't know if this is good or bad code design so I thought I better ask.

I frequently create methods that do data processing involving classes and I often do a lot of checks in the methods to make sure I don't get null references or other errors before hand.

For a very basic example:

// fields and properties
private Entity _someEntity;
public Entity SomeEntity => _someEntity;

public void AssignEntity(Entity entity){
    _someEntity = entity;
}
public void SetName(string name)
{
    if (_someEntity == null) return; //check to avoid null ref
    _someEntity.Name = name;
    label.SetText(_someEntity.Name);
}

So as you can see im checking for null every time. But should the method not have this check?

For example should external code cleanse the data before hand so the methods don't need to validate like below:

 if(entity != null) // this makes the null checks redundant in the methods
 {
     Manager.AssignEntity(entity);
     Manager.SetName("Test");
 }

In summary, should methods be "data validating" and then do their processing on the data, or should it be guaranteed before you call the method, and if you fail to validate prior to calling the method it should throw an error (or catch the error)?

Toon Krijthe
  • 2,634
  • 1
  • 21
  • 30
WDUK
  • 2,032
  • 3
  • 13
  • 23
  • 7
    What happens when someone fails to do the null check and SetName throws an exception anyway? – Robert Harvey Jul 02 '18 at 20:58
  • 5
    What happens if I actually _want_ someEntity to be Null? You decide what you want, either someEntity can be Null, or it can't, and then you write your code accordingly. If you want it never to be Null, you can replace checks with asserts. – gnasher729 Jul 02 '18 at 21:04
  • Note that some languages do have the option to make this parameter contract explicit, e.g. Scala’s Option. While you could still pass a null Option, that would clearly be a mistake of the caller. But an empty Option would be fine. – Sebastiaan van den Broek Jul 03 '18 at 03:37
  • Why is it null in the first place? Is it supposed to be null, or is it not supposed to be null? If someone accidentally passes null, what should happen and why? – user253751 Jul 03 '18 at 05:18
  • 5
    You can watch [Gary Bernhardt's Boundaries](https://www.destroyallsoftware.com/talks/boundaries) talk about making a clear division between sanitizing data (ie, checking for nulls, etc) in an outer shell, and having an inner core system that doesn't have to care about that stuff - and just do its work. – mgarciaisaia Jul 03 '18 at 12:12
  • 2
    With C# 8 you would potentially never need to worry about null reference exceptions with the right settings turned on: https://blogs.msdn.microsoft.com/dotnet/2017/11/15/nullable-reference-types-in-csharp/ – JᴀʏMᴇᴇ Jul 03 '18 at 12:40
  • The answer depends on a few factors. The important question here is, "Is there a valid reason for `_someEntity` to be `null`" (or if you prefer, is a `null` `_someEntity` field a valid state for the object to be in?). My suggestions: If there's no reason for the class to hold a null value for this field: If you can, pass in the entity in the constructor and throw an exception there, if you can't provide it in the constructor, throw an exception in `AssignEntity`... – jrh Jul 03 '18 at 14:16
  • ... if there *is* a valid reason to have a null entity, then the `null` check is reasonable. Now the design decision is in `SetName`, as in, is it **safe** for it to ignore the entity being null and just return? (e.g., because the entity is optional) Or is it better for it to throw an `InvalidOperationException` because you tried to rename something that doesn't exist? This particular design decision depends on whether or not the caller of `SetName` knows / cares the entity exists, or if it's okay / acceptable for it to say "well I tried to rename it" and give up... – jrh Jul 03 '18 at 14:18
  • ... Or the possible unusual third case here, let's say you went with the route where you don't allow `_someEntity` to be not `null` (e.g., in the constructor / using `readonly`), hypothetically if you are doing Fox Moulder programming and you just can't afford any risk at all, and you feel like it's in your best interest to have a recovery option for absolutely everything (even against a future developer not understanding that `_someEntity` can't be `null`), you could do something to indicate that this is an unexpected condition, then perform whatever recovery you set to get you out of it. – jrh Jul 03 '18 at 14:23
  • (the last kind of "Fox Moulder" coding can be costly, I usually reserve it for poorly documented 3rd party libraries, old, difficult to use code with a giant amount of state that can't be unit tested / refactored yet, code that's executed periodically or rapidly, or code for a situation where *some kind of response* is crucial and an invalid state would be better off at least sending an error message to the thing requesting the information than just crashing and closing the socket) – jrh Jul 03 '18 at 14:35
  • 4
    This is one of the reasons why the C# language team wants to introduce nullable reference types (and non nullable ones) -> https://github.com/dotnet/csharplang/issues/36 – Mafii Jul 04 '18 at 07:00
  • The runtime engine checks for null values when they are being used, but you probably don't want that. – Thorbjørn Ravn Andersen Jul 04 '18 at 14:29
  • The question seems specific to C#, and certainly the 'right' way to handle null (whether and which exception to raise, etc.) will be language-dependent. – smci Jul 04 '18 at 19:14

7 Answers7

168

The problem with your basic example isn't the null check, it's the silent fail.

Null pointer/reference errors, more often than not, are programmer errors. Programmer errors are often best dealt with by failing immediately and loudly. You have three general ways to deal with the problem in this situation:

  1. Don't bother checking, and just let the runtime throw the nullpointer exception.
  2. Check, and throw an exception with a message better than the basic NPE message.

A third solution is more work but is much more robust:

  1. Structure your class such that it's virtually impossible for _someEntity to ever be in an invalid state. One way to do that is to get rid of AssignEntity and require it as a parameter for instantiation. Other Dependency Injection techniques can be useful as well.

In some situations, it makes sense to check all function arguments for validity before whatever work you're doing, and in others it make sense to tell the caller they are responsible for ensuring their inputs are valid, and not checking. Which end of the spectrum you're on is going to depend on your problem domain. The third option has a significant benefit in that you can, to an extent, have the compiler enforce that the caller does everything properly.

If the third option is not an option, then in my opinion it really doesn't matter as long as it doesn't silently fail. If not checking for null causes the program to instantly explode, fine, but if it instead quietly corrupts the data to cause trouble down the road, then it's best to check and deal with it right then and there.

whatsisname
  • 27,463
  • 14
  • 73
  • 93
  • Another thing to consider is your intended audience. In Java, for example, you can use annotations (similar to C# attributes) like `@NotNull` on fields, parameters, and return values. If you trust the users of your API to pay attention to compiler warnings, you can rely on the annotations and dispense with the null checks. – StackOverthrow Jul 03 '18 at 18:09
  • Dependency injection is completely unrelated to not `null` constraints. The constructor would still have to verify the input isn't `null`, as `AssignEntity` should if it is preserved as a separate method. – jpmc26 Jul 03 '18 at 22:56
  • 1
    If for some reason null values being passed in was a normal thing, and it would be fine for an assignment to fail, then I suppose it makes sense. Usually, though, you'd want to know when an assignment doesn't happen. I'd expect this behavior in `setNameOrDontImNotACop(string name)` – corsiKa Jul 04 '18 at 00:02
  • 100% disagree with the first two proposals. Your 'public' code should not explode even if/when someone feeds it garbage. Letting the runtime throw or throwing your own exception are both _terrible_ options. You don't know if `null` is actually a wrong/invalid value in the caller's world, and your code shouldn't be forcing assumptions out into the caller's world. If `null` is in fact wrong, then the caller will discover their error when _their_ code crashes. If it isn't, then the caller will just conclude (quite rightly!) that _your_ code is broken because it doesn't handle `null` properly. – aroth Jul 04 '18 at 02:10
  • 12
    @aroth: Any design of anything, software included, will come with constraints. The very act of designing is selecting where those constraints go, and it is not my responsibility nor should it be that I accept any input and be expected to do something useful with it. I do know whether `null` is wrong or invalid because in my hypothetical library I've defined the data types and I've defined what is valid and what isn't. You call that "forcing assumptions" but guess what: that's what every software library in existence does. There are times where null is a valid value, but that's not "always". – whatsisname Jul 04 '18 at 04:26
  • 5
    "that *your code* is broken because it doesn't handle `null` properly" - This is a misunderstanding of what proper handling means. For most Java programmers, it means throwing an exception for *any* invalid input. Using `@ParametersAreNonnullByDefault`, it means also for every `null`, unless the argument is marked as `@Nullable`. Doing anything else means lengthening the path *until it finally blows elsewhere* and thereby also lengthening the time wasted on debugging. Well, by ignoring problems, you may achieve that it never blows, but incorrect behavior is then pretty guaranteed. – maaartinus Jul 04 '18 at 05:17
  • @maaartinus I'm a Java programmer, and it irks me to no end when other Java programmers think that 'throwing an exception for any invalid input' is automatically the right thing to do. Doubly so when they decide that the "right" way to do that is by using checked exceptions (at a rate of one custom `Exception` type per possible thing that could go wrong). In some cases that's reasonable, in other cases it isn't. Though anyways, this is about `null` input, not _invalid_ input. They're different categories of things. – aroth Jul 04 '18 at 05:52
  • 4
    @aroth Dear Lord, I would hate to work with any code you write. Explosions are infinitely better than doing something nonsensical, which is the only other option for invalid input. Exceptions force me, the caller, decide what the right thing to do is in those cases, when the method cannot possibly guess what I intended. Checked exceptions and unnecessarily extending `Exception` are completely unrelated debates. (I agree both are troublesome.) For this specific example, `null` obviously makes no sense here if the purpose of the class is to invoke methods on the object. – jpmc26 Jul 04 '18 at 17:26
  • 3
    "your code shouldn't be forcing assumptions out into the caller's world" - It's impossible to not force assumptions out to the caller. That's why we need to make these assumptions as explicit as possible. – Diogo Kollross Jul 04 '18 at 18:12
  • 3
    @aroth *your* code is broken because its passing *my* code a null when it should be passing something that has Name as a property. Any behaviour other than exploding is some kind of backdoor bizarro-world version of dynamic typing IMHO. – mbrig Jul 04 '18 at 20:02
  • 3
    +1 for "Programmer errors are often best dealt with by failing **immediately** and **loudly**." I'm currently working on a project that has an known number of "silent fails". Every time I improve the code, my new code blows up somewhere because the old code has not stopped bad data getting through. My code blowing up means I now know there's an error and I can fix the old code too! Previously we'd just get mysterious failures that we could never explain because they were too far removed from where the bad values should have been caught. – CJ Dennis Jul 05 '18 at 04:32
  • 1
    @jpmc26: Not totally unrelated. By requiring the object at construction, you can guarantee either the variable is assigned with something useful, or throw, and cause the caller to not have an object. In either case, you have enforced there is only one place where related bugs will be found, instead of sprinkling them throughout, because if you succeed in creating the object, it's known to be in a valid state. – whatsisname Jul 05 '18 at 04:53
  • 2
    @aroth: perhaps you are thinking of 'invalid' as erroneous, such as the user providing a nonexistent file or something. Instead, we're talking about invalid as "programmer did something dumb". A null pointer exception should **never** be encountered in a well designed program. Similar would be index out of bounds, division by zero, etc. Exceptions like those are not errors a program should gracefully handle, but from situations where the programmer didn't do their job right. Stopping immediately and launching into a debugger is the only good course of action for those. – whatsisname Jul 05 '18 at 04:57
  • @whatsisname Dependency injection doesn't require values. Dependency injection just tries to find a suitable class to create an instance for a parameter from a list of registered classes. It can pass `null` in when it's unable to find such a class. It does not verify any value is not null; that has to be done in the class' code, whether you use dependency injection or not. Also, given that the OP seems to be trying to create some kind of binding mechanism, it's likely they'll have many `Entity` instances at runtime, which makes dependency injection unsuitable for the problem. – jpmc26 Jul 05 '18 at 05:46
  • 1
    @jpmc26: DI doesn't have to be complicated frameworks, object graphs, runtime configuration and whatnot. For many, requiring dependencies in the constructor is all that's needed for simple and effective DI. – whatsisname Jul 05 '18 at 15:44
  • 1
    @whatsisname I think you're confusing dependency injection and inversion of control. Inversion of control still doesn't prevent callers from passing `null` inputs. It's a design principle about moving functionality that could potentially vary from one class to another. (As a side note, it is in my opinion a poor principle, as it often leads developers to object hierarchy hell, making their code vastly more difficult to understand.) That isn't applicable here, either. There's no factoring out of functionality here. Validating arguments is not a DI or IOC principle; it's just common sense. – jpmc26 Jul 05 '18 at 17:00
55

Since _someEntity can be modified at any stage, then it makes sense to test it every time that SetName is called. After all, it could have changed since the last time that method was called. But be aware that the code in SetName isn't thread-safe, so you can perform that check in one thread, have _someEntity set to null by another and then the code will barf a NullReferenceException anyway.

One approach to this problem therefore is to get even more defensive and do one of the following:

  1. make a local copy of _someEntity and reference that copy through the method,
  2. use a lock around _someEntity to ensure it doesn't change as the method executes,
  3. use a try/catch and perform the same action on any NullReferenceException's that occur within the method as you'd perform in the initial null check (in this case, a nop action).

But what you should really do is stop and ask yourself the question: do you actually need to allow _someEntity to be overwritten? Why not set it once, via the constructor. That way, you need only ever need do that null check the once:

private readonly Entity _someEntity;

public Constructor(Entity someEntity)
{
    if (someEntity == null) throw new ArgumentNullException(nameof(someEntity));
    _someEntity = someEntity;
}

public void SetName(string name)
{
    _someEntity.Name = name;
    label.SetText(_someEntity.Name);
}

If at all possible, this is the approach I'd recommend taking in such situations. If you can't be mindful of thread safety when choosing how to handle possible nulls.

As a bonus comment, I feel your code is strange and could be improved. All of:

private Entity _someEntity;
public Entity SomeEntity => _someEntity;

public void AssignEntity(Entity entity){
    _someEntity = entity;
}

can be replaced with:

public Entity SomeEntity { get; set; }

Which has the same functionality, save for being able to set the field via the property setter, rather than a method with a different name.

David Arno
  • 38,972
  • 9
  • 88
  • 121
  • 5
    Why does this answer the question? The question addresses null checking and this is pretty much a code review. – IEatBagels Jul 03 '18 at 17:05
  • 17
    @TopinFrassi it explains that the current null checking approach is not thread safe. It then touches on other defensive programming techniques to get around that. Then it wraps up with the suggestion of using immutability to avoid the need to have that defensive programming in the first place. And as an added bonus, it offers advice on how to structure the code better too. – David Arno Jul 03 '18 at 17:09
23

But should the method not have this check?

This is your choice.

By creating a public method, you are offering the public the opportunity to call it. This always comes along with an implicit contract on how to call that method and what to expect when doing so. This contract may (or may not) include "yeah, uhhm, if you pass null as a parameter value, it will explode right in your face.". Other parts of that contract might be "oh, BTW: every time two threads execute this method at the same time, a kitten dies".

What type of contract you want to design is up to you. The important things are to

  • create an API that's useful and consistent and

  • to document it well, especially for those edge cases.

What are the likely cases how your method is being called?

Peter Mortensen
  • 1,050
  • 2
  • 12
  • 14
null
  • 3,546
  • 1
  • 11
  • 21
  • Well, thread-safety is the exception, not the rule, when there is concurrent access. Thus use of hidden / global state is documented, as well as any case concurrency is safe anyway. – Deduplicator Jul 05 '18 at 21:56
14

The other answers are good; I'd like to extend them by making some comments about accessibility modifiers. First, what to do if null is never valid:

  • public and protected methods are those where you do not control the caller. They should throw when passed null. That way you train your callers to never pass you a null, because they would like their program to not crash.

  • internal and private methods are those where you do control the caller. They should assert when passed null; they will then probably crash later, but at least you got the assertion first, and an opportunity to break into the debugger. Again, you want to train your callers to call you correctly by making it hurt when they do not.

Now, what do you do if it might be valid for a null to be passed in? I would avoid this situation with the null object pattern. That is: make a special instance of the type and require that it be used any time that an invalid object is needed. Now you are back in the pleasant world of throwing/asserting on every use of null.

For example, you don't want to be in this situation:

class Person { ... }
...
class ExpenseReport { 
  public void SubmitReport(Person approver) { 
    if (approver == null) ... do something ...

and at the call site:

// I guess this works but I don't know what it means
expenses.SubmitReport(null); 

Because (1) you're training your callers that null is a good value, and (2) it is unclear what the semantics of that value are. Rather, do this:

class ExpenseReport { 
  public static readonly Person ApproverNotRequired = new Person();
  public static readonly Person ApproverUnknown = new Person();
  ...
  public void SubmitReport(Person approver) { 
    if (approver == null) throw ...
    if (approver == ApproverNotRequired) ... do something ...
    if (approver == ApproverUnknown) ... do something ...

And now the call site looks like this:

expenses.SubmitReport(ExpenseReport.ApproverNotRequired);

and now the reader of the code knows what it means.

Eric Lippert
  • 45,799
  • 22
  • 87
  • 126
  • Better yet, don’t have a chain of `if`s for the null objects but rather use polymorphism to make them behave correctly. – Konrad Rudolph Jul 04 '18 at 10:29
  • @KonradRudolph: Indeed, that's often a good technique. I like to use that for data structures, where I make an `EmptyQueue` type that throws when dequeued, and so on. – Eric Lippert Jul 04 '18 at 17:37
  • @EricLippert - Forgive me because I'm still learning here, but suppose if the `Person` class was immutable, and it didn't contain a zero argument constructor, how would you construct your `ApproverNotRequired` or `ApproverUnknown` instances? In other words, what values would you pass the constructor? –  Jul 05 '18 at 00:54
  • 2
    I love bumping into your answers/comments across SE, they're always insightful. As soon as I saw asserting for internal/private methods I scrolled down expecting it to be an Eric Lippert answer, wasn't disappointed :) – bob esponja Jul 05 '18 at 10:46
  • @KonradRudolph any tips on how you would implement that? My first thought is by defining ApproverNotRequired as a subclass of Person and delegating the SubmitReport logic to a Person.SubmitReport method, which is overridden as needed. But I'd be worried Person would end up having many unrelated responsibilities. – bob esponja Jul 05 '18 at 11:00
  • 4
    You guys are over-thinking the specific example I gave. It was intended to get the idea of the null object pattern across quickly. It was not intended to be an example of good design in a papers-and-paychecks automation system. In a real system, making "approver" have type "person" is probably a bad idea because it does not match the business process; there might be no approver, you might need several, and so on. As I often note when answering questions in that domain, what we really want is a *policy* object. Don't get hung up on the example. – Eric Lippert Jul 05 '18 at 13:37
8

The other answers point out that your code can be cleaned up to not need a null check where you have it, however, for a general answer on what a useful null check can be for consider the following sample code:

public static class Foo {
    public static void Frob(Bar a, Bar b) {
        if (a == null) throw new ArgumentNullException(nameof(a));
        if (b == null) throw new ArgumentNullException(nameof(b));

        a.Something = b.Something;
    }
}

This gives you an advantage for maintenance. If there ever occurs an defect in your code where something passes a null object to the method you will get useful information from method as to which parameter was null. Without the checks, you will get a NullReferenceException on the line:

a.Something = b.Something;

And (given that the Something property is a value type) either a or b could be null and you will have no way of knowing which from the stack trace. With the checks, you will know exactly which object was null, which can be enormously useful when trying to unpick a defect.

I would note that the pattern in your original code:

if (x == null) return;

Can have its uses in certain circumstances, it can also (possibly more often) give you a very good way of masking underlying issues in your codebase.

Paddy
  • 2,623
  • 16
  • 17
4

No not at all, but it depends.

You only do a null reference check if you want to react on it.

If you do a null reference check and throw a checked exception, you force the user of the method to react on it and allow him to recover. The program still works as expected.

If you don't do a null reference check (or throw a unchecked exception), it might throw a unchecked NullReferenceException, which usually is not handled by the user of the method and even might shut down the application. This means that the function is obviously completely misunderstood and therefore the application is faulty.

Herr Derb
  • 409
  • 1
  • 6
  • 13
4

Something of an extension to @null's answer, but in my experience, do null checks no matter what.

Good defensive programming is the attitude that you code the current routine in isolation, on the assumption that the entire rest of the program is trying to crash it. When you're writing mission critical code where failure is not an option, this is pretty much the only way to go.

Now, how you actually deal with a null value, that depends a great deal on your use case.

If null is an acceptable value, e.g. any of the second through fifth parameters to the MSVC routine _splitpath(), then silently dealing with it is the correct behavior.

If null should not ever happen when calling the routine in question, i.e. in the OP's case the API contract requires AssignEntity() to have been called with a valid Entity then throw an exception, or otherwise fail ensuring you make a lot of noise in the process.

Never worry about the cost of a null check, they're dirt cheap if they do happen, and with modern compilers, static code analysis can and will elide them completely if the compiler determines that it's not going to happen.

dgnuff
  • 189
  • 6
  • [Or avoid it altogether](https://www.youtube.com/watch?v=z43bmaMwagI) (48 mins 29 secs: *"Never ever use or allow anywhere near your program a null pointer"*). – Peter Mortensen Jul 05 '18 at 20:30