23

Back in the 2000s a colleague of mine told me that it is an anti-pattern to make public methods virtual or abstract.

For example, he considered a class like this to be poorly designed:

public abstract class PublicAbstractOrVirtual
{
  public abstract void Method1(string argument);

  public virtual void Method2(string argument)
  {
    if (argument == null) throw new ArgumentNullException(nameof(argument));
    // default implementation
  }
}

He stated that

  • the developer of a derived class that implements Method1 and overrides Method2 has to repeat the argument validation.
  • if the developer of the base class later decides to add something around the customizable part of Method1 or Method2, he cannot do so.

Instead my colleague proposed this approach:

public abstract class ProtectedAbstractOrVirtual
{
  public void Method1(string argument)
  {
    if (argument == null) throw new ArgumentNullException(nameof(argument));
    this.Method1Core(argument);
  }

  public void Method2(string argument)
  {
    if (argument == null) throw new ArgumentNullException(nameof(argument));
    this.Method2Core(argument);
  }

  protected abstract void Method1Core(string argument);

  protected virtual void Method2Core(string argument)
  {
    // default implementation
  }
}

He told me making public methods (or properties) virtual or abstract is as bad as making fields public. By wrapping fields into properties one can intercept any access to those fields later, if needed. The same applies to public virtual/abstract members: wrapping them as shown in the ProtectedAbstractOrVirtual class allows the base class developer to intercept any calls that go to the virtual/abstract methods.
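To illustrate the analogy with a quick sketch of my own (the NameHolder type and its members are made up, not part of my colleague's example): wrapping the field in a property means any later need to validate or log access can be handled without touching the callers.

public class NameHolder
{
  private string _name = string.Empty;  // backing field stays private

  public string Name
  {
    get => _name;
    set
    {
      // interception point that can be added later without breaking callers,
      // e.g. validation or logging
      if (value == null) throw new ArgumentNullException(nameof(value));
      _name = value;
    }
  }
}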

But I don't see this as a design guideline. Even Microsoft doesn't follow it: just have a look at the Stream class to verify this.

What do you think of that guideline? Does it make any sense, or do you think it's overcomplicating the API?

Peter Perot
    Making methods `virtual` *allows optional overriding.* Your method should probably be public, because it might not be overridden. Making methods `abstract` *forces you to override them;* they should probably be `protected`, because they're not particularly useful in a `public` context. – Robert Harvey Sep 04 '19 at 14:50
  • 4
    Actually, `protected` is most useful when you want to expose private members of the abstract class to derived classes. In any case, I'm not particularly concerned about your friend's opinion; choose the access modifier that makes the most sense for your particular situation. – Robert Harvey Sep 04 '19 at 14:53
  • 4
    Your colleague was advocating for the [Template Method Pattern](https://en.wikipedia.org/wiki/Template_method_pattern). There can be use cases for both ways, depending on how interdependent the two methods are. – Greg Burghardt Sep 04 '19 at 15:42
  • 8
    @GregBurghardt: sounds like the OP's colleague suggests *always* using the template method pattern, regardless of whether it's required or not. That's a typical overuse of patterns - if one has a hammer, sooner or later every problem starts to look like a nail ;-) – Doc Brown Sep 04 '19 at 16:16
  • @DocBrown: True, but on the other hand, the TMP is one of the less ridiculous patterns in the GoF book (ignoring those that are now language features, such as Iterator). – Kevin Sep 05 '19 at 06:07
  • 1
    It sounds like your friend doesn't like `base.Method2(argument);` for some reason, which is what any override should be doing anyway unless they have a very good reason not to. Mind, keeping tight control of a class' public interface is a good idea, but beyond that, I don't see any difference between the two samples you've provided. – Luaan Sep 05 '19 at 07:57
  • 1
    @Luaan ad `base.Method2(argument)`: It depends. When the API docs say you are allowed to completely replace the functionality of the base class method, you can/should omit the base call. – Peter Perot Sep 05 '19 at 10:02
  • 2
    @DocBrown ad _overuse of patterns_: Do you remember those days when C# did not have auto-properties? We had to use backing fields for every single property and write those verbose getter-setter wrappers. When we needed simple DTOs, just plain property containers with no logic at all, everyone told us: _do not expose fields publicly! Wrap them into properties!_ And then some folks argued: _C'mon, it's simple DTOs, we will never intercept access of those public fields._ Again, the property advocates came around and said: _Now you say you won't intercept them. Maybe next year you need to do so._ – Peter Perot Sep 05 '19 at 10:14
  • 3
    @PeterPerot: I never had a problem starting with simple DTOs with just public fields, and when such a DTO turned out to require members with business logic, refactoring them into classes with properties. Surely things are different when one works as a library vendor and has to take care not to change public APIs; then even turning a public field into a public property of the same name can cause issues. – Doc Brown Sep 05 '19 at 10:20
  • ... cont'd: What I want to say: I think my colleague had just the same strictness as the advocates who say you should never ever expose fields publicly. He wanted to keep a backdoor open in case one needs to intercept the call to the virtual method later - even if it's just for logging purposes. But the answers here show me that there is a case for not leaving that back door open: when you want to allow the derived class to completely replace the original functionality, so that any interception is skipped. – Peter Perot Sep 05 '19 at 10:20
  • 3
    @PeterPerot: yep, and when it turns out you require the "backdoor" later, it is often better to refactor then, when it's time, instead of polluting the code in advance with too many "just in case" constructs. – Doc Brown Sep 05 '19 at 10:24
  • 1
    @PeterPerot Good points on "leaving backdoors open", I'd like to add that when I see a property in an interface from a third party it takes more thought to deal with it; I may cache the value in performance-critical scenarios, or if I'm worried the value will change unexpectedly. In some cases properties absolutely make sense (especially for libraries). IMO, if you're just handling a bundle of data (e.g., about the result of something or the position of something), please use (readonly) fields, I don't want to have to worry about you re-running a very long operation to give me a *new* result. – jrh Sep 05 '19 at 14:33
  • @RobertHarvey FWIW, one can consider an interface to be a purely abstract class with every member being public. In that context, having a public abstract method *does* make for a completely valid use case. –  Sep 05 '19 at 15:22
  • 1
    I hated the "always make getters and setters" convention too, but I wasn't authoring popular third-party libraries, or even working in a large organization. Context really matters here. Writing future-proof code is very hard and messy, and I don't think everyone should be trying to do it. – Aleksandr Dubinsky Sep 05 '19 at 18:11
  • 1
    Your colleague's point seems to be founded in his personal approach of "one way to rule them all" as opposed to "sometimes A, sometimes B, see what's appropriate". This isn't meant negatively, but your colleague seems to oversimplify the situation to a single panacea instead of analyzing when public virtuals are and aren't problematic. He's getting into the territory of dogmatic programming. – Flater Mar 01 '21 at 12:01

4 Answers

31

Saying

that it is an anti-pattern to make public methods virtual or abstract because the developer of a derived class that implements Method1 and overrides Method2 has to repeat the argument validation

is mixing up cause and effect. It assumes that every overridable method requires non-customizable argument validation. But it is quite the other way round:

If one wants to design a method in a way that it provides some fixed argument validation in all derivations of the class (or - more generally - a customizable and a non-customizable part), then it makes sense to make the entry point non-virtual and instead provide a virtual or abstract method for the customizable part, which is called internally.

But there are lots of examples where it makes perfect sense to have a public virtual method, since there is no fixed non-customizable part: look at standard methods like ToString or Equals or GetHashCode - would it improve the design of the object class if these were not public and virtual at the same time? I don't think so.
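As a minimal sketch (the Point type is just an invented example): the overrides replace the base behavior entirely, so there is no fixed part that a non-virtual wrapper around ToString or Equals could usefully guard.

public class Point
{
  public int X { get; }
  public int Y { get; }

  public Point(int x, int y) => (X, Y) = (x, y);

  // object declares these as public virtual; overriding them directly
  // simply replaces the default behavior - there is nothing to wrap.
  public override string ToString() => $"({X}, {Y})";

  public override bool Equals(object obj) =>
    obj is Point other && other.X == X && other.Y == Y;

  public override int GetHashCode() => (X, Y).GetHashCode();
}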

Or, in terms of your own code: when the code in the base class finally and intentionally looks like this

public void Method1(string argument)
{
  // nothing to validate here, all strings including null allowed
  this.Method1Core(argument);
}

having this separation between Method1 and Method1Core only complicates things for no apparent reason.

Doc Brown
  • 1
    In the case of the `ToString()` method, Microsoft would have done better to make it non-virtual and introduce a virtual template method `ToStringCore()`. Why? Because of this: [`ToString()` - note to inheritors](https://docs.microsoft.com/en-us/dotnet/api/system.object.tostring#notes-to-inheritors). They state that `ToString()` should not return null. They could have coerced this demand by implementing `ToString() => ToStringCore() ?? string.Empty`. – Peter Perot Sep 04 '19 at 14:33
  • 9
    @PeterPerot: that guideline you linked to also recommends not to return `string.Empty`, did you notice? And it recommends a lot of other things which cannot be enforced in code by introducing something like a `ToStringCore` method. So this technique is probably not the right tool for `ToString`. – Doc Brown Sep 04 '19 at 14:41
  • You are right. But I saw a lot of code like `anyObjectIGot.ToString()` instead of `anyObjectIGot?.ToString()`, so at least ensuring a null cannot bubble out of `ToString()` could have been helpful in preventing `NullReferenceException`s. And of course, this could have another drawback known as _exception hiding_: namely in cases when returning null from `ToString()` is really considered a bug. So I think I understand your point: not to coerce something that cannot be coerced with the means of the programming language. – Peter Perot Sep 04 '19 at 14:53
  • @PeterPerot I have been thinking it would have been better to have an `IStringer` that can be implemented differently for different locales, and an `IHasToString` that has the `ToString` method, and a default `Stringer` implementation of `IStringer` that uses it. Basically the pattern of `IEqualityComparer`. Then the default implementation can output the type name for types that do not implement `IHasToString`, and that way `object` would not need `ToString` at all. In fact `Equals` does not need to be there either. `GetHashCode`? Not sure. Doc Brown, do you have examples from a different type? – Theraot Sep 04 '19 at 15:10
  • 3
    @Theraot: surely one can find some reasons or arguments why ToString and Equals or GetHashCode could have been designed differently, but today they are as they are (at least I think their design is good enough to make a good example). – Doc Brown Sep 04 '19 at 15:35
  • 1
    @PeterPerot "I saw a lot of code like `anyObjectIGot.ToString()` instead of `anyObjectIGot?.ToString()`" - how is this relevant? Your `ToStringCore()` approach would prevent a null string being returned, but it still would throw a `NullReferenceException` if the object is null. – IMil Sep 04 '19 at 23:34
  • @IMil You are right, and my example was wrong. What I actually wanted to say: I saw a lot of code like `anyObjectIGotNotBeingNull.ToString().Trim()` that would throw a `NullReferenceException` if `ToString()` was returning null. – Peter Perot Sep 05 '19 at 09:21
  • @Theraot I think Microsoft wanted to keep things simple at the base and not overload it with too many patterns. `Equals`, `GetHashCode` and friends are easy to use, even for beginners. – Peter Perot Sep 05 '19 at 10:24
  • @PeterPerot .NET 1.0 did not have generics. `IEqualityComparer` was not possible. Also C# has more kinds of equality than most languages. See [Some objects are more equal than others](https://accu.org/content/conf2011/Steve-Love-Roger-Orr-equals.pdf) (page 12). By the way, I agree with Doc Brown's argument. In particular "it makes perfect sense to have a public virtual method, since there is no fixed non-customizable part", I was arguing for other examples. In fact, after the fact, I found some `public virtual` in `SignInManager`. Examples not necessary for the argument, yet they may help. – Theraot Sep 05 '19 at 12:12
  • @Theraot Yeah, Microsoft uses _public virtual/abstract_ a lot, and if they do (as a vendor of large frameworks for a large audience) I think it's okay to **not** wrap every virtual/abstract method into the template method pattern. As _DocBrown_ stated: If there is no fixed _non-customizable_ part (and there is a high probability that there will never be one, even in later versions), then don't provide one - even not an "empty" one. – Peter Perot Sep 05 '19 at 12:22
  • 1
    @PeterPerot I was not trying to convey an argument from authority. It is not so much that Microsoft uses `public virtual`, but that there are cases in which `public virtual` is ok. We could argue for an empty *non-customizable* part as making the code future proof... But that does not work. Going back and changing it could break derived types. Thus, it earns nothing. – Theraot Sep 05 '19 at 12:34
  • @Theraot Replace _Microsoft_ with _any company producing large frameworks for a large audience_. And yeah, changing a no-op fixed part later and adding functionality to it in the base class could break derived types **if not** done carefully. In fact, I think this is a very good argument! – Peter Perot Sep 05 '19 at 12:54
6

Doing it the way your colleague suggests does provide more flexibility to the implementor of the base class. But with it also comes more complexity which is typically not justified by the presumed benefits.

Mind that the increased flexibility for the base class implementer comes at the expense of less flexibility for the overriding party. They get some imposed behavior that they may not particularly care for. To them, things have just gotten more rigid. This can be justified and helpful, but it all depends on the scenario.

The naming convention to implement this (that I know of) is to reserve the good name for the public interface and to prefix the internal method's name with "Do".

One useful case is when the action performed needs some setting up and some closing down - like opening a stream and closing it again after the overrider is done. In general, some sort of initialization and finalization. It is a valid pattern to use, but it would be pointless to mandate its use in all abstract and virtual scenarios.
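A rough sketch of that setup/teardown case (the Exporter type and the file handling details are assumptions of mine, purely for illustration):

public abstract class Exporter
{
  // Public entry point: keeps the "good name" and owns the lifecycle.
  public void Export(string path)
  {
    if (path == null) throw new ArgumentNullException(nameof(path));

    using (var stream = System.IO.File.Create(path))  // initialization
    {
      this.DoExport(stream);                           // customizable part
    }                                                  // finalization: the stream is closed here
  }

  // Overriders fill in only the middle; they never see the open/close logic.
  protected abstract void DoExport(System.IO.Stream stream);
}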

Martin Maat
  • 1
    The _Do_ method prefix is one option. Microsoft often uses the _Core_ method postfix. – Peter Perot Sep 05 '19 at 11:25
  • @Peter Perot. I never saw the Core prefix in any Microsoft material but this may be because I have not been paying much attention lately. I suspect they started doing this recently just to promote the Core moniker, to make a name for .NET Core. – Martin Maat Sep 05 '19 at 13:24
    Nope, it's old hat: [`BindingList`](https://docs.microsoft.com/en-us/dotnet/api/system.componentmodel.bindinglist-1#methods). In addition I found the recommendation somewhere, maybe their _Framework Design Guidelines_ or something similar. And: it's a **postfix**. ;-) – Peter Perot Sep 05 '19 at 13:36
  • Less flexibility for the derived class is the point. The base class is an abstraction boundary. The base class tells the consumer what its public API does, and it defines the API it requires in order to accomplish those goals. If a derived class can override a public method in the base class, you run an increased risk of violating the Liskov Substitution Principle. – Adrian McCarthy Sep 05 '19 at 20:16
3

In C++, this is called the non-virtual interface pattern (NVI). (Once upon a time, it was called the Template Method. That was confusing, but some of the older articles have that terminology.) NVI is promoted by Herb Sutter, who has written about it at least a few times. I think one of the earliest is here.

If I recall correctly, the premise is that a derived class should not change what the base class does but how it does it.

For example, a Shape might have a Move method to relocate the shape. Concrete implementations (e.g., Square and Circle) shouldn't directly override Move, because Shape defines what Moving means (at a conceptual level). A Square might have different implementation details than a Circle in terms of how the position is represented internally, so they will have to override some method to provide the Move functionality.

In simple examples, this often boils down to a public Move that just delegates all the work to a private virtual ReallyDoTheMove, so it seems like a lot of overhead for no benefit.

But this one-to-one correspondence is not a requirement. For example, you could add an Animate method to Shape's public API, and it might implement that by calling ReallyDoTheMove in a loop. You end up with two public non-virtual methods that both rely on the one private abstract method. Your Circles and Squares don't need to do any extra work, nor can they override Animate.

The base class defines the public interface used by consumers, and it defines an interface of primitive operations that it needs to implement those public methods. The derived types are responsible for providing implementations of those primitive operations.
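A rough C# rendering of that idea (the coordinate fields in Circle and the frame-based loop in Animate are guesses at the details, not part of the original example):

public abstract class Shape
{
  // Public and non-virtual: Shape defines *what* Move and Animate mean.
  public void Move(int dx, int dy) => this.ReallyDoTheMove(dx, dy);

  public void Animate(int dx, int dy, int frames)
  {
    // relies on the same primitive operation; integer rounding ignored for brevity
    for (int i = 0; i < frames; i++)
      this.ReallyDoTheMove(dx / frames, dy / frames);
  }

  // Protected rather than private, because C# (unlike C++) does not allow
  // private virtual or abstract members; derived types supply only *how* to move.
  protected abstract void ReallyDoTheMove(int dx, int dy);
}

public class Circle : Shape
{
  private int _centerX, _centerY;

  protected override void ReallyDoTheMove(int dx, int dy)
  {
    _centerX += dx;
    _centerY += dy;
  }
}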

I'm not aware of any differences between C# and C++ that would change this aspect of class design.

Adrian McCarthy
  • Good find! Now I remember that I found exactly that post your 2nd link points to (or a copy of it), back in the 2000s. I remember that I was searching for more evidence of my colleague's claim, and that I didn't find anything in the C# context, but C++. This. Is. It! :-) But, back in C# land, it seems this pattern is not used much. Maybe people realized that adding base functionality later can break derived classes, too, and that the strict use of the TMP or NVIP instead of public virtual methods does not always make sense. – Peter Perot Sep 05 '19 at 21:06
  • Note to self: good to know this pattern has a name: NVIP. – Peter Perot Sep 05 '19 at 21:09
2

tl;dr Argument validation is a run-time check for the part of the method signature that isn't checked at compile-time. Prohibiting public abstract/virtual methods is a pattern for ensuring that the exact same run-time validation is applied across a signature-group, just as the language guarantees that the exact same compile-time checks apply to that signature-group.


The rationale for not publicly exposing abstract/virtual methods is that methods should always have the same signature, both formally and informally.

For example:

class A
{
    public virtual void Print(string? text)
    {
        if (String.IsNullOrEmpty(text)) { throw new ArgumentException(); }
        Console.WriteLine(text);
    }
}
class B1 : A
{
    public override void Print(string? text)
    {
        if (String.IsNullOrEmpty(text)) { throw new ArgumentException(); }
        Console.WriteLine(text);
    }
}
class B2 : A
{
    public override void Print(string? text)
    {
        if (text == null) { throw new ArgumentNullException(); }
        if (text.Equals(String.Empty)) { throw new ArgumentException(); }
        Console.WriteLine(text);
    }
}
class B3 : A
{
    public override void Print(string? text)
    {
        Console.WriteLine(text ?? "[null]");
    }
}

In the code above, string? text can't be null or empty (""). Anyone calling .Print() should understand, by contract, that they're actually calling

public virtual void Print([non-null, non-empty]string text)

, leaving three possibilities for inheritors:

  1. Correctly duplicate the exact same code for validation.

    • Example: B1.Print() duplicates A.Print()'s validation code.
  2. Correctly reproduce the same effective-signature, but with inconsistent behavior.

    • Example: B2.Print() has the same effective-signature as A.Print(), but it has inconsistent behavior (throwing a different type of exception for a null argument).
  3. Incorrectly change the effective-signature in addition to inconsistent behavior.

    • Example: B3.Print() accepts null/empty strings.

For example, say someone's trying to debug their code:

var list = new List<A>();
list.Add(new A());
list.Add(new B1());
list.Add(new B2());
list.Add(new B3());

foreach (A item in list)
{
    item.Print(null);
}

, where they improperly call .Print(null) four times. It's the exact same error, but:

  1. They'll get 2 ArgumentExceptions (from A.Print() and B1.Print()).

  2. They'll get 1 ArgumentNullException (from B2.Print()).

  3. They'll not get any exception for the fourth call (the one on B3), which'll fail silently.

  4. They'll also get an effective Print("[null]") despite never having called Print("[null]") and having no reason to expect the method to produce this behavior from its code-contract. (Yeah, it's not too hard to figure out what happened in this simple example, but in larger code projects, tracking down stuff like this can be fun!)


Summary of problems.

In short, the problems can include: code-duplication, inconsistent debug-feedback, silent-failures, and unexpected-behavior.

Ultimately inheritors have two options:

  1. Copy/paste the validation code from the base-method in every override (whether directly or through memorizing it and retyping it). Require anyone changing any of them to copy/paste the new version everywhere else too.

  2. Have the code-base polluted with inconsistent debug-feedback, silent errors, and inconsistent behaviors.

Both options make headaches for maintenance. Option (1) requires maintainers to be sure that they copy/paste any modifications into each virtual/override in the same family while Option (2) can result in all sorts of weird problems to untangle.


Better pattern.

class A
{
    public void Print(string? text)
    {
        if (String.IsNullOrEmpty(text)) { throw new ArgumentException(); }
        this.Internal_Print(text);
    }
    protected virtual void Internal_Print(string? text)
    {
        Console.WriteLine(text);
    }
}
class B : A
{
    protected override void Internal_Print(string? text)
    {
        Console.WriteLine(text);
    }
}

This pattern prevents both the need and ability of inheritors to change the parameter-validation. Now we don't have to deal with copy/pasting validation code and the inconsistent/unexpected behaviors that can come from doing anything other than copy/pasting validation code.


Discussion: Conceptual argument for locking the implicit part of a signature.

A method-signature effectively has two parts:

  1. An explicitly specified part, with compile-time validation.

  2. An implicitly specified part, with run-time validation.

The language forces inheritors to maintain consistency with the explicit part, but the implicit part isn't fully captured by the type-system.

So, coders can either:

  1. maintain consistency by copy/pasting the same validation code everywhere;

  2. maintain consistency by requiring methods to create correct types that precisely describe the expected argument space (e.g., using non-null types when null isn't allowed; see the sketch after this list);

  3. break consistency by doing their own thing in each method;

  4. maintain consistency through a combination of (1) and (2) using the above pattern.
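For instance, option (2) might look roughly like this (a sketch; the NonEmptyText wrapper type and the Printer class are hypothetical, not part of the example above):

public readonly struct NonEmptyText
{
    public string Value { get; }

    public NonEmptyText(string value)
    {
        // The run-time check now lives in exactly one place: the type's constructor.
        if (string.IsNullOrEmpty(value)) { throw new ArgumentException("Text must be non-null and non-empty.", nameof(value)); }
        Value = value;
    }
}

abstract class Printer
{
    // The compile-time signature now carries the whole contract, so even a
    // public abstract method no longer invites inheritors to re-validate.
    public abstract void Print(NonEmptyText text);
}

Whether introducing a wrapper type like this is worth it depends on how widely the non-null/non-empty contract is reused.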


Note: Historical context of string? vs. string.

The above code uses string? for consistency with modern C#. However, before C# fixed null-reference typing ambiguities, this same code would've been written as

public virtual void Print(string text)

, because the nullable-qualifier, ?, used to be implicitly assumed on all reference-types.

So if you're confused as to why we'd make the argument nullable only to then protest nullability, that's why: this is meant to reflect a historically common pattern, but written in more recent language.

Nat