I got into a debate on this question, which boiled down to whether it is a good idea for a specialization of a class to add business rules. Unfortunately this point got trampled in the comments, so I'm asking it again as a separate question.

I believe two things:

  • An object is responsible for its internal consistency
  • A specialization/child class has more specific rules than the super class which can be seen as the general case.

The logical result of this is that a specialization might accept only some input values for its methods, or might change some values in order to stay consistent. But isn't that OK, since guarding its internal consistency is what an object should do?

A point many people made is that some code could break if it makes assumptions, for example that setting the width would not change the height of a square. But wouldn't that be bad code, since it makes assumptions about how the object does something instead of just telling it what to do and not worrying about it?

If we could not write code like that, almost all overriding would have problems. How often doesn't an override add an extra fail condition or more internal logic that might be visible through other parts of the interface? Maybe the point an old professor of mine once made is correct: "you should only ever use inheritance to overload the constructor". At the time that seemed a bit strict, but now it seems like the only way to guarantee that these kinds of problems never happen. To use the old square/rectangle analogy again:

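public class Rectangle
{
    private int width, height;

    public Rectangle(int width, int height) { this.width = width; this.height = height; }

    // The setter bodies were elided in the original; plain assignments are assumed here.
    public void SetWidth(int width) { this.width = width; }
    public void SetHeight(int height) { this.height = height; }
}

public class Square : Rectangle
{
    public Square(int diameter) : base(diameter, diameter) {}

    // Assumed body: sets both sides through the inherited setters so they stay equal.
    public void SetDiameter(int diameter) { SetWidth(diameter); SetHeight(diameter); }
}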

Note: I hope we can keep this question a little less personal than the question that inspired it. I've been on Stack Exchange for more than three years, but I was quite intimidated by the type of responses here.

Roy T.
  • "The most remarkable property of the notion of class is that it subsumes these two concepts, merging them into a single linguistic construct. A class is a module, or unit of software decomposition; but it is also a type (or, in cases involving genericity, a type pattern). Much of the power of the object-oriented method derives from this identification. **Inheritance, in particular, can only be understood fully if we look at it as providing both module extension and type specialization...**" ([OOSC §7.3](http://en.wikipedia.org/wiki/Object-Oriented_Software_Construction)) – gnat May 08 '14 at 12:49
  • It might be best if you try to understand this in terms of https://en.wikipedia.org/wiki/Liskov_substitution_principle and why that's a very useful property for a subtyping relationship to have. If your subtypes can constrain further than their supertypes, you /cannot/ treat the subtypes in a generic fashion, and you break ad hoc polymorphism. Consider my Nulltangle, which throws exceptions on every operation. I've "specialised" it to the absurd extreme, but now you can't write a method that does anything with rectangle without checking for my Nulltangle first! – Phoshi May 08 '14 at 12:53
  • That NullTangle, though a bit extreme, is actually a good point. – Roy T. May 08 '14 at 13:05
  • See also: http://programmers.stackexchange.com/questions/199331/is-there-a-specific-name-for-the-square-inherits-from-rectangle-paradox – Bart van Ingen Schenau May 08 '14 at 13:11
  • @BartvanIngenSchenau that is exactly my question. Unfortunately I don't have the rep yet to vote to close myself. But it will happen soon anyway :). Strange that in 6 years BSc and MSc level IT-courses I had never heard the Liskov substitution principle mentioned :(. – Roy T. May 08 '14 at 13:21
  • Nulltangle is intentionally the absurd extreme, but I find those often force the point :P I wouldn't worry about not having heard of the concept (or the rest of SOLID), as I tend to find that what happens in the real world and what happens in academia are quite disparate, and emphasize very different goals. – Phoshi May 08 '14 at 13:24
  • @RoyT. - You may not have heard it in university, but if you can recognize its value and use it appropriately, you were well prepared. New ideas are going to come that could make this obsolete, so you need to be able to determine this for yourself and not just crank-out some dictionary definition or blind adherence. – JeffO May 08 '14 at 15:39
  • @JeffO well at least I realized something was wrong. This page really clears it up now :). – Roy T. May 09 '14 at 08:06

4 Answers

The trap a lot of people fall into is looking at inheritance as a means to codify any relationship or similarity between two classes. That's not the case. Inheritance is useful for certain limited kinds of relationships and is actually harmful when used outside those contexts. Lack of substitutability is one reason why.

The crucial point a lot of people miss about the square-rectangle example is that it is perfectly substitutable if you reverse the relationship. In object-oriented design, a rectangle is a specialized form of a square. The reason that's hard to see is that people want to organize classes by the similarities between the classes themselves, perhaps following real-world taxonomies, when they should really be concerned with organizing classes by what methods the calling code will need to use on a mixed collection of those classes. That's where the idea of substitutability comes in.

Think of it this way. You have a bunch of code that sets the width of squares and you want to throw some rectangles into the mix. You can set the width on either a square or rectangle all day without violating substitutability, but independently setting the height only applies to the rectangle, so it should not be a part of the base class. You're adding to the specialized class, not changing the common behavior.
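One way to read the paragraph above, as a minimal sketch rather than a definitive design: the base class exposes only the operation that both shapes share, and the specialization adds a dimension without changing inherited behaviour. The property names, constructors, and the mixed list are assumptions made for illustration, and the sketch models only width and height setting, nothing else a shape might do.

```csharp
using System;
using System.Collections.Generic;

// Mixed collection: the calling code relies only on what Square promises.
var shapes = new List<Square> { new Square(2), new Rectangle(2, 5) };
foreach (var shape in shapes)
{
    shape.SetWidth(10); // valid for every element
}
Console.WriteLine(shapes[0].Width);               // 10
Console.WriteLine(((Rectangle)shapes[1]).Height); // 5, untouched by SetWidth

// Base class: only the operation common to both shapes.
public class Square
{
    public int Width { get; private set; }
    public Square(int width) { Width = width; }
    public void SetWidth(int width) { Width = width; }
}

// Specialization: adds a dimension, leaves SetWidth exactly as inherited.
public class Rectangle : Square
{
    public int Height { get; private set; }
    public Rectangle(int width, int height) : base(width) { Height = height; }
    public void SetHeight(int height) { Height = height; }
}
```

Code that needs `SetHeight` has to ask for a `Rectangle` explicitly, so it cannot be handed a plain `Square` by accident.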

In other words, don't make it a false choice and say to yourself, "I have no choice but to violate substitutability." If you can't do something without violating substitutability, then you either need to change your inheritance relationship, or not use inheritance at all.

Karl Bielefeldt
  • A properly-designed `Square` class wouldn't have a `width`, it would have a single dimension, maybe called `side_length`. Subclassing a `Rectangle` from it would mean that anything that takes a `Square` as an argument would expect to be able to ask for `side_length`. What would a (derived) `Rectangle` return for that if its `side_length` and `height` were different? Better might be to derive both from an abstract `TwoDimensionalFigure` that has a `width` and `height`. In the case of a `Square`, either would return `side_length`; a `Rectangle` would return the `width` and `height`. – Blrfl May 08 '14 at 13:58
  • It could make sense to have a `Square` derive from a `MutableRectangularShape` class *if the contract for that class specifies that `SetHeight`, `SetWidth`, and `SetDimensions` will operate on a "best-effort" basis*. A lot of user-interface designers work that way with controls that can only be set to certain sizes (e.g. integer multiples of the text-line height). In some such cases, the useful thing to do is have read-write properties for "requested size" [which should *always* be independent] and read-only properties for "actual size" [which may or may not match requests]. – supercat May 08 '14 at 15:28
  • Yes, calling context is paramount. There is no single "proper" way to define a model. That was my entire point, not to get hung up on the real-world taxonomies. The simplest model of a square's size is one-dimensional. Whether it's most beneficial to call that dimension width or side length depends on the context. That's beside the point. The physical representation of its size is two-dimensional. Including position and rotation adds three more dimensions. Adding things like color, pattern, z-index, border width, etc. can add several more dimensions. – Karl Bielefeldt May 08 '14 at 15:52
  • Substitutability can be boiled down to the idea that a derived class should only add dimensions, not change or remove them. What dimensions should be in the topmost interface depends on the calling context. – Karl Bielefeldt May 08 '14 at 15:53
  • The implied context of the square-rectangle example is not "draw a bunch of shapes on the screen." It's more like "create a collection of shapes, then find their total area." Needing to draw something on the screen changes the calling context, and therefore the desired interface and inheritance hierarchy. – Karl Bielefeldt May 08 '14 at 15:59
  • To add to Karl's point, if you aren't going to maintain substitutability, why are you using inheritance at all? You've just lost the ability to leverage polymorphism; all of the promises provided by inheritance (being able to safely use protected members, virtual functions, etc.) are gone. Anything left (e.g., having a glorified property bag) can be accomplished with composition. See also [Uses and Abuses of Inheritance](http://gotw.ca/publications/mill06.htm) - It does a decent job codifying exactly what inheritance gets that composition misses. – Brian May 08 '14 at 17:20

> A point many people were making is that some code could break if it would make assumptions. For example that setting the width would not change the height of a square. However wouldn't that be bad code? Since you make assumptions on how the object does something instead of just telling it what to do and not worry about it?

I'm allowed to assume the object follows its specifications. If the specification for Rectangles says that the width and height are independently modifiable, then any implementation must conform. (If you don't require conformance, it's impossible to reason about your program.) Now, you could argue that the specification for Rectangles never said that setWidth can't change the height, but if you attempt to list all the things something must not do you'll find that the list is infinite:

  • setWidth mustn't reformat my hard drive
  • setWidth mustn't delete files in my home folder
  • setWidth mustn't make changes to the Windows Registry
  • setWidth mustn't change another object's state
  • setWidth mustn't go into an infinite loop
  • setWidth mustn't post to Twitter on my behalf
  • ...

The only sensible way to specify something is to list the things it must and may do and assume anything not listed is forbidden. So if the spec for setWidth says it changes the rectangle's width, I assume it doesn't change the height.

> How often doesn't overloading add an extra fail condition...

Doing this will definitely bring you pain and misery. Any program written according to the specifications is assuming a certain operation can only fail because of A, B, and C. If you introduce a new fail condition D, no one can possibly handle it.
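A rough sketch of that failure mode, under stated assumptions: suppose the specification for `SetWidth` documents exactly one failure, an `ArgumentOutOfRangeException` for negative widths. The hypothetical `EvenOnlyRectangle` subclass and the `TryResize` helper below are invented for illustration and do not come from the question.

```csharp
using System;

// Caller written against the assumed spec: a negative width is the only documented failure.
static string TryResize(Rectangle r, int width)
{
    try
    {
        r.SetWidth(width);
        return "ok";
    }
    catch (ArgumentOutOfRangeException)
    {
        return "rejected";
    }
}

Console.WriteLine(TryResize(new Rectangle(), 3));  // ok
Console.WriteLine(TryResize(new Rectangle(), -1)); // rejected
try
{
    TryResize(new EvenOnlyRectangle(), 3);
}
catch (InvalidOperationException)
{
    // The new fail condition escaped the caller that followed the spec.
    Console.WriteLine("unhandled by TryResize");
}

public class Rectangle
{
    public int Width { get; protected set; }

    public virtual void SetWidth(int width)
    {
        if (width < 0) throw new ArgumentOutOfRangeException(nameof(width));
        Width = width;
    }
}

// Hypothetical specialization that adds a fail condition the base spec never mentioned.
public class EvenOnlyRectangle : Rectangle
{
    public override void SetWidth(int width)
    {
        if (width % 2 != 0) throw new InvalidOperationException("width must be even");
        base.SetWidth(width);
    }
}
```

`TryResize` is correct with respect to the base class, yet it cannot do anything sensible with the subclass's extra exception because nothing told it that exception could occur.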

A word to the wise, though - if you need this kind of substitutability, inheritance is probably not what you want. You start with some type Foo and then you realize you want a ShinyFoo. Later you want a TransparentFoo. Eventually you'll want a ShinyTransparentFoo and then you'll be in trouble. You don't run into this sort of problem if you use an interface and rely on composition to reuse behavior.
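The interface-plus-composition alternative could look roughly like this. It is a sketch only: `IFoo`, its single `Describe` method, and the wrapper classes are hypothetical names chosen to mirror the `ShinyFoo`/`TransparentFoo` example, not an API taken from the answer.

```csharp
using System;

// Any combination is built by wrapping; no ShinyTransparentFoo class is needed.
IFoo foo = new Shiny(new Transparent(new PlainFoo()));
Console.WriteLine(foo.Describe()); // shiny transparent foo

public interface IFoo
{
    string Describe();
}

public class PlainFoo : IFoo
{
    public string Describe() => "foo";
}

// Each wrapper implements the interface and delegates to the object it holds.
public class Shiny : IFoo
{
    private readonly IFoo inner;
    public Shiny(IFoo inner) { this.inner = inner; }
    public string Describe() => "shiny " + inner.Describe();
}

public class Transparent : IFoo
{
    private readonly IFoo inner;
    public Transparent(IFoo inner) { this.inner = inner; }
    public string Describe() => "transparent " + inner.Describe();
}
```

With two behaviours the saving is small, but each new behaviour costs one wrapper class here, whereas an inheritance tree would need a class for every combination.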

Doval
  • If the contract specifies that the width and height may be set independently, then legitimate subclasses must obey. If the contract includes a `HeightAndWidthRelationship` property and specifies that classes for which that returns `Independent` must allow the height to be set without changing width and vice versa, classes for which that returns `LockedRatio` must have changes to `Height` affect width and vice versa, `LockedHeight` should allow width to be changed but not height, etc. then legitimate subclasses should conform to that. – supercat May 08 '14 at 15:12
  • While some people might frown at such "loosey-goosey" contracts, they can be useful in cases where one may have a heterogeneous collection of objects with different abilities and wish to allow objects with certain abilities to use them even when other objects in the collection cannot. It may be easier, for example, to have a drawing program use a single collection for all objects regardless of their resizing modes, and have the UI code adjust its behavior slightly as needed, than to have separate classes for `FreelyResizableShape`, `FixedAspectRatioShape`, etc. – supercat May 08 '14 at 15:18
  • @supercat Re: loosey-goosey contracts, I won't say "never do it", but "it's easier..." is very often the start of a slippery-slope towards hard-to-find bugs, and ends up not being all that easy in the long run. Since you're going to have to deal with their idiosyncrasies either way, you could use an algebraic data type/tagged union to shove different kinds of objects into the same collection and still have a type-safe way of telling what's what. – Doval May 08 '14 at 15:27
  • That's nice in theory, but I don't know of any language or framework whose type system is sufficiently expressive to make it practical. Even if there were only four abilities which objects could independently possess or lack, .NET or Java would require the definition of fifteen interfaces to cope with that. In practice, the number of abilities is greater than four, and thus the required number of interfaces would be vastly higher. Worse, ... – supercat May 08 '14 at 15:36
  • ...there are many situations where the abilities of an aggregate will depend upon the abilities of its members, in ways that the type system cannot capture. For example, an object returned from `IEnumerable.Append` should be addressable by index if and only if both source collections are thus addressable *and* the first collection promises that its length will never change. Can you think of any reasonable way a type system could support that? – supercat May 08 '14 at 15:44
  • @supercat Ah, I was thinking more along the lines of "there are N disjoint types of widgets" not "there are N types of capabilities that can be permuted in N! ways for any given object." An ADT would be pointless there, I'd just query their individual capabilities with `instanceof`. But I'd make it a point that their superinterface only exposes things that they're all guaranteed to be able to support. – Doval May 08 '14 at 15:50
  • @supercat Re: `IEnumerable.Append`, seems to me that whether a collection is resizable or not should be reflected in the interface types, along with whether they're randomly addressable. Couldn't you have an `IRandAccessFixedSize` that extends both `IRandAccess` and `IFixedSize` and put that particular type of `Append` there? – Doval May 08 '14 at 15:54
  • First of all, a type-based approach would require that something like `IEnumerable.Append` identify, before creating the new object, every type of operation it should support. This work would be necessary even if none of the optional methods actually get used. Secondly, there are some cases where an object might gain abilities during its lifetime. For example, an enumerable attached to a stream might not initially be able to guarantee that its contents could be read without blocking, nor that the quantity of data was bounded, but might gain that ability once the writer is closed. – supercat May 08 '14 at 17:03
  • IMHO, the only "problem" with having an interface like `IEnumerable` include a wide range of methods and properties is that there is at present no mechanism via which interfaces can specify default implementations for their members. Having `IEnumerable` include things like a `Count` method whose default implementation would call `GetEnumerator` and then call `MoveNext` until it returned `false`, and a `Capabilities` property to indicate among other things whether `Count` would be fast, slow, or unusable would have been much cleaner than requiring client code to use try-casts. – supercat May 08 '14 at 17:08
  • @supercat I don't see the problem; there'd be different implementations of `IEnumerable.Append` depending on the particular interfaces the object supports. An object implementing `IRandomAccessFixedSizeCollection` implements a different `Append` than one implementing `IRandomAccessVariableSizeCollection`, and the return types would reflect those differences. Regarding your stream example, have `close()` return a new collection with the right interface, backed by the same data structure the stream was. Alternatively, decorators. – Doval May 08 '14 at 17:12
  • Method overloads on generic types are evaluated purely based upon constraints. Thus, if one wanted to e.g. write a `Repeat` method which would take an `IEnumerable` and an integer `n` and produce a composite with `n` copies strung back to back, either the `IEnumerable.Append` method or the `Repeat` method would have to examine the run-time types of the collections being appended, or else a separate `Repeat` method would have to be written for every combination of abilities. Further, to use another example of runtime ability enhancement, if a type is expensive to enumerate... – supercat May 08 '14 at 17:54
  • ...and doesn't initially know how many items it will be able to produce, it may be useful for it to hold a cached-count value, initially set to -1, and have the first enumerator that reaches the end indicate how many items were read. The first call to `ToList` on such a collection should use a dynamically-expanding list, but later calls should size the list directly to the correct value. How could that be handled better than by having the collection's abilities change at runtime? – supercat May 08 '14 at 17:56
  • @supercat Yes, you'll write a separate repeat for each combination of abilities, but you're already doing that anyways since the current `Append` does different things at different times. All you're doing is breaking out the separate implementations into their own functions. And I don't see the problem with the caching thing either - just weaken the contract ever so slightly so that you don't require `O(1)` runtime for size in all cases, just `O(1)` in the amortized case. Most data structures of any complexity provide amortized performance to begin with. – Doval May 08 '14 at 18:45
  • If there were a single interface with properties to describe what things an implementation could do well, poorly, or not at all, there would only need to be one `Repeat` method, which would call one `Append` method. That in turn could simply be a call to the constructor of a single `CompositeEnumerable` class which, if asked things like "are you quickly addressable by index", could ask the passed-in enumerables the appropriate questions (performance may be improved by having `Append` process a few cases specially, but semantically it wouldn't be required). – supercat May 08 '14 at 19:21
  • @supercat I'm not suggesting a single interface with a bunch of methods like `hasRandomAccess`, because that wouldn't give you the kind of static guarantees that a type system is nice for. Rather, I'm saying that if a collection supports random access, it should have an interface `IRandomAccess` with method `Get(int position)`, and if it doesn't, then it doesn't implement `IRandomAccess`. You wouldn't ask questions at run time, you'd assert that the object *must* have those capabilities at compile time. Since there's different interfaces, you'd get different versions of `Append` with... – Doval May 08 '14 at 19:31
  • @supercat ...slightly different signatures. The `Append` in `IRandomAccessFixedSize` would take an `IRandomAccess` and return an `IRandomAccess`, whereas collections that don't have random access or fixed size would implement a slightly different `Append` that doesn't require the input to be an `IRandomAccess`. That way you'd know statically what kind you're going to get. – Doval May 08 '14 at 19:33
  • Suppose you want to write a method to retrieve the 10,000,001st item from a collection formed by concatenating some collections retrieved from a dictionary containing collections of various types, some randomly accessible and some not. If `Append` or the aggregate object it produces can query the subcollections at runtime, the operation may be fast. If it can't, the operation will be slow. It may be nice to have an overload of `Append` whose return type will statically guarantee that if the arguments promise high-speed access, its return will do so as well, but... – supercat May 08 '14 at 20:27
  • ...that doesn't eliminate the need to *also* have a means by which `Append` can take things which can't promise high-speed access at compile-time but *might* be able to promise it at run-time, and construct from those a composite object which will be able to offer high-speed access when its constituents do. – supercat May 08 '14 at 20:31
  • @supercat Nothing stops you from having the composite collection implement an alternate `Append` that does precisely what you said. The composite would not implement `IRandomAccess`, but not implementing that doesn't need to mean it can't sometimes be fast, just that it's not guaranteed. You could still assert some collections support random access at compile time and you were going to have to keep track of the contents of the aggregate either way. That aside, that seems like a very esoteric use case. – Doval May 08 '14 at 20:40
  • If the composite doesn't implement `IRandomAccess`, by what means could a client ask for the 10,000,001st item without enumerating the first 10,000,000? I'm not sure the use case is esoteric; how often do programmers call `IEnumerable.ToList()` because they really need a `List`, and how often do they call it because they need to fulfill a need which the original data source might have been able to fill, had there been a means of asking it? – supercat May 08 '14 at 22:17
  • @supercat There's no reason you can't have a separate interface with its own `Get(int position)` element with the specification that such calls are not guaranteed to have a run time smaller than `O(n)` (but are also not guaranteed to be that slow, either). The usual implementation would generally be to simply iterate over the first `n` elements, but you could implement your smart logic to check which subcollection it's stored in and retrieve it faster. – Doval May 08 '14 at 22:36
  • I think that returns to my point, which is that if interfaces could include default member implementations, having `IEnumerable` include things like an get-by-index property, methods for `Count()`, `Snapshot()`, `CopyRangeToArray()`, and `GetEnumeratorStartingAt(ref int)`, and properties to say which features would be useful on a given instance, etc. would have allowed much more efficient code than would having separate interfaces for every capability a thing may possess or lack. – supercat May 08 '14 at 22:57
  • @supercat I think more so than default implementations, what languages really need is a concise way to delegate to objects - some sort of syntax you could use to say "implement IFoo by calling the corresponding methods in this IFoo object, except for the ones I implement." That'd solve both the default implementation problem and would make reusing behavior as easy as implementation inheritance without all the pitfalls that come with it. D implements something like that with its [alias this](http://dlang.org/class.html#AliasThis) syntax. – Doval May 09 '14 at 02:02
  • Both things would be very helpful. Default implementations would allow interface designers to fix omissions rather than requiring that everyone live with them. The `alias this` in D sounds like something I've wished for, though I've thought of it in terms of "eager type conversions" [as opposed to merely "implicit"]; there's a bit of a difference between that and full auto-delegation, though. If `George` is a member of class `Foo` which auto-delegates `IBar` to field `Larry`, then `IBar Joe = someFoo` should store a reference to `George`, not a reference to `George.Larry`. – supercat May 09 '14 at 16:57
  • **Please refrain from extended discussions in the comments section. Comments are to be used for clarification or improving upon the question or answer. If you would like to continue the discussion then please use our chat feature. Thank you.** – maple_shaft May 10 '14 at 12:32

Take the classic example for inheritance - Dog : Animal where Dog is a class that inherits from Animal. While Animal can eat, a dog can bark, jump, and swim. So you add these methods to Dog.

It makes sense from a relationship point of view. Why should Animal be able to bark? Why shouldn't Dog be able to bark? In fact, nobody is claiming otherwise. You see this type of relation often in code; however, whenever you need to use bark, you also need to know that you have a Dog. Therefore any usage of bark automatically adds a direct dependency on Dog any way you slice it. Even if you take an Animal, check whether it is a Dog, and then act accordingly, you're no longer handling the general case of Animal. There's nothing stopping you from doing this, though you've lost any advantage you had with inheritance.

In order to truly take advantage of inheritance, you must let inherited classes represent various implementations of the super class. In this way, it isn't enough that a class "is a type of" another class. It must also play the part of the super class with minor exceptions such as during its creation which is the only point in your program that should know the specific implementation being used.

So perhaps a better example wouldn't be Dog : Animal but rather Square : Drawable, where Drawable is an object that can be called to draw itself regardless of how.
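As a hedged illustration of that idea: here `Drawable` is modelled as an interface named `IDrawable` with a `Draw` method that returns a description, and `Circle` is a second, made-up implementation. All of these choices are assumptions made for the sketch.

```csharp
using System;
using System.Collections.Generic;

// Only the construction site names concrete types.
var shapes = new List<IDrawable> { new Square(3), new Circle(2) };

// Everything after this point handles the general case.
foreach (var shape in shapes)
{
    Console.WriteLine(shape.Draw());
}

public interface IDrawable
{
    string Draw();
}

public class Square : IDrawable
{
    private readonly int side;
    public Square(int side) { this.side = side; }
    public string Draw() => $"square with side {side}";
}

// Hypothetical second implementation, added only to make the collection mixed.
public class Circle : IDrawable
{
    private readonly int radius;
    public Circle(int radius) { this.radius = radius; }
    public string Draw() => $"circle with radius {radius}";
}
```

The loop never checks what kind of shape it holds, which is the advantage the `Dog`/`bark` arrangement gives up.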

Neil

> The logical result of this is that a specialization might only accept some values of input for its method or might change some values in order to stay consistent. But isn't that OK, since guarding its internal consistency is what an object should do?

The issue isn't really what the object does internally to implement the message it is asked to perform (e.g. setWidth). The issue is what the caller understands the end result of the message to be.

Both a setWidth method on a rectangle that only sets width, and a setWidth method on a square that sets both width and height, are perfectly valid so long as it is clear to anyone making the calls what these methods do.

The problem is when you say a square is a rectangle, and then look at the setWidth method of a rectangle. That method puts the object into a particular state. It is understood by all who call the method that it will update the width and only the width.

When you say a square is a rectangle you are saying to anyone working with the square that everything you knew about rectangles still holds when working with squares. And one of the things that the programmer knew about rectangles was that setWidth updated only the width. That is known behaviour.

So the programmer knows what this method does, she knows that square is a rectangle, so she knows that setWidth will only update width.

Except it doesn't only update the width; it now updates the height as well. The program has just lied to the programmer: it has broken its contract.

This example might seem trivial, but it becomes much more important when you factor in polymorphism.

You might not know what shape you have; you might not know whether you have a rectangle or a square. If you know that a square is a rectangle, then you know that whatever object you have will behave the way you expect a rectangle to behave when you apply rectangle behaviour to it.

But of course if it doesn't then you have a serious problem because you now cannot trust your objects to behave the way you expect them to, and thus you cannot trust your code to do what you expect it to do.
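The broken expectation can be made concrete with a small sketch. It assumes virtual setters on `Rectangle` and a `Square` that overrides both to keep its sides equal; the `AreaAfterWidening` helper is a hypothetical caller written purely against the rectangle's contract.

```csharp
using System;

// Written against Rectangle's contract: SetWidth changes the width and nothing else.
static int AreaAfterWidening(Rectangle r)
{
    r.SetHeight(2);
    r.SetWidth(5);
    return r.Width * r.Height; // the caller expects 5 * 2
}

Console.WriteLine(AreaAfterWidening(new Rectangle())); // 10
Console.WriteLine(AreaAfterWidening(new Square()));    // 25

public class Rectangle
{
    public int Width { get; protected set; }
    public int Height { get; protected set; }
    public virtual void SetWidth(int width) { Width = width; }
    public virtual void SetHeight(int height) { Height = height; }
}

// Guards its own invariant (equal sides) at the cost of the inherited contract.
public class Square : Rectangle
{
    public override void SetWidth(int width) { Width = width; Height = width; }
    public override void SetHeight(int height) { Width = height; Height = height; }
}
```

Neither class is wrong in isolation. The surprise only appears because `Square` is passed where a `Rectangle` was promised.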

Instead, what you need to do is break the connection: tell the programmer that a square is NOT a rectangle, that they need to know whether they have a square or not, and that they need to understand what behaviour a square specifically has, because it is not the same as a rectangle's.

Cormac Mulhall