53

Why is this OK and mostly expected:

abstract type Shape
{
   abstract number Area();
}

concrete type Triangle : Shape
{
   concrete number Area()
   {
      //...
   }
}

...while this is not OK and nobody complains:

concrete type Name : string
{
}

concrete type Index : int
{
}

concrete type Quantity : int
{
}

My motivation is maximising the use of type system for compile-time correctness verification.

PS: yes, I have read this and wrapping is a hacky work-around.

Den
  • 1
    Comments are not for extended discussion; this conversation has been [moved to chat](http://chat.stackexchange.com/rooms/43814/discussion-on-question-by-den-why-do-mainstream-strong-static-oop-languages-prev). – maple_shaft Aug 11 '16 at 16:51
  • I had a similar motivation in [this question](http://programmers.stackexchange.com/q/309081/39776), you might find it interesting. – default.kramer Aug 12 '16 at 15:06
  • I was going to add an answer confirming the "you don't want inheritance" idea, and that wrapping _is_ very powerful, including giving you whichever of implicit or explicit casting (or failures) you want, especially with JIT optimisations suggesting you'll get almost the same performance anyway, but you've [linked](http://programmers.stackexchange.com/a/281831/6402 "Suggested to be a hacky workaround") to that answer :-) I would only add, it would be nice if languages added features to reduce boilerplate code needed for forwarding properties/methods, especially if there's only a single value. – Mark Hurd Aug 17 '16 at 03:56

10 Answers

83

I assume you are thinking of languages like Java and C#?

In those languages primitives (like int) are basically a compromise for performance. They don't support all features of objects, but they are faster and have less overhead.

In order for objects to support inheritance, each instance needs to "know" at runtime which class it is an instance of. Otherwise overridden methods cannot be resolved at runtime. For objects this means instance data is stored in memory along with a pointer to the class object. If such info were also stored along with primitive values, the memory requirements would balloon. A 16-bit integer value would require its 16 bits for the value plus an additional 32 or 64 bits of memory for a pointer to its class.

Apart from the memory overhead, you would also expect to be able to override common operations on primitives like arithmetic operators. Without subtyping, operators like + can be compiled down to a simple machine code instruction. If it could be overridden, you would need to resolve methods at runtime, a much more costly operation. (You may know that C# supports operator overloading - but this is not the same. Operator overloading is resolved at compile time, so there is no default runtime penalty.)
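The distinction can be sketched in Java, which has method overloading (though not operator overloading). This is only an illustrative sketch; the names `Base`, `Derived` and `Resolution` are hypothetical and do not come from any library. Overloads are chosen by the compiler from the static type of the argument, whereas overrides are chosen at runtime from the actual class of the object.

```java
// Hypothetical types, for illustration only.
class Base {
    String name() { return "Base"; }      // overridable: resolved at runtime
}

class Derived extends Base {
    @Override
    String name() { return "Derived"; }
}

final class Resolution {
    // Two overloads: the compiler picks one from the static type of the argument.
    static String describe(Base b)    { return "describe(Base)"; }
    static String describe(Derived d) { return "describe(Derived)"; }

    static String viaOverload() {
        Base b = new Derived();   // static type Base, runtime type Derived
        return describe(b);       // overload fixed at compile time
    }

    static String viaOverride() {
        Base b = new Derived();
        return b.name();          // override selected at runtime
    }
}
```

Here `Resolution.viaOverload()` returns `"describe(Base)"` even though the object is a `Derived`, while `Resolution.viaOverride()` returns `"Derived"`.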

Strings are not primitives but they are still "special" in how they are represented in memory. For example they are "interned", which means two string literals which are equal can be optimized to the same reference. This would not be possible (or at least a lot less effective) if string instances also had to keep track of the class.
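Interning can be observed directly in Java. The sketch below is illustrative only (the class `InternDemo` is a made-up name); it relies on the language guarantee that equal string literals refer to the same pooled instance, and on `String.intern()` returning that pooled instance.

```java
final class InternDemo {
    static boolean literalsShareReference() {
        String a = "hello";
        String b = "hello";
        return a == b;                    // same pooled instance
    }

    static boolean freshCopyIsDistinct() {
        String a = "hello";
        String b = new String("hello");   // explicitly allocates a new object
        return a != b && a.equals(b);     // different reference, equal contents
    }

    static boolean internRestoresSharing() {
        String a = "hello";
        String b = new String("hello").intern();
        return a == b;                    // intern() hands back the pooled instance
    }
}
```

All three methods return `true`.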

What you describe would certainly be useful, but supporting it would require a performance overhead for every use of primitives and strings, even when they don't take advantage of inheritance.

The language Smalltalk does (I believe) allow subclassing of integers. But when Java was designed, Smalltalk was considered too slow, and the overhead of having everything be an object was considered one of the main reasons. Java sacrificed some elegance and conceptual purity to get better performance.

Tulains Córdova
JacquesB
  • I wanted it to be language-agnostic, but your answer makes sense. Doesn't explain C#'s ```string``` though. – Den Aug 10 '16 at 11:30
  • @Den - You can't inherit `string` because it's a sealed class, it's not a language restriction. – Lee Aug 10 '16 at 11:53
  • @Lee I am aware of this. Why is it sealed? I tried to search but all SO answers are quite lame. – Den Aug 10 '16 at 12:24
  • @JacquesB Smalltalk is the reason I added "static" into the question :). – Den Aug 10 '16 at 12:26
  • 13
    @Den: `string` is sealed because it is designed to behave immutably. If one could inherit from string, it would be possible to create mutable strings, which would make it really error prone. Tons of code, including the .NET framework itself, relies on strings having no side-effects. See also here, tells you the same: https://www.quora.com/Why-String-class-in-C-is-a-sealed-class – Doc Brown Aug 10 '16 at 13:42
  • 5
    @DocBrown This is also the reason `String` is marked `final` in Java as well. – Dev Aug 10 '16 at 13:50
  • 47
    "when Java was designed, Smalltalk was considered too slow […]. Java sacrificed some elegance and conceptual purity to get better performance." – Ironically, of course, Java didn't actually gain that performance until Sun bought a Smalltalk company to get access to Smalltalk VM technology because Sun's own JVM was dog-slow, and released the HotSpot JVM, a slightly modified Smalltalk VM. – Jörg W Mittag Aug 10 '16 at 14:57
  • 2
    @DocBrown Python has immutable strings (and ints) but they can be inherited. [they can't participate in multiple inheritance though] – Random832 Aug 10 '16 at 15:15
  • @DocBrown I don't see how immutable string with private state can be made mutable. Unless you add new mutable state to it. Which wouldn't make the base state mutable. It would still behave immutable when accessed as base type. – Den Aug 10 '16 at 15:48
  • 1
    @Random832: Python programmers and framework developers also have different expectations for how the strings work. .NET could have allowed inheriting from `string`, but the .NET designers concluded the benefits weren't worth the risks/costs. – whatsisname Aug 10 '16 at 16:49
  • 2
    @Den: let's assume it would be allowed to inherit from `string`. Then, to make the class work reasonably while allowing added mutable state, you need to allow overriding of the `GetHashCode` method (in C#), and include the mutable state variable in the hash code calculation. Now code which uses this method with string parameters expects GetHashCode always to deliver the same result from the same string object, once it's constructed. But that is now not true any more: you can have "derived" strings which deliver different GetHashCode results through the lifetime of the object. – Doc Brown Aug 10 '16 at 18:43
  • 1
    ... so this is a violation of LSP, or call it simply unwanted side effects. – Doc Brown Aug 10 '16 at 18:44
  • 1
    By the way: `int` is not primitive in C♯, it's an object like any other (well, a *value object* …). In fact, C♯ doesn't have primitives at all. – Jörg W Mittag Aug 10 '16 at 20:08
  • @Random832: sure, the design goals for Python differ from the design goals for C# and Java clearly. – Doc Brown Aug 10 '16 at 20:13
  • @JörgWMittag Could you elaborate with reference to [this answer](http://stackoverflow.com/a/16589255/2757035)? According to that, .NET has primitives, and C# depends upon .NET, so aren't the practical results ultimately the same? i.e. C# _does_ have primitive types, regardless of in which standard they originate. – underscore_d Aug 10 '16 at 21:47
  • 3
    @underscore_d: The answer you linked to very explicitly states that C♯ does *not* have primitive types. Sure, some platform for which there exists an implementation of C♯ may or may not have primitive types, but that does not mean that C♯ has primitive types. E.g., there is an implementation of Ruby for the CLI, and the CLI has primitive types, but that does not mean that Ruby has primitive types. The implementation may or may not choose to implement value types by mapping them to the platform's primitive types but that is a private internal implementation detail and not part of the spec. – Jörg W Mittag Aug 10 '16 at 22:07
  • 1
    I'm not sure whether that is true anymore, but there *were* implementations of C♯ for platforms other than the CLI in the past. I know of a native one and one for ECMAScript 3, in particular. – Jörg W Mittag Aug 10 '16 at 22:08
  • 10
    It's all about abstraction. We have to keep our head clear, otherwise we end up with nonsense. For example: C♯ is implemented on .NET. .NET is implemented on Windows NT. Windows NT is implemented on x86. x86 is implemented on silicon dioxide. SiO₂ is just sand. So, a `string` in C♯ is just sand? No, of course not, a `string` in C♯ is what the C♯ spec says it is. How it is implemented is irrelevant. A native implementation of C♯ would implement strings as byte arrays, an ECMAScript implementation would map them to ECMAScript `String`s, etc. – Jörg W Mittag Aug 10 '16 at 22:13
  • 3
    @JörgWMittag If, in C#, I check the "IsPrimitive" value of a type and it says "true", to me it is a primitive. They are not primitives in the Java sense, sure, but they are semantically primitives. – T. Sar Aug 11 '16 at 18:20
  • 1
    Wha? Just realised this answer has the same problems as [one I grumbled about](http://programmers.stackexchange.com/a/328119/192238) - which now makes me wonder why _that_ has a score of 0 whereas _this_ has 61. The idea that inheritance requires RTTI and consumes space is (i) false, (ii) only true _iff_ one takes it upon themselves to assume, without basis from the OP, (A) 2 specific languages & (B) inheritance in a restricted definition of 'with virtual dispatch', which other significant languages & their users do not. Otherwise, space/speed are not an argument. Besides, it's an X/Y question – underscore_d Aug 11 '16 at 23:19
  • @underscore_d: I took "mainstream strong static OOP languages" to mean Java and C# and similar languages. But if you can elaborate on what other languages you talk about, it would be interesting. – JacquesB Aug 12 '16 at 07:01
  • Hi @JacquesB, I hope you can find more elaboration than you ever wanted in my comments on the linked post ;) ...which I should probably back up, seeing as comments are ephemeral and all. Suffice it to say I interpret C++ as a "mainstream strong static OOP language", and it has all the significant differences I summarised in my previous comment here. – underscore_d Aug 12 '16 at 08:36
  • @Jörg The C# specification actually does specify implementation constraints for simple types. Which should make it all but impossible for a ECMAScript implementation (from what I know about ECMAScript that is, not my speciality) to follow the spec completely. The specification states that `int` is a signed 2s complement integer with 32 bit. This is observable since the specification also describes the memory layout in specific situations (particularly unsafe code and interop with native code). Hence you cannot implement int as an 8 byte type without violating the spec. – Voo Aug 12 '16 at 19:47
  • You may be interested in the "value types" proposal for Java 9: http://cr.openjdk.java.net/~jrose/values/values-0.html (essentially allows user-defined primitives which is kind of like what you want) – stackexchanger Aug 12 '16 at 20:15
  • @stackexchanger: The value types in the proposal do not allow inheritance, just like primitives and value types in .NET. – JacquesB Aug 22 '16 at 10:39
  • True, I was suggesting that as the next best solution, given that Java doesn't have primitives with inheritance. It also provides a nice discussion of why primitives don't have inheritance. – stackexchanger Aug 22 '16 at 22:05
20

What some languages offer is not subclassing, but subtyping. For example, Ada lets you create derived types or subtypes. The Ada Programming/Type System section is worth reading to understand all the details. You can restrict the range of values, which is what you want most of the time:

 type Angle is range -10 .. 10;
 type Hours is range 0 .. 23; 

You can use both types as Integers if you convert them explicitly. Note also that you can't use one in place of another, even when the ranges are structurally equivalent (types are checked by names).

 type Reference is new Integer;
 type Count is new Integer;

Above types are incompatible, even though they represent the same range of values.

(But you can use Unchecked_Conversion; don't tell people I told you that)
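For readers more at home in Java, a rough analogue of a range-restricted type can be sketched with a small wrapper class. This is an assumption-laden illustration, not an equivalent of the Ada feature: the class name `Hours` and its methods are made up here, and the range is checked when the object is created rather than by the type system itself.

```java
// Hypothetical wrapper loosely mimicking `type Hours is range 0 .. 23;`.
final class Hours {
    private final int value;

    private Hours(int value) { this.value = value; }

    // Rejects out-of-range values at construction time.
    static Hours of(int value) {
        if (value < 0 || value > 23) {
            throw new IllegalArgumentException("Hours out of range: " + value);
        }
        return new Hours(value);
    }

    // Explicit conversion back to the underlying int.
    int toInt() { return value; }
}
```

`Hours.of(23).toInt()` yields 23, while `Hours.of(24)` throws an `IllegalArgumentException`.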

coredump
  • 2
    Actually, I think it is more about semantics. Using a quantity where an index is expected would then hopefully cause a compile time error – Marjan Venema Aug 10 '16 at 10:36
  • @MarjanVenema It does, and this is done on purpose to catch logic errors. – coredump Aug 10 '16 at 10:41
  • My point was that you don't need the ranges in all cases where you want the semantics. You would then have `type Index is -MAXINT..MAXINT;` which somehow doesn't do anything for me as all integers would be valid? So what kind of error would I get passing an Angle to an Index if all that is checked are the ranges? – Marjan Venema Aug 10 '16 at 10:44
  • 1
    @MarjanVenema In the second example both types are subtypes of Integer. However, if you declare a function which accepts a Count, you cannot pass a Reference because type checking is based on *name equivalence*, which is the contrary of "all that is checked are the ranges". This is not limited to integers, you could use enumerated types or records. (http://archive.adaic.com/standards/83rat/html/ratl-04-03.html) – coredump Aug 10 '16 at 10:54
  • Ah, so the `type xxx is Integer` are equivalent to Delphi's (my original language) type aliases but a bit more strict. Delphi would allow passing any assignment compatible type except for `var` parameters where types would have to be exactly equal. Cool. – Marjan Venema Aug 10 '16 at 11:20
  • @MarjanVenema Ada is inspired by Pascal. See also [Ada for Pascal Programmers](http://collaboration.cmc.ec.gc.ca/science/rpn/biblio/ddj/Website/articles/DDJ/1988/8809/8809b/8809b.htm). – coredump Aug 10 '16 at 11:23
  • I didn't notice this earlier. As I said in my subsequent answer, it sounds like this is _really_ what the OP wants. They have mistakenly decided that they need inheritance and asked here why it can't work like they want it to, without realising that what they're asking for - non-substitutability - is diametrically opposed to its purpose. Excellent point and examples! – underscore_d Aug 10 '16 at 21:43
  • 1
    @Marjan One nice example of why tagging types can be quite powerful can be found in Eric Lippert's [series on implementing Zorg in OCaml](https://ericlippert.com/2016/02/05/forest_path/). Doing this allows the compiler to catch lots of bugs - on the other hand if you allow to implicitly convert types this seems to make the feature useless.. it doesn't make semantic sense being able to assign a PersonAge type to a PersonId type just because they both happen to have the same underlying type. – Voo Aug 12 '16 at 19:59
  • @Voo No need convincing me. I like specific typing. :) Yes, Delphi's assignment compatibility approach is less than helpful. Using var (essentially in/out) parameters everywhere is something I don't even want to consider. May have changed, I haven't kept up-to-date with the latest versions. – Marjan Venema Aug 13 '16 at 21:16
17

I think this might very well be an X/Y question. Salient points, from the question...

My motivation is maximising the use of type system for compile-time correctness verification.

...and from your comment elaborating:

I don't want to be able to substitute one for another implicitly.

Excuse me if I'm missing something, but... If these are your aims, then why on Earth are you talking about inheritance? Implicit substitutability is... like... its entire thing. Y'know, the Liskov Substitution Principle?

What you seem to want, in reality, is the concept of a 'strong typedef' - whereby something 'is' e.g. an int in terms of range and representation but cannot be substituted into contexts that expect an int and vice-versa. I'd suggest searching for info on this term and whatever your chosen language(s) might call it. Again, it's pretty much literally the opposite of inheritance.
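In a language without built-in strong typedefs, the usual approximation is a pair of distinct wrapper classes. The Java sketch below is illustrative only: `Index`, `Quantity` and `Inventory` are hypothetical names borrowed from the question, and the stock figures are invented for the example.

```java
// Hypothetical "strong typedef" wrappers around int.
final class Index {
    final int value;
    Index(int value) { this.value = value; }
}

final class Quantity {
    final int value;
    Quantity(int value) { this.value = value; }
}

final class Inventory {
    private final int[] stock = {5, 0, 12};   // made-up data

    // Accepts only an Index; passing a Quantity or a bare int does not compile.
    Quantity stockAt(Index i) {
        return new Quantity(stock[i.value]);
    }
}
```

`new Inventory().stockAt(new Index(2)).value` is 12, whereas `stockAt(new Quantity(2))` or `stockAt(2)` is rejected by the compiler, which is the non-substitutability being asked for.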

And for those who might not like an X/Y answer, I think the title might still be answerable with reference to the LSP. Primitive types are primitive because they do something very simple, and that's all they do. Allowing them to be inherited and thus making infinite their possible effects would lead to great surprise at best and fatal LSP violation at worst. If I may optimistically assume Thales Pereira won't mind me quoting this phenomenal comment:

There is the added problem that If someone was able to inherit from Int, you would have innocent code like "int x = y + 2" (where Y is the derived class) that now writes a log to the Database, opens a URL and somehow resurrect Elvis. Primitive types are supposed to be safe and with more or less guaranteed, well-defined behavior.

If someone sees a primitive type, in a sane language, they rightly presume it will always just do its one little thing, very well, without surprises. Primitive types have no class declarations available that signal whether they may or may not be inherited and have their methods overridden. If they were, it would be very surprising indeed (and totally break backwards compatibility, but I'm aware that's a backwards answer to 'why was X not designed with Y').

...although, as Mooing Duck pointed out in response, languages that allow operator overloading enable the user to confuse themselves to a similar or equal extent if they really want, so it's dubious whether this last argument holds. And I'll stop summarising other people's comments now, heh.

underscore_d
4

In mainstream strong static OOP languages, sub-typing is seen primarily as a way to extend a type and to override the type's current methods.

To do so, 'objects' contain a pointer to their type. This is an overhead: the code in a method that uses a Shape instance first has to access the type information of that instance, before it knows the correct Area() method to call.
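A minimal Java rendering of the question's Shape example shows the lookup in action. The class `Square` and the helper `Dispatch.totalArea` are additions made up for this sketch; the point is only that the caller sees the static type `Shape`, and each object's own class decides which `area()` runs.

```java
abstract class Shape {
    abstract double area();
}

final class Triangle extends Shape {
    private final double base, height;
    Triangle(double base, double height) { this.base = base; this.height = height; }
    @Override double area() { return 0.5 * base * height; }
}

// Hypothetical second subclass, added so the dispatch is visible.
final class Square extends Shape {
    private final double side;
    Square(double side) { this.side = side; }
    @Override double area() { return side * side; }
}

final class Dispatch {
    // The static type is Shape; the method that runs depends on each object's runtime class.
    static double totalArea(Shape[] shapes) {
        double sum = 0;
        for (Shape s : shapes) {
            sum += s.area();
        }
        return sum;
    }
}
```

For a `Triangle(4, 3)` and a `Square(2)` the total is 10.0, and `getClass()` on each element reports its concrete class, which is the per-object type information the paragraph above refers to.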

A primitive tends to allow only operations that translate into single machine-language instructions, and it carries no type information with it. Making an integer slower so that someone could subclass it was unappealing enough to stop any languages that did so from becoming mainstream.

So the answer to:

Why do mainstream strong static OOP languages prevent inheriting primitives?

Is:

  • There was little demand
  • And it would have made the language too slow
  • Subtyping was primarily seen as a way to extend a type, rather than a way to get better (user-defined) static type checking.

However, we are starting to get languages that allow static checking based on properties of variables other than 'type'; for example, F# has "dimension" and "unit" so that you can't, for example, add a length to an area.

There are also languages that allow 'user-defined types' that don't change (or exchange) what a type does, but just help with static type checking; see coredump's answer.

underscore_d
Ian
  • F# units of measure is a nice feature, although unfortunately misnamed. Also it's compile-time only, so not super-useful e.g. when consuming a compiled NuGet package. Right direction, though. – Den Aug 10 '16 at 12:33
  • It's perhaps interesting to note that "dimension" is not "a property other than 'type'", it's just a more rich kind of type than you may be used to. – porglezomp Aug 11 '16 at 19:18
4

In order to allow inheritance with virtual dispatch (which is often considered quite desirable in application design), one needs runtime type information. For every object, some data regarding the type of the object has to be stored. A primitive, by definition, lacks this information.

There are two (managed, run on a VM) mainstream OOP languages that feature primitives: C# and Java. Many other languages do not have primitives in the first place, or use similar reasoning for allowing them / using them.

Primitives are a compromise for performance. For each object, you need space for its object header (in Java, typically 2*8 bytes on 64-bit VMs), plus its fields, plus any padding (in HotSpot, every object occupies a number of bytes that is a multiple of 8). So an int as an object would need at least 24 bytes of memory to be kept around, instead of only 4 bytes (in Java).
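The 24-byte figure is just the header plus the field, rounded up to the alignment. The sketch below reproduces that arithmetic using the numbers quoted above; the class `ObjectSize` is a made-up helper, and real object layouts depend on the JVM and its settings, so treat the constants as assumptions rather than measurements.

```java
final class ObjectSize {
    // Rounds size up to the next multiple of alignment (alignment must be a power of two).
    static int alignUp(int size, int alignment) {
        return (size + alignment - 1) & -alignment;
    }

    // Uses the figures from the text: a 2*8-byte header, a 4-byte int, 8-byte alignment.
    static int boxedIntBytes() {
        int header = 2 * 8;   // assumed object header size on a 64-bit VM
        int field = 4;        // the int payload
        return alignUp(header + field, 8);
    }
}
```

`ObjectSize.boxedIntBytes()` gives 24: 16 + 4 = 20 bytes, padded up to the next multiple of 8.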

Thus, primitive types were added to improve performance. They make a whole lot of things easier. What does a + b mean if both are subtypes of int? Some kind of dispatching has to be added to choose the correct addition. This means virtual dispatch. Having the ability to use a very simple opcode for the addition is much, much faster, and allows for compile-time optimizations.

String is another case. Both in Java and C#, String is an object. But in C# it's sealed, and in Java it's final. That is because both the Java and C# standard libraries require Strings to be immutable, and subclassing them would break this immutability.

In the case of Java, the VM can (and does) intern Strings and "pool" them, allowing for better performance. This only works when Strings are truly immutable.
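One concrete way library code depends on this immutability is hashing. Since `String` is final, the sketch below uses a hypothetical class `MutableKey` as a stand-in for what a mutable string subclass could do; `LostEntry` is likewise a made-up name. Changing the key after insertion changes its hash code, and the map can no longer find the entry.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a "mutable string subclass"; String itself is final.
final class MutableKey {
    private String text;

    MutableKey(String text) { this.text = text; }

    void setText(String text) { this.text = text; }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableKey && ((MutableKey) o).text.equals(text);
    }

    @Override
    public int hashCode() { return text.hashCode(); }
}

final class LostEntry {
    static boolean foundAfterMutation() {
        Map<MutableKey, Integer> map = new HashMap<>();
        MutableKey key = new MutableKey("alpha");
        map.put(key, 1);
        key.setText("beta");            // hash code changes while the key is in the map
        return map.containsKey(key);    // the entry was stored under the old hash
    }
}
```

With the standard `HashMap`, `LostEntry.foundAfterMutation()` returns `false` even though the very same key object is still in the map.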

Plus, one rarely needs to subclass primitive types. As long as primitives cannot be subclassed, there are a whole lot of neat things that maths tells us about them. For example, we can be sure that addition is commutative and associative. That's something the mathematical definition of integers tells us. Furthermore, we can easily prove invariants over loops via induction in many cases. If we allow subclassing of int, we lose those tools that maths gives us, because we can no longer be guaranteed that certain properties hold. Thus, I'd say the inability to subclass primitive types is actually a good thing. There are fewer things someone can break, plus a compiler can often prove that it is allowed to do certain optimizations.

Polygnome
  • 1
    This answer is abys... narrow. `to allow inheritance, one needs runtime type information.` False. `For every object, some data regarding the type of the object has to be stored.` False. `There are two mainstream OOP languages that feature primitives: C# and Java.` What, is C++ not mainstream now? I'll use it as my rebuttal as _runtime type information_ **is** a C++ term. It's absolutely not required unless using `dynamic_cast` or `typeid`. And _even if_ RTTI's on, inheritance only consumes space if a class has `virtual` methods to which a per-class table of methods must be pointed per instance – underscore_d Aug 10 '16 at 21:09
  • 1
    Inheritance in C++ works a whole lot differently than in languages run on a VM. Virtual dispatch requires RTTI, something that wasn't originally part of C++. Inheritance without virtual dispatch is very limited and I'm not even sure if you should compare it to inheritance with virtual dispatch. Furthermore, the notion of an "object" is very different in C++ than it is in C# or Java. You are right, there are some things I could word better, but tbh getting into all the quite involved points quickly leads to having to write a book on language design. – Polygnome Aug 10 '16 at 21:17
  • Right, and for a much narrower question deliberately restricted to languages in a VM and/or with your preferred definition of _inheritance_, then your answer is chock full of good points and might indicate that you _could_ write a book. But my concern is we weren't asked _that_ question. By omitting to mention your choice to address only 2 "mainstream strong static OOP languages", of which IMO C++ ticks all the boxes, then the good included points are devalued by what the omitted caveats imply and said omissions paint a very misleading picture of the overall topic. But that's just how I see it – underscore_d Aug 10 '16 at 21:22
  • @underscore_d I have reworded my first paragraphs. Please note that my later paragraphs also apply to C++ (and many compilers use that knowledge *extensively*). – Polygnome Aug 10 '16 at 21:24
  • 3
    Also, it is not the case that "virtual dispatch requires RTTI" in C++. Again, only `dynamic_cast` and `typeid` require that. Virtual dispatch is practically implemented using a pointer to the vtable for the concrete class of the object, thus letting the right functions be called, but it does not require the detail of type and relation inherent in RTTI. All the compiler needs to know is whether an object's class is polymorphic and, if so, what the instance's vptr is. One can trivially compile virtually dispatched classes with `-fno-rtti`. – underscore_d Aug 10 '16 at 21:32
  • 2
    It's in fact the other way around: RTTI requires virtual dispatch. Literally: C++ doesn't allow `dynamic_cast` on classes without virtual dispatch. The implementation reason is that RTTI is generally implemented as a hidden member of a vtable. – MSalters Aug 11 '16 at 08:09
  • While the focus of this answer is a bit narrow (by ignoring non-VM languages), I don't think, it deserves any downvotes: Within its narrow focus, it's correct. – cmaster - reinstate monica Aug 11 '16 at 08:48
  • @cmaster One thing I think we can all agree on, at least after the edit, is that it's bizarre how this answer is sitting at a score of 0 while another answer that applies the same subjective restrictions of definitions and resulting exclusion of languages/situations where things are _very_ different (and overlooks how the OP was asking an X/Y question about the polar opposite of inheritance) - and then (to be fair) makes almost the exact same series of good points this one did despite its limitations - is flying high at +60. – underscore_d Aug 11 '16 at 23:10
  • @underscore_d That's the effect of the early bird that gets the worm. That high-flying answer was one of the first (if not the first), attracted one or two upvotes first which pushed it to the top. As such, it could collect votes from all the people looking at this question, many of which do not reach the bottom of the page once a few answers have accumulated. This answer probably had a downvote first, which dropped it to the bottom of the list, and biased future readers with the strong signal "This is crap!". Even if the answer is good in itself, one or two early downvotes tend to be fatal. – cmaster - reinstate monica Aug 11 '16 at 23:22
  • @cmaster right, the difference is only bizarre for as long as one naively assumes that the majority of voters have measurable attention spans... such a shame that 1 downvote can have the unintended domino effect of dooming an answer that still has redeeming qualities and certainly is on a par with one at the complete other end of the scale :( – underscore_d Aug 11 '16 at 23:26
  • @underscore_d C++ isn't a mainstream OOP language, because it isn't an OOP language. – Miles Rout Aug 18 '16 at 14:16
  • @MilesRout It's a multi-paradigm language, of which OOP is one of the multiple paradigms. What's your point? The OP didn't exclude languages that are supersets of their stated criteria, so why should we? – underscore_d Aug 18 '16 at 14:20
  • @underscore_d C++ does not support object-oriented programming. – Miles Rout Aug 19 '16 at 01:13
  • @MilesRout Please link me to your or somebody else's comprehensive rationale for that statement. Judging by your rep across the network, I am not about to take on faith - or without challenge - statements you make that plainly contradict what everyone else says about the language. Are you saying this from an absurdly puritanical definition of OOP, like some of the quoted objections in this thread? http://stackoverflow.com/questions/3498730/is-c-an-object-oriented-language – underscore_d Aug 19 '16 at 08:57
  • 1
    @MilesRout C++ has everything a language needs for OOP, at least the somewhat newer standards. One might argue that the older C++ standards lack some things that are needed for an OOP language, but even that is a stretch. C++ is not a *high level* OOP language, as it allows a more direct, low level control over some things, but it allows OOP nonetheless. (High level / Low level here in terms of *abstraction*, other languages like managed ones abstract more of the system away than C++, hence their abstraction is higher). – Polygnome Aug 19 '16 at 09:52
3

I'm not sure if I'm overlooking something here, but the answer is rather simple:

  1. The definition of primitives is: primitive values are not objects, primitive types are not object types, primitives are not part of the object system.
  2. Inheritance is a feature of the object system.
  3. Ergo, primitives cannot take part in inheritance.

Note that there are really only two strong static OOP languages which even have primitives, AFAIK: Java and C++. (Actually, I'm not even sure about the latter, I don't know much about C++, and what I found when searching was confusing.)

In C++, primitives are basically a legacy inherited (pun intended) from C. So, they don't take part in the object system (and thus inheritance) because C has neither an object system nor inheritance.

In Java, primitives are the result of a misguided attempt at improving performance. Primitives are also the only value types in the system; it is, in fact, impossible to write value types in Java, and it is impossible for objects to be value types. So, apart from the fact that primitives don't take part in the object system and thus the idea of "inheritance" doesn't even make sense, even if you could inherit from them, you wouldn't be able to maintain the "value-ness". This is different from e.g. C♯ which does have value types (structs), which nonetheless are objects.

Another thing is that not being able to inherit is actually not unique to primitives, either. In C♯, structs implicitly inherit from System.Object and can implement interfaces, but they can neither inherit from nor be inherited by classes or structs. Also, sealed classes cannot be inherited from. In Java, final classes cannot be inherited from.

tl;dr:

Why do mainstream strong static OOP languages prevent inheriting primitives?

  1. primitives are not part of the object system (by definition, if they were, they wouldn't be primitive), the idea of inheritance is tied to the object system, ergo primitive inheritance is a contradiction in terms
  2. primitives are not unique, lots of other types cannot be inherited as well (final or sealed in Java or C♯, structs in C♯, case classes in Scala)
Jörg W Mittag
  • 3
    Ehm... I know it's pronounced "C Sharp", but, ehm – Mr Lister Aug 11 '16 at 06:45
  • I think you're pretty mistaken on the C++ side. It's not a pure OO language at all. Class methods by default are not `virtual`, which means they don't obey LSP. E.g. `std::string` isn't a primitive, but it very much behaves as just another value. Such value semantics are quite common, the whole STL part of C++ assumes it. – MSalters Aug 11 '16 at 08:05
  • 2
    'In Java, primitives are the result of a misguided attempt at improving performance.' I think you have no idea about the magnitude of the performance hit of implementing primitives as user expandable object types. That decision in java is both deliberate and well founded. Just imagine having to allocate memory for every `int` you use. Each allocation takes on the order of 100ns plus the overhead of garbage collection. Compare that with the single CPU cycle consumed by adding two primitive `int`s. Your java codes would crawl along if the designers of the language had decided otherwise. – cmaster - reinstate monica Aug 11 '16 at 08:34
  • 1
    @cmaster: Scala doesn't have primitives, and its numeric performance is exactly the same as Java's. Because, well, it compiles integers into JVM primitive `int`s, so they perform exactly the same. (Scala-native compiles them into primitive machine registers, Scala.js compiles them into primitive ECMAScript `Number`s.) Ruby doesn't have primitives, but YARV and Rubinius compile integers into primitive machine integers, JRuby compiles them into JVM primitive `long`s. Pretty much every Lisp, Smalltalk, or Ruby implementation uses primitives **in the VM**. That's where performance optimizations … – Jörg W Mittag Aug 11 '16 at 09:07
  • 1
    … belong: in the compiler, not the language. – Jörg W Mittag Aug 11 '16 at 09:08
  • @cmaster: C♯ also doesn't have primitives, and its performance seems to be just fine. – Jörg W Mittag Aug 11 '16 at 09:09
  • @JörgWMittag So that's the same argument as the one used by the C++ template enthusiasts: "Sure, it adds unnecessary operations, but the compiler can optimize those away in 99% of the cases." That is indeed a viable approach to the problem. Nevertheless, the compiler optimizations only remove the cruft that's previously inserted by the same tool, so I would not call them "optimizations", they are but cleanup. I prefer overhead not being added in the first place, it keeps things much simpler. But that is indeed a matter of taste. – cmaster - reinstate monica Aug 11 '16 at 09:41
  • *"I'm not sure if I'm overlooking something here"* yes, what you're overlooking is that your explanation is a circular argument. – Lie Ryan Aug 12 '16 at 23:52
  • @LieRyan: Primitives are *called* primitives because they sit outside of the object system, where inheritance lives. If one could inherit from primitives, they wouldn't be primitives any more. The question is similar to asking why odd numbers cannot be divided by 2: because the *definition* of an odd number is that it *cannot*. – Jörg W Mittag Aug 15 '16 at 02:49
  • C++ isn't an OOP language. – Miles Rout Aug 18 '16 at 14:16
  • @MilesRout again: http://softwareengineering.stackexchange.com/questions/328055/why-do-mainstream-strong-static-oop-languages-prevent-inheriting-primitives/328116#comment699116_328119 – underscore_d Oct 24 '16 at 18:19
  • @underscore_d C++ isn't an OOP language. Not really sure how else I can really explain this to you, maybe you should define what you think an OOP language is and I can explain why it isn't? Either way, that should be in chat and not here. – Miles Rout Oct 28 '16 at 05:42
  • @MilesRout Saying "how else to explain this" implies you ever explained it in the first place. I don't see any such explanation/substantiation anywhere. If you're worried about this being OT, then better delete your original comments too. – underscore_d Oct 28 '16 at 07:50
  • @underscore_d my comment wasn't off-topic. it was a correction. Your comment, on the other hand, is off-topic. – Miles Rout Oct 28 '16 at 11:39
  • @MilesRout A correction that contradicts the majority evaluation of the language without substantiation is unproductive at best and flamebait at worst, but whatever you need to tell yourself. – underscore_d Oct 28 '16 at 12:03
  • @underscore_d Nobody that knows anything about C++ calls it a OOP language. Smalltalk is an OOP language. Python is an OOP language. C++ very clearly is not. – Miles Rout Oct 28 '16 at 21:24
  • @MilesRout It's not an **only**-OOP language, but it **is** a multi-paradigm language of which OOP is one of the available approaches... to a degree of 'purity' that you may or may not consider eligible for admission. If someone adheres to a definition of "OOP language" that emphasises focus/exclusivity on OO design, I guess they'd agree with you, but such comments are equally at risk of being read as an accusation that C++ is incapable of/inadequate at (some [IMO large] degree of) OOP generally, which clearly isn't the case. – underscore_d Oct 28 '16 at 21:59
  • @underscore_d You can write OOP code in C, with a few function pointers. That doesn't make it OOP either. – Miles Rout Oct 28 '16 at 22:20
2

Joshua Bloch, in “Effective Java”, recommends designing explicitly for inheritance or else prohibiting it. Primitive classes are not designed for inheritance: they are designed to be immutable, and allowing inheritance could let subclasses change that, which would break the Liskov substitution principle and be a source of many bugs.

Anyway, why is this a hacky workaround? You should really prefer composition over inheritance. If the reason is performance, then you have a point, and the answer to your question is that not every feature can be put into Java, because it takes time to analyze all the different aspects of adding a feature. For example, Java didn't have generics before 1.5.

If you have a lot of patience, you are in luck: there is a plan to add value classes to Java, which will let you create your own value classes, helping performance and at the same time giving you more flexibility.

2

At the abstract level, you can include anything you want in a language you're designing.

At the implementation level, it's inevitable that some of those things will be simpler to implement, some will be complicated, some can be made fast, some are bound to be slower, and so on. To account for this, designers often have to make hard decisions and compromises.

At the implementation level, one of the fastest ways we have come up with for accessing a variable is finding out its address and loading the contents of that address. There are specific instructions in most CPUs for loading data from addresses, and those instructions usually need to know how many bytes they need to load (one, two, four, eight, etc.) and where to put the data they load (single register, register pair, extended register, other memory, etc.). By knowing the size of a variable, the compiler can know exactly which instruction to emit for usages of that variable. Without knowing the size of a variable, the compiler would need to resort to something more complicated and probably slower.

At the abstract level, the point of subtyping is to be able to use instances of one type where an equal or more general type is expected. In other words, code can be written that expects an object of a particular type or anything more derived, without knowing ahead of time what exactly this would be. And clearly, as more derived types can add more data members, a derived type does not necessarily have the same memory requirements as its base types.

At the implementation level, there's no simple way for a variable of a predetermined size to hold an instance of unknown size and be accessed in a way you'd normally call efficient. But there is a way to move things around a little and use a variable not to store the object, but to identify the object and let that object be stored somewhere else. That way is a reference (e.g. a memory address) -- an extra level of indirection that ensures that a variable only needs to hold some kind of fixed-size information, as long as we can find the object through that information. To achieve that, we just need to load the address (fixed-size) and then we can work as usual using those offsets of the object that we know are valid, even if that object has more data at offsets we don't know. We can do that because we don't concern ourselves with its storage requirements when accessing it anymore.

At the abstract level, this method allows you to store a (reference to a) string into an object variable without losing the information that makes it a string. It's fine for all types to work like this and you might also say it's elegant in many respects.

Still, at the implementation level, the extra level of indirection involves more instructions and on most architectures it makes each access to the object somewhat slower. You can allow the compiler to squeeze more performance out of a program if you include in your language some commonly used types that don't have that extra level of indirection (the reference). But by removing that level of indirection, the compiler cannot allow you to subtype in a memory safe way anymore. That's because if you add more data members to your type and you assign to a more general type, any extra data members that don't fit in the space allocated for the target variable will be sliced away.
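
The slicing described above can be seen directly in C++, which allows both forms of storage. The following is only a minimal sketch: the `Shape` and `Triangle` types borrow their names from the question, and `name_by_value`, `name_by_reference` and `slicing_demo` are hypothetical helpers invented for the illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical types, named after the question's Shape/Triangle example.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};

struct Triangle : Shape {
    double base = 3.0;    // extra data members that a Shape-sized
    double height = 4.0;  // variable has no room for
    std::string name() const override { return "triangle"; }
};

// By value: the parameter is a Shape-sized variable, so only the Shape
// part of a Triangle is copied into it (the rest is sliced away).
std::string name_by_value(Shape s) { return s.name(); }

// By reference: the parameter is a fixed-size handle, the object stays
// where it is, and the override is found at run time.
std::string name_by_reference(const Shape& s) { return s.name(); }

// Small self-check of the two behaviours.
void slicing_demo() {
    Triangle t;
    assert(sizeof(Triangle) > sizeof(Shape));    // derived type needs more room
    assert(name_by_value(t) == "shape");         // Triangle part was sliced off
    assert(name_by_reference(t) == "triangle");  // full object still reachable
}
```

Passing by value gives the direct, indirection-free access, but the copy silently stops being a `Triangle`; passing by reference keeps the subtype intact at the cost of the extra hop.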

1

In general

If a class is abstract (metaphor: a box with holes), it's OK (even required, to get something usable!) to "fill the holes"; that's why we subclass abstract classes.

If a class is concrete (metaphor: a full box), it's not OK to alter what is already there, because if it's full, it's full. There is no room to add anything more inside the box; that's why we shouldn't subclass concrete classes.

With primitives

Primitives are concrete classes by design. They represent something that is well known, fully defined (I've never seen a primitive type with something abstract in it; otherwise it wouldn't be a primitive anymore) and widely used throughout the system. Allowing someone to subclass a primitive type and hand their own implementation to code that relies on the designed behaviour of primitives could cause a lot of side effects and serious damage!

Spotted
  • 2
    [Interesting reading about abstract and final classes](http://www.yegor256.com/2014/11/20/seven-virtues-of-good-object.html#7-his-class-is-either-final-or-abstract) – Spotted Aug 10 '16 at 12:02
  • The link is an interesting design opinion. Needs more thinking for me. – Den Aug 10 '16 at 12:30
1

Usually inheritance is not the semantics you want, because you can't substitute your special type anywhere a primitive is expected. To borrow from your example, a Quantity + Index makes no sense semantically, so an inheritance relationship is the wrong relationship.

However, several languages have the concept of a value type that does express the kind of relationship you are describing. Scala is one example. A value type uses a primitive as the underlying representation, but has a different class identity and a different set of operations on the outside. That has the effect of extending a primitive type, but it is more a composition than an inheritance relationship.
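
As a rough illustration of that composition approach, here is a sketch in C++ rather than Scala (so it is not Scala's value-type mechanism, only the same idea). The `Quantity` and `Index` names come from the question; the `can_add_impl` detection helper and `wrapper_demo` are hypothetical and exist only to check the behaviour at compile time.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Each wrapper holds a single int but is a distinct type to the compiler.
class Quantity {
public:
    explicit constexpr Quantity(int v) : value_(v) {}
    constexpr int value() const { return value_; }
    // Only Quantity + Quantity is defined.
    friend constexpr Quantity operator+(Quantity a, Quantity b) {
        return Quantity(a.value_ + b.value_);
    }
private:
    int value_;
};

class Index {
public:
    explicit constexpr Index(int v) : value_(v) {}
    constexpr int value() const { return value_; }
private:
    int value_;
};

// Detection helper: yields true_type only if `A + B` is a valid expression.
template <class A, class B>
auto can_add_impl(int)
    -> decltype(void(std::declval<A>() + std::declval<B>()), std::true_type{});
template <class A, class B>
std::false_type can_add_impl(...);

static_assert(decltype(can_add_impl<Quantity, Quantity>(0))::value,
              "quantities can be added");
static_assert(!decltype(can_add_impl<Quantity, Index>(0))::value,
              "Quantity + Index is rejected at compile time");
static_assert(!decltype(can_add_impl<Quantity, int>(0))::value,
              "a bare int does not silently become a Quantity");

void wrapper_demo() {
    Quantity total = Quantity(2) + Quantity(3);
    assert(total.value() == 5);
    // Quantity bad = Quantity(2) + Index(1);  // would not compile
}
```

The wrapper exposes only the operations that make sense for it, which is exactly what an inheritance relationship with `int` could not guarantee.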

Karl Bielefeldt