38

As has been covered to the point of parody, heavily object-oriented languages, such as C# or Java, tend to lack the feature of having functions as a primitive type. You can argue about whether or not their functions are first class, but the pattern is always that you cannot reference a function on its own; it must always have a companion object. C# calls these companions "delegates" and I believe Java calls them "runnables". In a language that doesn't need these companions, higher-order functions can be called with a function directly used as one of their arguments.

My question is this: Why, from a language design point of view, do heavily object-oriented languages tend to lack the ability to reference a function on its own? Why must functions be carried around by a companion object? For example, what benefits do these companions provide over the functional-programming-like approach of letting a function be its own thing? What costs do they avoid?

To pre-empt an objection, I do not believe that this question is opinion-based. The Java and C# designers are very well known for documenting their decisions and I am absolutely certain that they were familiar with heavily functional languages such as Scheme. There'll surely be a good and documented reason why they chose their current approach. In fact, with the rise of more functional languages that intermix with Java, such as Clojure, I wouldn't be surprised if there's already documentation of the costs that functional Java-like languages have paid by choosing the "functions don't need to be carried around by an object" path.

Finally, please note that I'm not specifically asking about C# or Java. Those two languages just happen to be the best examples of languages that insist that functions must be referenced with an object.

J. Mini
  • 24
    I am somewhat confused by your question. First off, I would not characterize either Java or C# as "heavily object-oriented". Smalltalk or Self, yes, but Java and C#? No way. Secondly, you mention Scala, and in Scala, functions are very much *not* primitive types. Scala has no primitive types in the sense of Java (neither does C#, by the way). You can say that the only primitive type in Scala is objects. And functions *are* objects in Scala, in fact, in Scala any object which has a method called `apply` is essentially a function. Which is pretty much how functions in Java work: any type which … – Jörg W Mittag Oct 02 '22 at 19:48
  • 2
    … has a single method that is abstract is for all intents and purposes a function type. – Jörg W Mittag Oct 02 '22 at 19:48
  • @JörgWMittag I may simply be confused about Scala. I'll change that bit. – J. Mini Oct 02 '22 at 19:50
  • 1
    In Clojure, any object which implements `IFn` is a function. For example, vectors, maps, keywords, and sets are functions. I still don't get what the distinction is that you are making, or what a "companion object" is. Scala has a concept of "companion object", but you don't need one for a function. And neither Java nor C# have a concept of "companion object". And I would bet that the vast majority of functions in Java code are not `Runnable`s. – Jörg W Mittag Oct 02 '22 at 19:57
  • @JörgWMittag I'm talking about the cases where you want to pass a function from one place to another. That's where you need runnables and delegates. – J. Mini Oct 02 '22 at 20:14
  • 5
    I still don't understand. E.g. if I pass a function to `Collections.sort`, there is no `Runnable` in sight. And there is certainly no "companion object", at least not how I understand the term. – Jörg W Mittag Oct 02 '22 at 20:16
  • 6
    I think it might be useful to look at this coming from the other side: Even in languages with first-class functions, there are essentially objects accompanying them. If they capture ("close over") any values, they need to store that in a heap-allocating object, alongside a pointer to the function that actually contains the instructions to execute. They're really quite the same. – Alexander Oct 02 '22 at 20:21
  • 11
    @Alexander: Indeed. A function is isomorphic to an object with a single method, a closure is isomorphic to an object with a single method and private state, an object is isomorphic to a closure taking a message name as an argument and returning a function. Interestingly, the OP mentions Scheme, which was explicitly designed to study object-orientation, and among all the languages listed by the OP is arguably the "most heavily object-oriented". – Jörg W Mittag Oct 02 '22 at 20:23
  • 3
    I am not aware of any mainstream language where functions are a primitive type. The only way to have that option is when the function has no capture context - such as function pointers in C/C++. But is that really a primitive type? – Tomáš Zato Oct 03 '22 at 09:33
  • 8
    In Java, a 'Runnable' is something that can be used to define a new thread. It is not the right abstraction for what you are asking. Prior to version 1.8, the closest thing to a function reference was an 'anonymous inner class' which could be used as a delegate. In versions 1.8+ there is a set of abstractions, function references, and lambdas. I presume you are referring to the pre-1.8 Java here but it would be good for you to clarify, I think. – JimmyJames Oct 03 '22 at 13:07
  • 2
    Should we call Java and C# class-oriented languages instead of object-oriented? I think the question would describe these languages better if you just replace the word “object” with “class”. – ojs Oct 04 '22 at 06:26
  • @J.Mini c# and java are not OO. the question is confusing. – Fattie Oct 04 '22 at 18:00
  • I don't understand why you're dismissing delegates as function pointers. Can you elaborate why they don't meet your criteria? – T. Sar Oct 06 '22 at 20:39

10 Answers

30

IMO...

  1. Because Java and C# are not true OO languages.
  2. Functional programming was not in vogue when they were designed.

I agree with Jörg W Mittag: neither C# nor Java is a true object-oriented language. They're hybrid procedural/OO languages attempting to improve on C++ (and C# attempting to improve on Java). They have both a traditional primitive type system and a class system. For example, int is a primitive type in Java and requires a wrapper to behave like an object.
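To make the int example concrete, here is a minimal sketch. The class name `BoxingDemo` and the helper `classNameOf` are made up for illustration; only `int`, `Integer`, and `Object` come from Java itself.

```java
// Hypothetical demo class; the names here are illustrative only.
class BoxingDemo {
    // Accepts any object and reports the simple name of its runtime class.
    static String classNameOf(Object o) {
        return o.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        int primitive = 42;
        // primitive.toString();     // does not compile: an int has no methods

        Integer boxed = primitive;   // autoboxing wraps the int in an object
        System.out.println(boxed.toString());

        // Passing an int where an Object is expected boxes it implicitly.
        System.out.println(classNameOf(primitive));
    }
}
```

The primitive only gains methods and a class once it has been wrapped, which is the kind of split between objects and non-objects the answer is describing.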

Such a hybrid design means the language designers get to pick and choose what is and is not an object. The designers of Java and C# didn't feel functions needed to be objects, I'm guessing because functional programming wasn't in vogue back then and it was a little faster to not have them be objects. So they weren't objects.

In contrast, true object-oriented languages (Smalltalk, Ruby, Scala, Eiffel, Emerald, Self, Raku) treat everything as an object which responds to methods. Everything. That includes methods and procedures. They're objects so they can be referenced. Methods being objects is inherent to OO design because everything is an object; the language designers would have to deliberately do otherwise.

For example, Ruby is a pure OO language. Since everything is an object, it has Method objects. Since everything is an object, the code of a Method is a Proc object. Lambdas are just special Procs.

func = lambda { |x| x ** 2 }

That creates a Proc, much as Proc.new would (a lambda is a Proc with stricter argument checking). func is an instance of Proc. I can call methods on it.

p func.call(4)  # 16
p func.class    # Proc

I can ask it for its call method. That is a Method object.

method = func.method(:call)
p method.class    # Method
p method.call(4)  # 16

Anonymous functions and method references are a benefit of Ruby sticking to OO design principles.

In contrast, let's look at how Java and C# implemented lambdas and function references.


Java was designed in the early 90s as a better C++. C++ is a hybrid procedural/OO language with many, many design problems, but it was extremely popular and influenced what many people thought OO was. Java inherited these problems.

Because Java is a hybrid language, they picked and chose what was and was not an object. For whatever reason, probably micro-optimization and a lack of appreciation for functional programming, they decided that methods were not objects and could not be referenced, and there were no bare functions.

Java attempted to address the need to pass around snippets of functionality with anonymous classes. They finally realized this is awkward and added lambda expressions explaining...

Lambda expressions let you express instances of single-method classes more compactly.

One issue with anonymous classes is that if the implementation of your anonymous class is very simple, such as an interface that contains only one method, then the syntax of anonymous classes may seem unwieldy and unclear. In these cases, you're usually trying to pass functionality as an argument to another method, such as what action should be taken when someone clicks a button. Lambda expressions enable you to do this, to treat functionality as method argument, or code as data.

At the same time they added method references...

You use lambda expressions to create anonymous methods. Sometimes, however, a lambda expression does nothing but call an existing method. In those cases, it's often clearer to refer to the existing method by name. Method references enable you to do this; they are compact, easy-to-read lambda expressions for methods that already have a name.

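Both lambda expressions and method references are used to construct instances of any interface that is a functional interface, that is, one with a single abstract method (optionally marked @FunctionalInterface).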

Functional interfaces provide target types for lambda expressions and method references.

Runnable r = () -> System.out.println("Hello World!");

// The equivalent anonymous Runnable class.
Runnable r2 = new Runnable() {
    @Override
    public void run() {
        System.out.println("Hello World!");
    }
};
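The method-reference form quoted above can be sketched the same way. This is only an illustration: the class `MethodRefDemo` and the method `shout` are hypothetical names, while `Function` is the standard `java.util.function.Function` interface.

```java
import java.util.function.Function;

// Hypothetical demo class; the names are illustrative only.
class MethodRefDemo {
    // An ordinary named static method.
    static String shout(String s) {
        return s.toUpperCase() + "!";
    }

    // A lambda that does nothing but call the existing method.
    static Function<String, String> viaLambda() {
        return s -> shout(s);
    }

    // The same thing written as a method reference.
    static Function<String, String> viaMethodRef() {
        return MethodRefDemo::shout;
    }

    public static void main(String[] args) {
        System.out.println(viaLambda().apply("hello"));
        System.out.println(viaMethodRef().apply("hello"));
    }
}
```

In both cases what gets passed around is an instance of `Function`, not the method `shout` itself.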

There are dozens of functional interfaces, many of which seem to be working around Java's non-OO primitive types.

DoubleToIntFunction: Represents a function that accepts a double-valued argument and produces an int-valued result.
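As a rough illustration of why such primitive-specific interfaces exist, the sketch below puts `DoubleToIntFunction` next to the generic `Function<Double, Integer>`. The class and constant names are hypothetical; the two interfaces and their `applyAsInt` and `apply` methods are from `java.util.function`.

```java
import java.util.function.DoubleToIntFunction;
import java.util.function.Function;

// Hypothetical demo class; the names are illustrative only.
class PrimitiveFunctionDemo {
    // Works on primitives directly: neither argument nor result is boxed.
    static final DoubleToIntFunction FLOOR_PRIMITIVE = d -> (int) Math.floor(d);

    // Generics can only mention object types, so values travel as Double and Integer.
    static final Function<Double, Integer> FLOOR_BOXED = d -> (int) Math.floor(d);

    public static void main(String[] args) {
        System.out.println(FLOOR_PRIMITIVE.applyAsInt(3.7));
        System.out.println(FLOOR_BOXED.apply(3.7));
    }
}
```

The two lambdas are textually identical; only the target interface differs, and with it whether the values are wrapped in objects along the way.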

The hybrid nature of Java means rather than flowing naturally from the design, a complex series of adapters is necessary.

Point is, it was bolted on later and it's complicated.


C# was designed as a better Java. It fixed some mistakes, and repeated others.

When you go back and look, C# version 1.0, released with Visual Studio .NET 2002, looked a lot like Java. As part of its stated design goals for ECMA, it sought to be a "simple, modern, general-purpose object-oriented language." At the time, looking like Java meant it achieved those early design goals.

C# 3.0 added lambda expressions. As with Java, the relationship between lambda expressions and objects is complicated...

Any lambda expression can be converted to a delegate type. The delegate type to which a lambda expression can be converted is defined by the types of its parameters and return value. If a lambda expression doesn't return a value, it can be converted to one of the Action delegate types; otherwise, it can be converted to one of the Func delegate types. For example, a lambda expression that has two parameters and returns no value can be converted to an Action<T1,T2> delegate. A lambda expression that has one parameter and returns a value can be converted to a Func<T,TResult> delegate.


This complex system of annotation types and target types and automatic casting and blurring between what is and is not an object is a common problem of hybrid languages, particularly C++ derived languages which mix procedural and object-oriented principles with a C-style type system without choosing one clear paradigm.

And that's my point. True object-oriented languages naturally have function references because functions are objects. Hybrid languages get to pick and choose what is and is not an object; if the language designer didn't think function references were needed, you don't get function references.

Schwern
  • From one of the founders of this site, Joel Spolsky: "All non-trivial abstractions, to some degree, are leaky.". `int` is the CPU reality. Lambdas are not. CPUs are not OO. C++ leaks more, Smalltalk leaks less, but everything leaks. – MSalters Oct 03 '22 at 12:48
  • 4
    "These are not objects, nor are they function references, but a special language feature. They're described in the Java docs awkwardly." The way you are describing lambdas doesn't align with how I understand them in Java. Java 8 (or 1.8) introduced function references. Lambdas are essentially a way to create an anonymous function. You don't need to use a lambda to pass a function. You can pass named class and member functions by name or assign them to variables. – JimmyJames Oct 03 '22 at 13:17
  • 8
    "*Lambda expressions […] express instances of […] classes*" sounds very much as if they were objects to me. – Bergi Oct 03 '22 at 15:15
  • Procedural and object oriented are orthogonal concerns, are they not? – nasch Oct 03 '22 at 15:30
  • @Bergi: Yes -- which indeed they are. There are no "function types" in Java; rather, if an interface or abstract class is such that you can implement it by defining just one method, then a lambda expression is a concise way to write that implementation. The type of the lambda expression will be that of the relevant interface or abstract class (which needs to be inferable from the context). That said, even a functional language doesn't necessarily have "function types", if it's an untyped language like Scheme. :-) – ruakh Oct 03 '22 at 15:42
  • @Bergi IIRC, lambda expressions are actually implemented as anonymous classes in Java or something very similar to that anyway. There are a few technical details around Java that are a little off here but I don't think it matters that much in the context of the question. – JimmyJames Oct 03 '22 at 16:20
  • 1
    @JimmyJames I would think that it matters in the context of this answer, which claims that lambdas in java are not objects – Bergi Oct 03 '22 at 16:22
  • @JimmyJames Thanks, I missed that. – Schwern Oct 03 '22 at 17:35
  • @Bergi If you want to argue that a closure is an object with a single method in the strict "data + code" sense, ok. But is it a Java object? (Correct me if I'm wrong) Java lambdas aren't syntax sugar on top of an object (like Ruby procs are). You can't treat it like a Java object. It's a special expression. – Schwern Oct 03 '22 at 17:38
  • 3
    @Schwern I don't care whether the syntax is special or not, but what does the result look like to the receiver? Does `fn instanceof Object` not work? Can't I call `fn.getClass()`? What makes something a "Java object"? – Bergi Oct 03 '22 at 18:35
  • @Bergi Can I treat it like any other Java object? Does it store data and receive messages like any other object? Can I call methods on lambdas? Consider this: `Consumer method = (n) -> { System.out.println(n); }` Does that mean that lambdas are Consumer objects? Or is that syntax sugar for autoboxing a Consumer around a lambda expression? It might seem academic, but it's important to distinguish between "everything is an object, we have syntax sugar to pretend they're not" (ie. Ruby) and "we have syntax sugar to treat non-objects like objects" (ie. autoboxing). – Schwern Oct 03 '22 at 18:58
  • @Schwern Oh so you're saying that in Java, a lambda expression is a primitive value that can get autoboxed into various objects (e.g. `Consumer`)? I did not know it works that way, I had assumed the syntax would compile directly into an object construction. Maybe I'm confused because there does not seem to be a primitive function type for these values, that would allow us to store them or pass them around, like there is with `int` or `float`. – Bergi Oct 03 '22 at 19:08
  • @Bergi How it compiles is an implementation detail. AFAIK Java lacks a lambda interface. Instead, [there are many different functional interfaces](https://docs.oracle.com/javase/8/docs/api/java/util/function/package-summary.html) which can be a target for a lambda "*Functional interfaces provide target types for lambda expressions and method references. Each functional interface has a single abstract method, called the functional method for that functional interface, to which the lambda expression's parameter and return types are matched or adapted.*" That says wrapper or sugar to me. – Schwern Oct 03 '22 at 19:25
  • 4
    @Schwern Sugar, yes, but sugar for *object* construction if you ask me. Thanks for the edit! But instead of saying "*Lambdas do not act like other Java objects*", I'd just put Java lambdas entirely in the syntax realm, not in the value realm. – Bergi Oct 03 '22 at 19:43
  • @Bergi Thus *lambda expressions*, not lambda objects. `1 + 1` is not an object, it is an expression which produces an integer. In contrast, Ruby lambdas are their own objects. [Consider this example](https://docs.oracle.com/javase/tutorial/displayCode.html?code=https://docs.oracle.com/javase/tutorial/java/javaOO/examples/Calculator.java). `IntegerMath addition = (a, b) -> a + b`. The lambda is an expression defining a method which matches the IntegerMath interface. Methods are not objects in Java, in contrast to Ruby where they are. (I'm not a Java expert, I'm sifting through this as well). – Schwern Oct 03 '22 at 19:50
  • 2
    @Schwern w/ regard to lambdas: "You can't treat it like a Java object. It's a special expression" I'm pretty certain that lambdas are compiled into objects. You can assign them to variables (as in your example.) I kind of avoid Java these days but it would be fairly straightforward to look at the bytecodes generated by them if you want confirmation. I think they are implemented as anonymous inner classes. But, again, I'm not sure that's relevant. A language's semantics and how it is compiled/executed are two separate things (ideally anyway.) – JimmyJames Oct 03 '22 at 19:54
  • You can call methods on Java lambdas, but just have to cast them to a `@FunctionalInterface` type first. – OrangeDog Oct 03 '22 at 19:55
  • @JimmyJames they're more complicated than anonymous classes. See the `java.lang.invoke` package. – OrangeDog Oct 03 '22 at 19:57
  • 2
    I think the most illuminating and interesting aspect of lambdas is that they can be used as a valid parameter for a compatible 'single-method' interface. For example, you can pass in a lambda to a method that takes a `Comparator`. I think that also works with compatible named method references too but I'm not 100% on that. – JimmyJames Oct 03 '22 at 20:01
  • @OrangeDog I'll take your word for it. You seem to be well-versed in the topic. I would just note that I think the question and answer are really about the 'philosophy of java' and not the details. Java is often unfairly maligned, IMO, but there's no debate about whether the concept of function references was part of the original plan. – JimmyJames Oct 03 '22 at 20:04
  • @JimmyJames Is the lambda an object which implements all the functional interfaces? Or is a lambda an expression which creates an object with the correct interface for the object? Or is a lambda an object which is cast to an object of the correct interface? AFAIK there is no lambda interface nor class defined in Java, just lambda syntax, which leads me away from "object" and towards "expression". To the point of the answer, this ambiguity and complexity is a consequence of Java, and other C++ derived languages, mixing procedural and object design paradigms. – Schwern Oct 03 '22 at 20:30
  • @Schwern I understand the design to be that [function references](https://docs.oracle.com/javase/8/docs/api/java/util/function/package-frame.html) are primary and that lambdas are 'sugar' that creates them with minimal code. Again, I want to reiterate that I think you have a very good answer and that the particulars of how Java was implemented (while interesting to me) are not terribly relevant. – JimmyJames Oct 03 '22 at 20:41
  • @JimmyJames I think, you got the most important point: “*the particulars of how Java was implemented … are not terribly relevant*”. You can use lambda expressions to denote functions. They have the necessary properties. E.g., the lambda expression itself does not know anything about the interface you’ll use to carry it to the receiving side. Unlike anonymous inner classes, it does not have a namespace polluted with inherited members nor a confusing meaning of `this` and `super`. The performance overhead of carrying it as an object is eliminated by the JIT or AOT compiler. It does its job. – Holger Oct 04 '22 at 08:27
  • Absolutely Java lambdas are expressions. In Java context, "lambda" is a shortened form of "lambda expression". And of course you cannot invoke methods on such expressions themselves, any more than you can on any other expression. Nor can you in (say) Ruby or Python, either. But you absolutely *can* invoke methods *on the result* of evaluating a Java lambda expression. The result is an object. – John Bollinger Oct 04 '22 at 20:21
  • @JohnBollinger When you say "the result of evaluating a Java lambda expression", do you mean like a Runnable? – Schwern Oct 05 '22 at 06:06
  • @Schwern yes, that’s [the formally correct language](https://docs.oracle.com/javase/specs/jls/se17/html/jls-15.html#jls-15.27.4). Evaluating a lambda expression results in an instance of a functional interface. When the function method is invoked on that instance, the body of the lambda expression is evaluated or executed. – Holger Oct 05 '22 at 06:40
  • No, @Schwern, I mean the result of *evaluating the expression*. This is the same concept as evaluating the expression `1 + 1` to obtain the value `2`. The result of evaluating a lambda expression is an object that has methods you can invoke. In particular, it has the method defined by some particular functional interface that is determined by the context of the expression (and the object's class implements that interface). – John Bollinger Oct 05 '22 at 12:17
  • Aren't lambdas inherently expressions, in any language? The entire **point** of lambda syntax is to provide a way of defining a function as an **expression** so it can be defined exactly where it is used, rather than defined elsewhere with a specific block that has to be linked to its usage by a name. I honestly don't understand the distinction you're trying to draw between Java's lambda syntax being an expression that creates an object of many different classes vs Ruby's lambda syntax being an expression that creates an object of the class `Proc`. – Ben Oct 06 '22 at 01:20
  • 1
    I find the term "true OOP" _horrible_, honestly. Just call those languages as multi-paradigm, as it is what they are. – T. Sar Oct 06 '22 at 14:53
  • 1
    I've removed the part about whether lambdas are expressions or objects. It's just a distraction from the main point: Java and C# are hybrid languages and that has consequences. – Schwern Oct 06 '22 at 21:36
26

This is a little bit of a silly question. You're asking why object-oriented languages are object-oriented. If they passed functions around as first class types we wouldn't describe them as object-oriented languages.

Why, from a language design point of view, do heavily object-oriented languages tend to lack the ability to reference a function on its own?

Because they believe that virtual dispatch (calling the most derived type's implementation of a function) is a better way to model polymorphic behavior than function arguments. "Better" here of course could mean any number of arguments, potentially including non-functional requirements like readability, ease of implementation, etc.
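A small sketch of the two ways of modelling polymorphic behaviour contrasted here. Everything in it (`Discount`, `HalfOff`, `PolymorphismDemo`, and the checkout methods) is a hypothetical example, not taken from any particular codebase; `DoubleUnaryOperator` is the standard `java.util.function` interface.

```java
import java.util.function.DoubleUnaryOperator;

// Style 1: behaviour varies by overriding a method (virtual dispatch).
abstract class Discount {
    abstract double apply(double price);
}

class HalfOff extends Discount {
    @Override
    double apply(double price) {
        return price * 0.5;
    }
}

class PolymorphismDemo {
    // The caller picks the behaviour by choosing which subclass instance to pass.
    static double checkoutWithObject(double price, Discount discount) {
        return discount.apply(price);
    }

    // Style 2: the caller picks the behaviour by passing a function value.
    static double checkoutWithFunction(double price, DoubleUnaryOperator discount) {
        return discount.applyAsDouble(price);
    }

    public static void main(String[] args) {
        System.out.println(checkoutWithObject(80.0, new HalfOff()));
        System.out.println(checkoutWithFunction(80.0, p -> p * 0.5));
    }
}
```

Both calls compute the same result. The difference is whether the varying behaviour is named and declared as a type, or supplied inline at the call site.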

As has been covered to the point of parody, heavily object-oriented languages, such as C# or Java, tend to lack the feature of having functions as a primitive type.

At least for C#, this hasn't been true for nearly 20 years now. Delegates, and later lambda style anonymous functions are first class tokens which are given special treatment in the type system.

I wouldn't be surprised if there's already documentation of the costs that functional Java-like languages have paid by choosing the "functions don't need to be carried around by an object" path.

You mean unlike JavaScript or Haskell or every other functional language that supports closures, which carry an object (the state closed over by lambdas) around with their function?

It's just a different model for the same thing.

Telastyn
  • 8
    Major praise for that final link. "Closures are just objects anyway" is an enlightening perspective. – J. Mini Oct 03 '22 at 18:54
  • java still requires an object inside the function. the function isn't an object by itself (contrary to, e.g., python). You can be object-oriented and have functions _be_ objects, not just be _part of_ an object – njzk2 Oct 05 '22 at 11:31
11

In these “classic OOP” languages like C++/Java/C#, objects are data + behaviour, where the behaviour is provided by a class. On a technical level, this generally results in a memory layout somewhat like:

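using Method = void(*)();  // some function pointer

// Shared by all instances of a class: the table of method pointers (the vtable).
struct Class {
  Method methods[];
};

// Each instance: a pointer to its class plus its own data.
struct Object {
  Class* klass;
  Data data;
};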

Example ASCII Art:

| object
v
+---+---+
|   |   |
+-+-+---+
  |
  v class 
  +---+---+---+
  |   |   |   |
  +---+-+-+---+
        |
        v
        some_method()

This kind of approach has some nice properties. In particular, we can inspect any object and retrieve its original type. In C# and Java, this also enables reflection for all objects.

A function by itself does not provide the same expected structure, and cannot be used interchangeably. If we want to use the function as an object, the natural approach is to create a wrapper class and wrapper object that then invokes the target function via a call() method or similar. In Java, this generally took the form of “anonymous classes” that could be declared and instantiated inline.
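The wrapper approach can be sketched as follows, in the pre-Java-8 style. The interface `IntCallable`, its `call` method, and the class `WrapperDemo` are hypothetical names chosen for this illustration.

```java
// Hypothetical single-method wrapper interface; the names are made up for this sketch.
interface IntCallable {
    int call(int x);
}

class WrapperDemo {
    // The "bare" function we would like to pass around.
    static int square(int x) {
        return x * x;
    }

    // An anonymous class, declared and instantiated inline, that only forwards to square().
    static IntCallable wrapSquare() {
        return new IntCallable() {
            @Override
            public int call(int x) {
                return square(x);
            }
        };
    }

    // A higher-order method: it receives the wrapper object, not the function itself.
    static int applyTwice(IntCallable f, int x) {
        return f.call(f.call(x));
    }

    public static void main(String[] args) {
        System.out.println(applyTwice(wrapSquare(), 3));
    }
}
```

Because the wrapper is an ordinary object with a class, it fits the object layout above and can be inspected like any other instance.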

C# initially used a different approach via its “delegate” concept, which allowed function pointers to be assigned, but not as normal objects.

As these languages have evolved, it has become clearer that a pure OOP approach is often annoying and tedious, and that functional techniques can be quite attractive. So these languages have gotten better, more convenient syntax for this wrapper class approach, often in connection with their syntax for lambdas (anonymous functions).

  • C++ has std::function to turn function objects like literal lambdas into objects.
  • Java got its “functional interface” concept so that the compiler can auto-implement the necessary wrapper class when given a literal lambda or a method reference. Lambdas themselves do not have a type.
  • C#'s lambdas have multiple purposes – depending on the type context the compiler turns them into objects like Action, but in other contexts like LINQ they just serve as abstract syntax – they do not necessarily represent runnable code.
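For the Java bullet, a minimal sketch of what "lambdas themselves do not have a type" looks like in practice. `Doubler` and `TargetTypeDemo` are hypothetical names; `IntUnaryOperator` and `Function` are from `java.util.function`.

```java
import java.util.function.Function;
import java.util.function.IntUnaryOperator;

// Hypothetical user-defined functional interface.
interface Doubler {
    int twice(int x);
}

class TargetTypeDemo {
    // The same lambda text, given three different types by its context.
    static final IntUnaryOperator AS_OPERATOR = x -> x * 2;
    static final Function<Integer, Integer> AS_FUNCTION = x -> x * 2;
    static final Doubler AS_DOUBLER = x -> x * 2;

    // Without a functional-interface target the lambda does not compile:
    // Object o = x -> x * 2;

    public static void main(String[] args) {
        System.out.println(AS_OPERATOR.applyAsInt(21));
        System.out.println(AS_FUNCTION.apply(21));
        System.out.println(AS_DOUBLER.twice(21));
    }
}
```

Each field holds an ordinary object implementing its declared interface; the compiler supplies the wrapper class based on where the lambda appears.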

There is another relevant connection between objects and functions if you consider closures – nested functions that retain access to their surrounding variables. A closure is not just a pointer to some machine code, but also needs to keep references to those surrounding variables: a closure is a combination of data + behaviour, a closure is an object. It thus makes sense to recycle the language's facilities for representing objects.
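The "a closure is an object" point can be sketched by writing the same counter twice: once as a closure and once as a hand-written class. `Counter`, `ClosureDemo`, and `counterFrom` are hypothetical names; `IntSupplier` is from `java.util.function`.

```java
import java.util.function.IntSupplier;

// Hand-written equivalent of a closure: the captured state becomes a field.
class Counter implements IntSupplier {
    private int count;  // plays the role of the captured variable

    Counter(int start) {
        this.count = start;
    }

    @Override
    public int getAsInt() {
        return count++;
    }
}

class ClosureDemo {
    // Closure version. Java lambdas may only capture effectively final locals,
    // so the mutable state is kept in a one-element array.
    static IntSupplier counterFrom(int start) {
        int[] count = { start };
        return () -> count[0]++;
    }

    public static void main(String[] args) {
        IntSupplier closure = counterFrom(10);
        IntSupplier object = new Counter(10);
        System.out.println(closure.getAsInt() + " " + closure.getAsInt());
        System.out.println(object.getAsInt() + " " + object.getAsInt());
    }
}
```

From the caller's side the two are indistinguishable: both are data plus behaviour behind a single method.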

Outside of this C++-style OOP bubble, there has generally been a far more flexible approach to first-class functions. Especially in dynamic languages (e.g. Smalltalk, JavaScript, Python, Lua), functions and methods are often just ordinary objects. In these languages, that helps keep the data model simple. On a technical level, it makes less sense to create a separate class for each function (this doesn't carry useful type information here), and we might only have a single class for all functions. This class might provide a field for the target function pointer, and for captured variables. Something like:

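struct FunctionObject {
  Class* klass = &FunctionObjectClass;  // every function object shares this one class
  Method target;                        // pointer to the code to run
  Object* captures[];                   // variables captured by the closure
};

// The single class for all functions; its only method is "call".
Class FunctionObjectClass = {
  .methods = { &FunctionObject_call },
};

// Calling a function object forwards to its target, passing along the captures.
Object* FunctionObject_call(FunctionObject* self, Object** args) {
  return self->target(self->captures, args);
}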
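For readers more at home in Java, the same idea might be rendered roughly as below. This is a loose translation of the sketch above, not how any real runtime implements functions; `FunctionObjectDemo` and `makeAdder` are hypothetical, and `BiFunction` stands in for the raw target pointer.

```java
import java.util.function.BiFunction;

// One class for all functions: a target plus the variables it captured.
final class FunctionObject {
    private final BiFunction<Object[], Object[], Object> target;  // code to run
    private final Object[] captures;                              // captured variables

    FunctionObject(BiFunction<Object[], Object[], Object> target, Object... captures) {
        this.target = target;
        this.captures = captures;
    }

    // The single "call" method shared by every function object.
    Object call(Object... args) {
        return target.apply(captures, args);
    }
}

class FunctionObjectDemo {
    // Builds an adder whose first operand is captured and whose second arrives at call time.
    static FunctionObject makeAdder(int n) {
        return new FunctionObject(
            (captures, args) -> ((Integer) captures[0]) + ((Integer) args[0]),
            n);
    }

    public static void main(String[] args) {
        FunctionObject addFive = makeAdder(5);
        System.out.println(addFive.call(37));
        // Every function object has the same class, whatever code it runs.
        System.out.println(addFive.getClass() == makeAdder(1).getClass());
    }
}
```

Unlike the one-wrapper-class-per-function approach, the class here carries no information about which function is wrapped; that lives entirely in the `target` field.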
amon
  • 2
    Your `struct Class {Method methods[]; };` is the vtable, right? In C++ implementations, that only exists in classes that have any `virtual` member functions. "[Standard layout](https://en.cppreference.com/w/cpp/types/is_standard_layout)" classes are guaranteed *not* to have any extra shenanigans like that, so the struct/class address is also the address of the first member. See also [How do objects work in x86 at the assembly level?](//stackoverflow.com/q/33556511) / [In C++, is it valid to treat scalar members of a struct as if they comprised an array?](//stackoverflow.com/q/73904491) and – Peter Cordes Oct 03 '22 at 06:01
  • 5
    Also, C++ has had "free" functions since the outset, inherited from C, and `static` member functions. You can take the address of a static or non-member function. But that's not OOP at all. It also has syntax for pointer-to-member-function (which requires a `this` object to call it on); using `std::invoke` makes it easy to use those pointers on any object, mixing and matching member-function pointers and object pointers (of compatible types). https://isocpp.org/wiki/faq/pointers-to-members . But that was possible before C++11 (C++11 `std::mem_fn` and `std::bind` from `<functional>` make it easier.) – Peter Cordes Oct 03 '22 at 06:32
  • 3
    Anyway, C++ is *not* limited to Java's way of doing things, so it's weird to lump it in with them. You can pass around raw function pointers. Or if you want to do your polymorphism differently, you can roll your own, or for example use [`std::visit`](https://en.cppreference.com/w/cpp/utility/variant/visit) on `std::variant` objects by value, instead of needing references for polymorphism. – Peter Cordes Oct 03 '22 at 06:38
  • @PeterCordes This answer uses the words “object” and “class” in the OOP sense, not in the C/C++ sense that also uses these terms for value types. A function pointer by itself does not carry RTTI and is then not an object (and even in the standard C++ viewpoint, function pointers are not ordinary pointers). Similarly, C# delegates are not ordinary objects. I've included C++ here because it is useful for contrasting with Java/C#. For example, `std::function` is a library-level implementation of the language-level type erasure Java does with its functional interfaces. – amon Oct 03 '22 at 09:56
  • (So there is another answer to be written about how function pointers are not ordinary pointers, if we consider the existence of Harvard architecture computers. A portable language like C++ or Java MUST treat data and code as distinct categories, but can use the aforementioned techniques like function objects to smooth over that difference. OP correctly points out the existence of Lisps as an example where data and code are interchangeable, but that is an abstraction.) – amon Oct 03 '22 at 10:00
  • 4
    You're arguing that a C++ class isn't an object in the OOP sense if it doesn't have virtual member functions? I'm not an expert on OOP theory, but you don't *always* need polymorphism for every type to be doing OOP, do you? So you can have objects, especially helper or sub-objects but not limited to that, where the static type uniquely determines the member function being called, no dispatch table needed. That's a special case of OOP where polymorphism isn't needed. Partly this is a nitpick; your overall point does work that this is C++'s traditional way to do OOP style *polymorphism*. – Peter Cordes Oct 03 '22 at 10:24
  • 4
    Function pointers are different from data pointers, yes, but having "functions as a primitive type" doesn't imply that they should be *mutable*. In fact having them as a separate thing from data references or integers makes them even more of a first-class citizen on par with those things, not a special case of another primitive type. As you say, C++ is designed to be ahead-of-time compiled, only executing machine-code that was generated before the program started running. Possibly from ROM. So it's compatible with pure Harvard machines. – Peter Cordes Oct 03 '22 at 10:31
  • 1
    @PeterCordes Under any mainstream mechanical OOP definition I know (e.g. Kay's method passing / extreme late binding; Encapsulation–Inheritance–Polymorphism; OOP is when dynamic dispatch), virtual methods or equivalent runtime polymorphism are indeed the defining feature of OOP. In static type systems, this also implies that we only get OOP if we interact with an object through an interface/base class. This means that a C++ class/struct alone does not provide OOP. There are other definitions like “OOP means modelling the software after real world concepts” that do not rely on dynamic dispatch. – amon Oct 03 '22 at 11:59
  • And yes, I agree that C/C++ function pointers are primitive types. So for the specific case of C++, OP's question could be answered “C++ is multi-paradigm, function pointers are available”. But Java and C# used a different design where wrapper objects allow the overall language to be simpler, and even C++ frequently uses such wrapper objects (e.g. for lambdas or bound methods, requiring type erasure per std::function). – amon Oct 03 '22 at 12:01
  • 1
    @PeterCordes: The decision that C++ classes are not "objects in the OO sense" if they don't have virtual functions is pretty intentional. RTTI in C++ only works for classes that have at least one virtual function. C++ has objective distinctions between primitive types, OO class types and non-OO class types. `` even tells you what's what. – MSalters Oct 03 '22 at 13:00
  • *There are other definitions like “OOP means modelling the software after real world concepts” that do not rely on dynamic dispatch.* - Thanks, yes, this was the meaning of OOP I was familiar with. A definition of OOP that requires every object to have dynamic dispatch would indeed require virtual member-functions. – Peter Cordes Oct 03 '22 at 19:57
  • @MSalters: Many aspects of C++ could have been clearer if all types declared as `struct` had to be standard layout types, while compilers would be free to do anything they wanted with the layout of types declared as `class`. That would have made it possible to use RTTI on types without virtual members, provided they were declared as `class`. – supercat Oct 05 '22 at 17:29
4

In later C# versions you no longer have to be explicit about delegates. You can pass a function just by stating its name.

I think it is mainly about type safety. In the compiled code there won't be an intermediate object. But at the language level one has to be explicit about the signature of any method, about its arguments and return type, so the compiler can tell whether everything checks out when you invoke a method with arguments. If there were just one function type for every function, this would not be possible.
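As a rough illustration of that point in Java rather than C# (a minimal sketch; `SignatureDemo` and its methods are hypothetical names made up for this example), the functional interface type is what carries the signature the compiler checks against:

```java
import java.util.function.BiFunction;
import java.util.function.Function;

final class SignatureDemo {
    static int length(String s) { return s.length(); }
    static int add(int a, int b) { return a + b; }

    // The parameter type spells out the signature the caller must supply.
    static int applyTwice(Function<Integer, Integer> f, int x) {
        return f.apply(f.apply(x));
    }

    static int run() {
        Function<String, Integer> len = SignatureDemo::length;           // String -> Integer
        BiFunction<Integer, Integer, Integer> sum = SignatureDemo::add;  // (Integer, Integer) -> Integer
        // Function<String, Integer> wrong = SignatureDemo::add;         // would not compile: arity mismatch
        return applyTwice(n -> sum.apply(n, len.apply("abc")), 1);
    }
}
```

With a single untyped "function" type, the commented-out assignment could only be rejected at run time, if at all.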

Your use of the term primitive is confusing to me. Primitive types are things like bytes and floats. Perhaps you mean native (to the language)?

Martin Maat
  • 3
    It doesn't need to be one type that allows any parameter. I think OP confused primitive types and first-class citizens – Jimmy T. Oct 03 '22 at 14:45
4

The use of "primitive" is a bit too broad in the question.

In general terms, primitives are things that have simple, measurable, binary storage, without iteration or growth in allocated space (number, byte, boolean, pointer etc). These are basic building blocks of program data, supported by hardware. We can compose complex data types from these.

Considering a function type to be primitive really stretches the definition above. A function is a chunk of instructions sitting somewhere in the process memory. A pointer can be used to address it, and the pointer itself can be stored as a primitive.

Apart from this distinction, "everything is an object" or "everything is a function" is just one way of thinking about programming.

Something can be a container of data, the data itself, and can also be executed, all at the same time. JS functions are a close example of this.

My question is this: Why, from a language design point of view, do heavily object-oriented languages tend to lack the ability to reference a function on its own? Why must they be carried around by a companion object?

A function type, at compile time, in a strongly typed language, is not simple or primitive at all. Function types have to deal with signature information. There simply cannot be a single Function type for all kinds of functions one may write without losing the signature information.

The heavily object-oriented languages mentioned in the question have this exact shortcoming. They lack proper ways to define and treat function signatures as first-class, standalone types. In fact, they are behind many other languages in terms of type expression.

Languages like Typescript allow users to define pretty much any Type Signature as first class Types (aliases) without the need of any interface or class definition:

// dear compiler, this is the type of a function that adds two number-like things
type add<T extends number> = (x: T, y: T) => T;

// and this is the type of a function that compares two comparables
// (Comparable is assumed to be declared elsewhere)
type compare<T extends Comparable> = (x: T, y: T) => number;

Type checking has no need to depend on the object orientation of a language. Types and signatures are just constraints that the compiler makes use of.

Unless Java etc. go the Typescript way, they have to resort to saving the signature information glued to the enclosing type, a class. And thus, the inevitable coupling with objects.
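To make that coupling concrete, here is a minimal Java sketch of what the `compare` alias above turns into. The names `NamedSignatureDemo`, `Compare`, `max` and `longest` are hypothetical, chosen only for this illustration; the point is that the signature has to live inside an interface declaration:

```java
import java.util.Arrays;
import java.util.List;

final class NamedSignatureDemo {
    // The signature (T, T) -> int only exists as the single method of an interface.
    @FunctionalInterface
    interface Compare<T> {
        int compare(T x, T y);
    }

    // A higher-order method: its parameter is typed by the interface, not by a bare signature.
    static <T> T max(List<T> items, Compare<T> cmp) {
        T best = items.get(0);
        for (T item : items) {
            if (cmp.compare(item, best) > 0) best = item;
        }
        return best;
    }

    static String longest() {
        Compare<String> byLength = (a, b) -> Integer.compare(a.length(), b.length());
        return max(Arrays.asList("a", "ccc", "bb"), byLength);
    }
}
```

The lambda itself is as terse as in Typescript; only the declaration of its type needs the interface.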

Still, such languages have come a long way to minimize the companion type syntax. You can declare variables to be of lambda interface types (Java example):

Supplier<String> fn1 = () -> "Hello World";
Consumer<String> fn2 = System.out::println;

But there is still a long way to go (Java 17 example):

var print = System.out::println;
// fails with: method reference needs an explicit target-type
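One workaround, sketched here under the assumption of Java 10 or later (for `var`), is to supply the target type with a cast. `TargetTypeDemo` and `collect` are hypothetical names for this example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class TargetTypeDemo {
    static List<String> collect() {
        List<String> out = new ArrayList<>();
        // The cast provides the target type that `var` alone cannot infer.
        var sink = (Consumer<String>) out::add;
        sink.accept("Hello World");
        return out;
    }
}
```

This compiles, but the interface type still has to be named somewhere, which is the limitation being described.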
S.D.
  • 1
    What exactly do you mean by "*go the Typescript way*"? – Bergi Oct 04 '22 at 20:55
  • yes, but `Consumer` is still an object, not a function. You have to reference the function in it, to explicitly call it. – njzk2 Oct 05 '22 at 11:33
  • @njzk2 That's more an artefact of Java's decision to not open up defining operators on user-defined-types. One can imagine a world where `Consumer::accept` was instead spelled `Consumer::operator()`, and the *definition* of a function was "a object with an `operator()`" – Caleth Oct 06 '22 at 15:20
  • @Bergi Typescript provides constructs for advanced type declaration, user is not forced to declare interfaces or classes only. – S.D. Oct 06 '22 at 17:12
  • @njzk2 Yes it is of functional/lambda interface type. Its still an interface type, requiring `apply`, `get` etc to be called on it. – S.D. Oct 06 '22 at 17:16
2

Without delving into any specific language (yet!), one first reason would probably be ease of implementation (in the compiler/interpreter). It is much simpler to have a single "internal" data type of objects to pass around than to have two different ones.

Functions are a relatively complex data type, compared to integers, strings and whatever "native" data types a language usually has. They have arguments, return values, possibly other attributes. All of that needs to be represented somehow. Then, for a strongly and/or statically typed language like Java, to make a function a real first class citizen, you'd have to implement some kind of inheritance concept for functions; i.e. the same way you can assign any object to a variable of the type Object, but not to a variable of the type Person, you'd probably want to have some feature of this kind for functions as well. Be it some templating scheme or whatever you can think about. Purely functional languages do not have that problem because they start with a concept for exactly this, and then model everything else around it.
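For what it's worth, modern Java approximates this kind of "inheritance for functions" with wildcards on its functional interfaces. The following is only a sketch; `VarianceDemo`, `Person`, `Employee` and `describe` are hypothetical names invented for the example:

```java
import java.util.function.Function;

final class VarianceDemo {
    static class Person {
        final String name;
        Person(String name) { this.name = name; }
    }

    static class Employee extends Person {
        Employee(String name) { super(name); }
    }

    // Accepts any function that can take an Employee, whatever it returns.
    static Object describe(Function<? super Employee, ?> f, Employee e) {
        return f.apply(e);
    }

    static Object run() {
        Function<Person, String> nameOf = p -> p.name;   // declared for Person
        // Function<Employee, Object> direct = nameOf;   // would not compile: generics are invariant
        return describe(nameOf, new Employee("Ada"));
    }
}
```

The wildcards have to be written out at every use site, which hints at how much machinery a statically typed language needs before functions fit into its subtyping rules.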

Finally, there is nothing wrong with representing a function as an object! In a good object-oriented language, everything is an object, including a function, and everything will then flow together nicely. For example, in Ruby, this is exactly the case: there is nothing that is not an object. Classes are objects. Modules (stateless collections of methods) are objects. All "native" data types like integers etc. are objects. You can define functions dynamically/anonymously and pass them around like variables - because they are also objects (which happen to have a call method).

This means you can write code like this (this is an interactive session, but you could use the same code in actual programs):

# Define a method (lambda/proc in Ruby terms) and assign it to variable fn
irb(main):001:0> fn = -> (arg) { puts arg }         
=> #<Proc:0x007faf00413398@(irb):1 (lambda)>

# Call the function (there are syntactic sugars for that if one prefers to make it look like a "normal" method call)
irb(main):002:0> fn.call("hello")
hello

# Ask about the arity
irb(main):003:0> fn.arity
=> 1

# Ask where the definition (the code) is
irb(main):004:0> fn.source_location
=> ["(irb)", 1]

# Ask for the parameter definition
irb(main):005:0> fn.parameters
=> [[:req, :arg]]

# Use it in an array
irb(main):024:0> arr = [1, 2, 3, fn, 4]
=> [1, 2, 3, #<Proc:0x007faf00413398@(irb):1 (lambda)>, 4]

# Ask which class the object has
irb(main):012:0* fn.class
=> Proc

To stick with this, if you check out the documentation for the Proc class you see that a lot of very interesting, cool and powerful possibilities open up once you make functions first-class-citizen objects (here, a Proc includes a closure, or "binding" in Ruby terms). You can literally do whatever you want with it, including extending the definition of the Proc class to add more features on the fly. Or you could create an instance of class Method, which is an object that represents a method call on a given object and otherwise behaves very similarly to a Proc.

If anything, I would ask why these older languages you mention do not have a full-fledged object representation of functions and methods. The answer would simply be that it was not en vogue when those languages were designed, or that there were pre-existing constraints which would have made it too hard. Languages develop over the decades. Modern versions of Java (and presumably C#) have added more functional features, but it is completely understandable why they shied away from it at the beginning: people were still very used to the silly low-level function-pointer hell from C or old C++.

AnoE
  • 2
    "*[functions] have arguments, return values, possibly other attributes*" - did you mean parameter and return *types*? Only a function call has concrete argument and return values. – Bergi Oct 04 '22 at 20:57
0

Java and C# implement instance methods under the hood as the equivalent of a static function with an extra hidden "this" parameter of type "Object", which is automagically converted to the type of the object containing the method. At least in C#, it's possible to create a delegate for a static method, which does not have such a parameter. The first time this is done for any particular function, the system will generate code for a dummy function which accepts the extra parameter but ignores it, and then chains to the static function. Additional requests to create delegates for the same static function will simply reuse the same wrapper. This allows delegates to be invoked without the system having to know or care about whether the target is an instance method or a static method. When invoking a static function through a delegate, calling through the wrapper would be at most slightly more expensive than examining the delegate and determining there was no need to pass "this", but testing whether a delegate invocation requires passing "this" would increase the cost of the common case where it does.
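The same idea is visible at the surface of Java, although this sketch says nothing about how the CLR or the JVM implement it internally. `ReceiverDemo` is a hypothetical name; the three method references show the receiver as an explicit parameter, as a captured value, and absent altogether:

```java
import java.util.function.Function;
import java.util.function.Supplier;

final class ReceiverDemo {
    static int run() {
        // Unbound reference: the receiver ("this") becomes an explicit first parameter.
        Function<String, Integer> unbound = String::length;
        // Bound reference: the receiver is captured, so no parameter remains.
        Supplier<Integer> bound = "hello"::length;
        // Static method: no receiver at all, yet the caller sees the same interface shape as `unbound`.
        Function<String, Integer> parse = Integer::parseInt;
        return unbound.apply("abc") + bound.get() + parse.apply("10");
    }
}
```

The caller of `unbound` and `parse` cannot tell which one targets an instance method, which is the uniformity the wrapper object buys.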

supercat
0

In Objective-C, blocks (closures, called lambdas in other languages) are first-class objects. They can be stored in variables, arrays, sets, dictionaries, or as member variables of a class, and they are reference counted.

The differences from most objects are that closures can be called (you can’t call a String, for example), and that there is an optimisation in Objective-C whereby closures which do not actually capture anything have only one instance, with all information in static global data, and reference counting does nothing.

Apart from that, they are full objects. So you can write a method that tells an object which closure to call before or after any closures that it would call anyway when there is some event, by adding the closure to an array of closures to be called.
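The "array of closures to be called" pattern looks much the same in Java, where the closures are `Runnable` objects. This is a minimal sketch, not Objective-C, and `EventSource`, `addBefore` and `fire` are hypothetical names for the example:

```java
import java.util.ArrayList;
import java.util.List;

final class EventSource {
    private final List<Runnable> before = new ArrayList<>();
    private final List<String> log = new ArrayList<>();

    // Register a closure to run before the event itself is handled.
    void addBefore(Runnable callback) { before.add(callback); }

    void fire(String event) {
        for (Runnable callback : before) callback.run();
        log.add(event);
    }

    List<String> log() { return log; }

    static List<String> demo() {
        EventSource source = new EventSource();
        source.addBefore(() -> source.log.add("before"));  // the lambda captures `source`
        source.fire("click");
        return source.log();
    }
}
```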

gnasher729
-1

I think the premise is wrong, certainly for C#, but the answer is quite simple.

Addressing

If we could declare functions without some kind of containing structure, then how would we address the individual functions? How do you handle the case of importing references from multiple sources if they all define top-level functions that have identical names?

JavaScript is a good example: it deals with this by only allowing access to the last implementation of each function of the same name, which makes the sequence of the imports critical. That leaves it very ambiguous for the developer to know in advance which functions might be affected and what the correct sequence might be.

The issue is simply that in strongly typed languages we have established rules and conventions around the addressing of types and functions. One of those rules is that a function must be contained within a class or type, and that class has a fully qualified name that can be used to reference it; by extension, we can now reference that function.

In these languages, like C#, delegates, lambdas, closures and other forms of functional expressions can be declared but must still be assigned to a variable that is contained within a scope that is ultimately contained within a class. That class could be static and allow access to the function without instantiating an object from that class definition, but it must still be declared within the scope of a class so that we can reference it and, importantly, differentiate specific functions that might have the same or similar names.
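A small Java sketch of that addressing scheme (Java rather than C#; `NamespaceDemo`, `Celsius`, `Fahrenheit` and `convert` are hypothetical names): two functions share the name `convert`, and the enclosing class is what tells them apart when they are referenced:

```java
import java.util.function.Function;

final class NamespaceDemo {
    static final class Celsius {
        // Fahrenheit -> Celsius
        static double convert(double f) { return (f - 32) * 5 / 9; }
    }

    static final class Fahrenheit {
        // Celsius -> Fahrenheit
        static double convert(double c) { return c * 9 / 5 + 32; }
    }

    static double roundTrip(double c) {
        // Same simple name, disambiguated by the containing class.
        Function<Double, Double> toF = Fahrenheit::convert;
        Function<Double, Double> toC = Celsius::convert;
        return toF.andThen(toC).apply(c);
    }
}
```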

Chris Schaller
-2

When "a function is a primitive type," IMHO you're basically going back to the bad old days of "C."

Well, a fundamental characteristic of the "object oriented" paradigm is that the surrounding infrastructure is now stronger and more aware. "Executing code" is always associated with 'some object,' variously called this or self, and is "a method or property" of that object.

If we were still to retain the idea of "a pure-function as a primitive type," we would be unleashing a bull in a china shop. We would be introducing, as they say, "out-of-band behavior." Something that isn't aware of the object-oriented runtime framework and which therefore could very easily disrupt it. (IMHO ...)

Mike Robinson
  • 7
    What does this even mean? A function isn't going to mess up the runtime; it's not like we're talking _inline assembly_. – wizzwizz4 Oct 03 '22 at 09:35
  • 2
    This makes no sense. A trivial bit of executing code, like addition, has no idea of "some object". It works on **two** objects. A less trivial example like Greatest Common Divisor shows that this isn't restricted to some arbitrary set of "trivial operations". And an algorithm like Set Union shows it's not even specific to some arbitrary set of "trivial types". There are textbooks full of provably correct algorithms which all debunk the flawed idea in this answer. – MSalters Oct 03 '22 at 13:06
  • I'll take these criticisms and now respond to them. (And yes, these are just my opinionated opinions as are yours.) "Simple functions" are one thing – every language on earth of course has them – but "functions *as a primitive type"* are something different. At least, as I understand the term. In fact ("as I understand the term") OOP very much embraced the idea of "functions as a type," but bound them firmly to the object model. Perhaps it is simply a difference of understanding and/or terminology. – Mike Robinson Oct 03 '22 at 19:45