What is late binding?

Question

I know there are lots of sources on the internet, but I do not understand them.

Wikipedia: "Late binding, dynamic binding, or dynamic linkage is a computer programming mechanism in which the method being called upon an object or the function being called with arguments is looked up by name at runtime".

I do not understand: how can a function be looked up by name at runtime? It is always looked up by address.

Example:

class a{
public:
virtual void b(){
    int c = 40845343;
}
};
int main()
{
    a d = a();
    d.b();
    return 0;
}

The function "b" and the call to it are compiled into the following:

CALL 00438548

...

00438548: 
PUSH EBP
MOV EBP, ESP
SUB ESP,14
MOV DWORD[EBP-14], ECX
MOV DWORD[EBP-4], 26F401F
NOP
LEAVE
RETN

The function is called by address, not by name.

How should I understand what late binding is?

Add some inheritance and virtual functions and see what happens. — Philip Kendall, Feb 25 '19 at 18:36
Be aware that compilers also optimize. There is only ever one function definition that can be called to satisfy `d.b()`, so the optimizer ignored any late binding and reduced the code to the fastest running implementation. You need to give the optimizer a reason to leave the code alone. — Berin Loritsch, Feb 25 '19 at 18:39
Seeing the tags you added to your question, I guess your issue is that you are far too focused on C++ and its standard mechanism for calling functions. "Function lookup by name" can be understood much more easily by looking at programming languages which have direct support for this, like Python, VB.NET or modern versions of C# (Java and older versions of C# support this by reflection). — Doc Brown, Feb 25 '19 at 19:21
In your example, it is trivially easy to prove that there can only be one single possible binding, therefore any compiler worth its salt will optimize the late binding away and emit an early binding. — Jörg W Mittag, Feb 25 '19 at 19:22
See Wikipedia on [Eval](https://en.wikipedia.org/wiki/Eval). — Doc Brown, Feb 25 '19 at 20:59
"In C++, late binding (also called "dynamic binding") refers to what normally happens when the virtual keyword is used in a method's declaration. C++ then creates a so-called virtual table, which is a look-up table for such functions that will always be consulted when they are called. Usually, the "late binding" term is used in favor of "dynamic dispatch"." More here: https://en.wikipedia.org/wiki/Late_binding — NoChance, Feb 27 '19 at 13:22

Erik Eidt · Answer 1 · 2019-02-25T19:38:01.157

Dynamic dispatch in C++, Java, and C# is done via tables — that map a slot position or index to a function pointer — rather than by name. Names are resolved at compile time, and they are assigned slots in tables as needed. In your case, virtual b introduces a virtual method, which allocates a new slot in what's called the vtable (virtual dispatch table) for class a. An override of b in a subclass of class a would share that same slot (the index/position): it is that slot sharing that implements dynamic dispatch in these languages. The actual instance type determines which table to use and the slot position for lookup comes from the particular method being invoked.

Your example is too simple, however. The variable in question has an exact type, so the compiler knows up front which method to call. While it could use the dynamic dispatch mechanism, there's no reason to do it here since there is only one possible answer of what function to call. And while the larger program does not necessarily preclude subclasses of class a, the compiler knows that the type of variable d is exactly class a and not any such subclass. Further, such compiler optimization can be done with local analysis rather than requiring any complicated whole program analysis.

Perhaps if you wrote a function that took a reference or pointer to an a and then called b on that, the compiler would have to allow for the possible existence of subclasses, and for the possibility that the parameter refers to one of them rather than to an exact a instance.
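A minimal sketch of that suggestion, loosely based on class a from the question. The subclass a2 and the helper call_b are hypothetical names introduced only for illustration, and b is changed to return a string so that the chosen override is easy to observe:

```cpp
#include <string>

// Same shape as class a in the question, except that b() returns a
// string so the selected override is visible to the caller.
class a {
public:
    virtual std::string b() { return "a::b"; }
    virtual ~a() = default;
};

// Hypothetical subclass: its b() overrides a::b and shares its slot.
class a2 : public a {
public:
    std::string b() override { return "a2::b"; }
};

// call_b only knows it has "some a". Unless the compiler can see the
// callers and prove the dynamic type, it has to dispatch dynamically.
std::string call_b(a& obj) {
    return obj.b();
}
```

Passing an a object to call_b yields "a::b" and passing an a2 object yields "a2::b", even though call_b is compiled only once. If you compile call_b on its own and look at the assembly, you will most likely see an indirect call through a table instead of a CALL to a fixed address, though the exact instructions depend on the compiler and optimization settings.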

(However, be aware that compilers can use stubs as well, and these stubs can hide some of the dispatch from the direct line of generated code in the method, possibly making them look like direct calls by address at first glance. While not the case in your example, I just mention that things aren't always as they seem.)

Since you mentioned C#: since C# 4.0, there exists the type `dynamic` which allows name lookup of methods, which do not have to be known at compile time. This feature was probably inherited from the VB.NET world, which inherited it from VB6 and COM. — Doc Brown, Feb 25 '19 at 20:54

John Wu · Answer 2 · 2019-02-25T23:46:23.867

There are several types of binding and "late binding" may refer to several different things, depending on the technology and the context. It may be helpful to avoid the word "late" (and its opposite, early binding) and focus on specific types of bindings.

Compile-time binding means that a symbol or entry point is bound to a specific address when the code is compiled. This is actually pretty rare because addresses tend to be moveable in modern operating systems. It is common mostly in firmware and low-level code.

Load-time binding means that a symbol or entry point is bound when the code is loaded into memory. The symbol is typically associated with an offset that is added to a base address chosen by the operating system. This type of binding is what people usually mean when they say "early binding" when talking about C++. This type of binding is very, very common in older binary executables.

VMT binding means that calls to a method will go through a run-time virtual method table or VMT. A copy of the table is associated with the class, and each object reference refers to the copy. When you call an object's methods, the runtime locates the appropriate table for the object, then locates the appropriate entry in the table for the method (not necessarily by method name; it is usually by ordinal position). By using the table, an object in C++ can mix and match addresses from a base class with a derived class, allowing for run-time polymorphism. This is what object-oriented folks mean when they say "late binding."

Dispatch binding. In a COM (Component Object Model) context, "early" binding actually refers to VMT binding (which is called late binding outside the COM world). True early binding is not possible with COM. The term "Late binding" in a COM context refers to dispatch binding, a process where the entry point is retrieved by interrogating an interface with the symbol name.

Reflection is another type of binding in .NET applications where the entry point is determined by interrogating the type system with the symbol name. This type of binding might be thought of as "late late binding" because it is even less efficient than traditional late binding.

All that being said, I believe people tend to use the term "late binding" to refer to a situation where a mismatch between type and call does not cause a compile-time error, i.e. they mean any type of binding that could fail at run-time. Any of the above could fail at runtime other than compile-time and load-time binding.
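To make "interrogating with the symbol name" concrete, here is a small sketch of lookup by name. It is not COM's dispatch interface or .NET reflection; NamedDispatcher, add, and invoke are hypothetical names, and the point is only that the method name is ordinary data, so a bad name cannot be caught before run time:

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for an object whose methods are found by name
// at run time (roughly the idea behind dispatch binding and reflection).
class NamedDispatcher {
public:
    // Register a callable under a textual method name.
    void add(const std::string& name, std::function<std::string()> fn) {
        methods_[name] = std::move(fn);
    }

    // Look the method up by name. A missing name can only be detected
    // here, at run time, because to the compiler the name is just a string.
    std::string invoke(const std::string& name) const {
        auto it = methods_.find(name);
        if (it == methods_.end()) {
            throw std::runtime_error("no such method: " + name);
        }
        return it->second();
    }

private:
    std::map<std::string, std::function<std::string()>> methods_;
};
```

After registering a callable under "greet", invoke("greet") runs it, while invoke("missing") compiles without complaint and throws only when executed. That is the "could fail at run time" property described above, and the string search is also why this style of binding costs more than an indexed table lookup.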

score 0 · Answer 3 · answered Feb 27 '19 at 02:56

First, let's start with a short demonstration of what late binding is, does, and accomplishes:

#include <iostream>

class base {
public:
    virtual void f() { std::cout << "base::f()\n"; }
    void g() { std::cout << "base::g()\n"; }
    virtual ~base() {}
};

class derived : public base { 
public:
    virtual void f() { std::cout << "derived::f()\n"; }
    void g() { std::cout << "derived::g()\n"; } // hides base::g; not an override
};

int main() { 
    base *b = new base;          // first case: pointer to base, base object
    derived *d = new derived;   // second case: pointer to derived, derived object
    base *bd = new derived;       // third case: pointer to base, derived object

    b->f(); // invokes base::f
    b->g(); // invokes base::g

    d->f(); // invokes derived::f
    d->g(); // invokes derived::g

    bd->f(); // invokes derived::f
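    bd->g(); // invokes base::g

    // release the objects; base has a virtual destructor, so deleting
    // a derived object through a base pointer is well defined
    delete b;
    delete d;
    delete bd;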
}

So in this case, note that f is a virtual function, and g is a non-virtual function.

When we invoke g, the function that gets invoked is determined by the type of the pointer (or reference) we start from. So, if we use a pointer to base, we invoke the base class function, and if we use a pointer to derived, we invoke the derived class function. So, being non-virtual, g is early-bound. That means the function that will be invoked is determined at compile time based solely upon the type of the pointer being used.

The first two cases with f aren't much more interesting. If we have a pointer to the base class pointing to an object of the base type, when we call f() we (of course) get base::f(), just like we did with g(). Likewise, when we have a pointer to the derived type, and we invoke f(), we get the derived::f(). Not much to see yet.

The third case is where things get interesting: we have a pointer to the base type but it's referring to an object of the derived type. As noted above, when we invoke g(), the compiler determines the function to invoke based on the type of the pointer, so we invoke base::g(). But with f() (because it's marked virtual) we get late binding. The function to call is chosen at run time, based not on the type of the pointer, but instead on the type of the object it points at. Even though we have a base-class pointer, we invoke the function in the derived class, because we're dealing with an object of the derived class.

As to how this is done: at least in C++, it's normally done using a vtable. When you invoke a member function, your function receives a hidden parameter named this. If the class defines (non-static) member variables, those variables are accessed by taking an offset from the address in the this pointer--i.e., a pointer to the data storage for the object. If the class contains at least one virtual function, the object will also contain a (hidden) pointer to a vtable for that class. So let's consider code like this:

class A {
    int x;
public:
    virtual void foo() { std::cout << "base::foo()\n"; }
    virtual void bar() = 0;
    virtual ~A() {}
};

class B : public A {
    int y;
public: 
    virtual void bar() { std::cout << "Derived::bar()"; }
    virtual void baz() { std::cout << "Added function"; }
};

int main() { 
    // A a;  // would not compile: A is abstract because bar() is pure virtual
    B b;
}

This will typically be laid out in memory something like this:

[Figure not preserved: memory layout of the objects and their vtables]

So, when we invoke a member function, it receives this as a hidden parameter. That points to the object in memory. When we invoke a virtual member function, the compiler generates code to get the correct offset in that object for the vtable pointer. Then it dereferences the vtable pointer to get the vtable for the type of object being referenced. Finally, it looks at a specified offset in the vtable to get to the correct function being invoked.
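Those three steps can be imitated by hand. This is only a sketch of the idea, assuming a layout where the table pointer is an ordinary first member; VTable, Object, and the free functions are hypothetical names, and real compilers are free to arrange things differently. Unlike A above, the base table here gets a concrete bar so that both tables can be exercised:

```cpp
#include <string>

struct Object;  // forward declaration so the table can mention it

// One slot per virtual function; the slot order is fixed at compile time.
struct VTable {
    std::string (*foo)(const Object*);  // slot 0
    std::string (*bar)(const Object*);  // slot 1
};

// Every object carries a pointer to the table of its dynamic type.
struct Object {
    const VTable* vptr;
    int x;
};

// Implementations; each receives the object as an explicit "this".
std::string base_foo(const Object*) { return "base::foo()"; }
std::string base_bar(const Object*) { return "base::bar()"; }
std::string derived_bar(const Object*) { return "Derived::bar()"; }

// The derived table reuses base_foo and replaces only the bar slot.
const VTable base_table    = { base_foo, base_bar };
const VTable derived_table = { base_foo, derived_bar };

// A "virtual call": load vptr from the object, pick the slot, call it.
std::string call_bar(const Object* self) {
    return self->vptr->bar(self);
}
```

An Object whose vptr is &base_table makes call_bar return "base::bar()", and one whose vptr is &derived_table makes it return "Derived::bar()". No name is consulted at run time: call_bar always reads the same slot, and only the table it reads from differs.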
