8

`return this` (or a similar construct) allows method chaining. Its absence is painful, because you have to write code like this (C#):

var list = new List<string>();
list.Add("hello");
list.Add("world");

instead of

list.Add("hello").Add("world");

Elixir solves this nicely for function chaining: instead of relying on the callee, it relies on the caller (forgive my mistakes, I don't know Elixir):

list |> add("hello") |> add("world");

But I have just read this sentence on Wikipedia:

Returning an object of built-in type from a function usually carries little to no overhead, since the object typically fits in a CPU register.

On one hand, the callee does not know whether the result will be used; on the other hand, the caller cannot stop the callee from setting the result value. So I am skeptical about this "little", but "no overhead"?

Thus MY QUESTION for this very particular pattern (i.e return this with method chaining) -- can it be optimized with no overhead? How?

Question by example -- say I write a framework and sprinkle `return this` over every possible method just to enable method chaining. The question arises -- will a user who does not use method chaining pay the price of lowered performance? How could a compiler optimize the code so that this feature has zero cost?

Update after first 2 comments -- "cheap"!="free", so maybe another perspective for my question, why the difference "little" vs. "no cost". If it can be guaranteed it is at no cost, we write "no cost", period. So I assume it cannot be guaranteed, thus "little".

Clarification: I am not asking how to create another Elixir-like syntax in another language. I am asking how it is possible for a compiler to optimize the callee-caller interaction for `return this` + method chaining (or the lack of it, when not used).

greenoldman
  • 2
    Actually, I think fitting in a CPU register is for free. – Alex Terreaux Jan 27 '16 at 16:23
  • 2
    returning an object reference is very cheap. – Erik Eidt Jan 27 '16 at 16:25
  • You ask an interesting question since if `foo.Bar(); foo.Qux();` is equivalent to `foo.Bar().Qux()` then you already have the object at the point you are going to invoke it again. However, if you look at [this example](http://programmers.stackexchange.com/a/289442/40980), even with the `return this;` there is type information that may be associated with the call that is not trivial to work with. This (pun not intended) needs some further investigation. –  Jan 27 '16 at 17:49
  • 4
    Are you doing this 1000s of times in a loop for a 3D video game? If not, the extra nanosecond is irrelevant. Downvote. – user949300 Jan 27 '16 at 18:32
  • 1
    Are you aware the wikipedia quote has nothing to do with fluent interfaces? – Winston Ewert Jan 28 '16 at 03:40

8 Answers

10

First off,

var list = new List<string>();
list.Add("hello");
list.Add("world");

is just as readable as, if not more readable than,

var list = new List<string>().Add("hello").Add("world");

Line count is not, in any way, a proxy for code's cleanliness or simplicity.

Thus my question for this very particular pattern (i.e "return this" for method chaining) -- can it be optimized with no overhead? How?

So theoretically, sure. This would be similar to tail call optimization, where you wouldn't need to completely unwind the stack frame when moving between functions: you could leave the `this` argument where it is when you're done (or just peek rather than pop it in a stack-based model). The challenge is that you can only do that if you know the next call will be a chain. Since functions don't really know about their callers, they don't know whether they should clean up after themselves or not. You could have some internal flag, which the function could check to know which path to take, but that would be more expensive than just passing in `this` all the time.

The compiler could also make two versions of the function, and pick the right one at compile time. That would bloat the resulting executable and slow compilation, but would probably shave an instruction or two off the actual runtime.
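The "two versions" idea can be sketched by hand in C#. The `Counter` class and its method names are hypothetical, invented only for illustration; in the scenario described above the compiler would generate both variants itself and choose between them at each call site.

```csharp
// Hypothetical class for illustration; not from the original post.
public sealed class Counter
{
    public int Total { get; private set; }

    // Variant for call sites that ignore the result:
    // there is no return value to produce at all.
    public void Add(int amount)
    {
        Total += amount;
    }

    // Variant for chained call sites: same work, plus the one extra
    // step of handing the receiver back to the caller.
    public Counter AddChained(int amount)
    {
        Add(amount);
        return this;
    }
}
```

A caller writing `c.Add(1);` would get the first variant, while `c.AddChained(2).AddChained(3);` would get the second.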

Telastyn
  • Thank you. Let me ignore the opening paragraph, because it shouldn't happen in the first place (I hope you agree after some thought) -- and I skip to the meat part :-) That is my concern, "internal flag", keeping something in register, two functions, it is all that comes with the cost. It is not that hard to imagine a function, which ends up with some computations irrelevant to `this`, thus even such simple operation as pushing it back to register is a cost. – greenoldman Jan 27 '16 at 18:23
  • @greenoldman - I'm sorry, I do not follow. – Telastyn Jan 27 '16 at 18:29
  • The second part? I mean you explained it well, how can it be done, to minimize the cost, but it is still not a "no cost" price for the feature. – greenoldman Jan 27 '16 at 19:25
  • 2
    @greenoldman - compared to what? There is _always_ a cost for something, otherwise it wouldn't be a thing. – Telastyn Jan 27 '16 at 20:42
  • Compared to Elixir approach -- i.e. everything is on caller side. As for cost that is why I am asking, wikipedia claims "no cost", and for me it is crucial to know if it is possible to optimize `return this` approach to no cost if not used. – greenoldman Jan 28 '16 at 10:37
  • @greenoldman - Elixir is just syntax. There is still cost there, just as much as method chains in C#. They both have to compile down to the same thing... – Telastyn Jan 28 '16 at 12:34
  • so it is literally no cost? Because Elixir does not rely on callee, only on caller. If you state that it is compiled to the same thing, it means, that `return this` with no method chaining is optimized in such way, that there is absolutely no overhead. How this can be done? – greenoldman Jan 28 '16 at 20:02
  • @greenoldman - "does not rely on callee, only on caller" - what do you mean by this? That it does not care about the return value? `list.Add("Hello"); list.Add("World");` is the equivalent C# code in that case. `|>` doesn't eliminate the need to grab `list` and pass it into the function call - it's just sugar so you don't have to type that again. – Telastyn Jan 29 '16 at 00:57
  • Exactly, so when I call `Add` **once** in C# case, compiler does not need to keep `list` in register after call, because it is no longer used. With `return this` it would have to pass it back, because **maybe** it will be used. – greenoldman Jan 29 '16 at 07:51
  • @greenoldman - practically, compilers for any (non-continuation passing style) language remove the arguments from their registers at the end of their functions. "Used" or not does not matter. C# or Elixir makes no difference. – Telastyn Jan 29 '16 at 12:43
  • So considering two functions `foo` with only difference in `return this` at the end (one has it, the other not), the performance of those functions is **exactly** the same. Correct? So, how compiler can optimize `return this` in such way it is not executed at all -- after all `this` argument does not have to be stored at the register at the point of exit, so it might have to fetch it back from memory. – greenoldman Jan 30 '16 at 07:49
  • @greenoldman - Not correct. `return this` is an extra step. `after all this argument does not have to be stored at the register at the point of exit` - but **it does**. The `Add` function can't know how it's being called, so can't assume the return value doesn't matter. Which leads back to my answer. – Telastyn Jan 30 '16 at 14:17
  • Which leads to *" "Used" or not does not matter."* is false. Because in `return this` approach `this` has to be returned always (extra step), while in Elixir approach the caller argument is passed in chain depending whether there is chain at all. If not, it is not passed to next step because there is none. Thus `return this` approach is less efficient. Thus C# or Elixir **does** matter because C# does not have mechanism of direct caller passing. – greenoldman Jan 30 '16 at 14:40
  • 1
    @greenoldman - Look. Go, decompile that Elixir code. I will bet you money that each call to add returns the list, and it is then re-passed into each step in the chain. There is nothing magical happening there but reordering of the arguments. – Telastyn Jan 30 '16 at 15:03
  • Ah, my mistake with giving Elixir example, I incorrectly thought it uses fixed LHS expression for all consecutive calls, while in fact it "repacks" `return this` approach with different syntax (so the difference is syntax only). Sorry for that. – greenoldman Jan 31 '16 at 08:36
3

At least in C#, your desired construct is entirely unnecessary.

You can write something like

var mylist = new List<string>(new[] {"Hello", "World"});
...
mylist.AddRange(new[] {"Add", "Some", "More", "Items"});

which is about as concise as you can get, while still being perfectly readable.

If you really want method chaining in a class, even one that you don't have the source for, you can use extension methods.

public static class ListExtender
{
    public static IList<T> ChainedAdd<T>(this IList<T> list, T item)
    {
        list.Add(item);
        return list;
    }
}

then

 mylist.ChainedAdd("x").ChainedAdd("y").ChainedAdd("z");
Peregrine
  • It seems there is a misunderstanding (see my clarification), I am not asking how to make syntax clearer or anything like this. – greenoldman Jan 27 '16 at 18:16
3

Returning an object of built-in type from a function usually carries little to no overhead, since the object typically fits in a CPU register.

The point of this comment from Wikipedia isn't that returning the value has no cost. Overhead is additional cost, and here the cost is in addition to that of the return-value mechanism. The article is contrasting this with the additional work required for objects that have to be copied when they are returned.

Thus MY QUESTION for this very particular pattern (i.e return this with method chaining) -- can it be optimized with no overhead? How?

In general, no. In particular cases (such as where the function can be inlined) it can be.

Winston Ewert
3

can it be optimized with no overhead? How?

As Telastyn wrote, one approach is to have a compiler providing two function versions. If I were in the role of a compiler designer, I would handle it this way:

  • I would build a compiler with inlining support. Such a tool can obviously optimize the `return this` statement out when it is not used.

  • and if this hypothetical compiler decides not to inline a specific function, optimizing out the machine code equivalent of `return this` is definitely not worth it (if it were, the compiler would inline).

So this does not lead to "zero overhead" in a pure mathematical sense, but to "zero overhead" for any practical purpose.
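As a rough illustration of the inlining route in C#: .NET lets a method request inlining through `MethodImplOptions.AggressiveInlining`, which is only a hint that the JIT may ignore. The `Builder` and `InliningDemo` types below are hypothetical. If `Append` does get inlined at a call site that discards the result, the `return this` has no observable effect there and an optimizer is free to drop it.

```csharp
using System.Runtime.CompilerServices;
using System.Text;

// Hypothetical fluent type for illustration.
public sealed class Builder
{
    private readonly StringBuilder _text = new StringBuilder();

    // A hint only: the JIT decides whether to inline.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public Builder Append(string s)
    {
        _text.Append(s);
        return this;
    }

    public override string ToString() => _text.ToString();
}

public static class InliningDemo
{
    // The caller ignores the returned reference on every call.
    public static string Unchained()
    {
        var b = new Builder();
        b.Append("hello ");
        b.Append("world");
        return b.ToString();
    }

    // The caller uses the returned reference to continue the chain.
    public static string Chained()
    {
        return new Builder().Append("hello ").Append("world").ToString();
    }
}
```

Both methods build the same string; whether their machine code ends up identical depends on the JIT and is not guaranteed by this sketch.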

Doc Brown
  • Thank you very much! And yes, I am here very strict, because no matter how small fraction of time any op is, time sums up (except for zero). So for now I assume wikipedia was not strict there ("no cost"). – greenoldman Jan 28 '16 at 10:39
  • @greenoldman: *"no matter how small fraction of time any op is, time sums up"* - this is wrong when you do not measure absolute time but relative time. If a function call becomes slower by 0.001%, calling the same function a million times won't change this. If a program runs 10 hours or 10 hours + one second does not matter for any practical purposes. – Doc Brown Jan 28 '16 at 21:38
  • I assume you estimated 1 million calls * 0.001% of the function execution time = 1 second, correct? – greenoldman Jan 29 '16 at 07:55
  • @greenoldman: I did not really "estimate", I was just throwing some numbers into my comment to illustrate the statement. – Doc Brown Jan 29 '16 at 15:32
2

If the implementation uses a calling convention that passes the `this` pointer to a function in the same register it uses to return a value from the function, then `return this` is a zero-cost operation (because the return value is already there).

dan04
  • 1
    Good point, thank you, but it would mean either reserving the register for it (practically impossible) or to set the register just before return just to satisfy the convention, and the caller, which could not use this data. – greenoldman Jan 28 '16 at 10:42
2

A compiler could guarantee that the `return this` + function chaining pattern never has worse performance than calling multiple methods on the same object, by always rewriting the former pattern into the latter.

But I think it would be a relatively complicated rewrite and I'm not sure it would be worth the effort. Especially since I believe the `return this` pattern has the potential to be more efficient, because it means that `this` doesn't have to be remembered by both the caller and the callee: the caller can forget the object while the chain is executing.

Note that the compiler is unlikely to recognize this. So, to take advantage of this (theoretical) optimization, instead of writing:

var list = new List<string>();
list.Add("hello").Add("world");
foo(list);

You would write:

foo(new List<string>().Add("hello").Add("world"));
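The rewrite mentioned at the start of this answer can be illustrated by hand. `FluentList` is a hypothetical wrapper (the real `List<T>.Add` returns `void`), and `RewriteDemo` shows a chained expression next to the form a compiler could mechanically turn it into.

```csharp
using System.Collections.Generic;

// Hypothetical fluent wrapper for illustration.
public sealed class FluentList
{
    private readonly List<string> _items = new List<string>();

    public IReadOnlyList<string> Items => _items;

    public FluentList Add(string item)
    {
        _items.Add(item);
        return this;
    }
}

public static class RewriteDemo
{
    // The chain as a user would write it.
    public static FluentList Chained()
    {
        return new FluentList().Add("hello").Add("world");
    }

    // The rewritten form: one temporary, every return value ignored.
    public static FluentList Rewritten()
    {
        var tmp = new FluentList();
        tmp.Add("hello");
        tmp.Add("world");
        return tmp;
    }
}
```

The rewrite is only valid when the compiler can prove that `Add` always returns its receiver, which is part of what would make it complicated in general.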
svick
  • Thank you, but you are forgetting something: "I believe the return this pattern has a potential to be more efficient". It cannot be even more efficient than no `return this`. – greenoldman Jan 28 '16 at 19:58
  • @greenoldman It can, I tried to explain this: with `return this`, the caller doesn't have to remember the object, thus saving one variable/register, which can increase performance. – svick Jan 28 '16 at 20:01
  • 1
    @svick But the *receiver* does need to save the value, which may decrease performance. – Jules Jan 29 '16 at 07:55
  • @Jules Sure, I was objecting to the assertion in the question that `return this` is always worse. I'm just claiming that in some circumstances, it can be better, not that it's always better. – svick Jan 29 '16 at 13:59
1

Dart has this feature. So you can do

StringList myList = new StringList()
    ..Add("hello")
    ..Add("world");

Here we can treat F(expr..m(...)) as syntax sugar for:

var _generated_temp = expr;
_generated_temp.m(...);
F(_generated_temp);

You can see Dart also allows setting properties to be chained in the same way:

Person myPerson = new Person()
    ..Name = "Jane Doe"
    ..Age  = 31;

C# gives you a subset of these features with initialization syntax. So you can translate the above into

StringList myList = new StringList { "hello", "world" };

Person myPerson = new Person{ Name = "Jane Doe", Age = 31 };

However, this syntax sugar is available only at initialization and not available for arbitrary method calls, so it is strictly less expressive than Dart's version.

So it is possible to offer fluent interfaces as a language feature, which would remove any performance cost. However, the performance cost of a fluent interface is so negligible as to be effectively nil, so one should not argue for the feature on that basis. (Note that it might in theory be possible for non-virtual methods that just `return this` to be optimized away. This optimization would not be worth performing.) Rather, the benefit would be the opportunity to use a fluent style even with an API that was not designed for it.

walpen
  • It seems there is misunderstanding (see my clarification), I am not asking how to make syntax clearer or anything like this. – greenoldman Jan 27 '16 at 18:15
  • This isn't really about making the syntax clearer. This is about adding a language feature that would make the use of the "return this" idiom unnecessary, thus avoiding any need to optimize it. – Jules Jan 29 '16 at 07:50
0

In theory, yes. C++ has a term for it: copy elision. The C++ standard allows for copy elision. The concept you are asking about, in fact, has its own TLA: RVO (return-value optimization).

Essentially, the compiler can (but, unfortunately, is not required to) decide on the space where the return value will reside before the function is called. This avoids pushing the return value on the stack and then copying it to this space. The compiler can generate the function code in such a way that instead of changing the value to be pushed on the stack it gets written directly to the space where that stack value will get copied after the function returns.

The reason why this cannot always be done (i.e., by default) is that the space where the value is to be copied may contain information which is used during the execution of the function. But there are well-known methods to get around that, too. For example, copy-on-write is what operating systems do to blocks of memory shared by multiple processes. Compilers can, in the same way, do copy-on-write of space which is to be populated at the return of a function.

  • `this` is just a pointer. You don't need copy elision exceptions to be allowed to optimize it away. If the caller doesn't look at the `return` value, the “as-if” rule is all you need. And given that it can be passed in a single register, RVO doesn't seem very helpful either. – 5gon12eder Jan 28 '16 at 06:06
  • @5gon12eder, RVO is what allows it to be passed in the register or written directly in case of context switch. Otherwise, `return this` would have to push `this` on the stack before the function returned. – Dmitry Rubanovich Jan 28 '16 at 10:39
  • My point is that the C++ standard isn't concerned about calling conventions, registers or the stack. And since a pointer is trivially copyable, an implementation has all the freedom it needs to copy (or not copy) it however it likes. It doesn't need exceptional permission to elide copies beyond the “as-if” rule as for RVO. – 5gon12eder Jan 28 '16 at 11:17
  • @5gon12eder, what if you are not doing `return this`, but something like `return weak_this.lock()`, where `weak_this` is a weak_ptr of a shared_ptr tracking `this`? It's not trivial at that point. And this can still be used to chain functions on objects tracked by shared_ptr's. I think I am mostly playing devil's advocate at this point, though. I think you are probably right about copy elision not being necessary in this case. – Dmitry Rubanovich Jan 28 '16 at 12:44