10

I have the following extension method:

    public static IEnumerable<T> Apply<T>(
        [NotNull] this IEnumerable<T> source,
        [NotNull] Action<T> action)
        where T : class
    {
        source.CheckArgumentNull("source");
        action.CheckArgumentNull("action");
        return source.ApplyIterator(action);
    }

    private static IEnumerable<T> ApplyIterator<T>(this IEnumerable<T> source, Action<T> action)
        where T : class
    {
        foreach (var item in source)
        {
            action(item);
            yield return item;
        }
    }

It just applies an action to each item of the sequence before returning it.

I was wondering whether I should apply the Pure attribute (from the ReSharper annotations) to this method, and I can see arguments for and against it.

Pros:

  • strictly speaking, it is pure; just calling it on a sequence doesn't alter the sequence (it returns a new sequence) or make any observable state change
  • calling it without using the result is clearly a mistake, since it has no effect unless the sequence is enumerated, so I'd like ReSharper to warn me if I do that.

Cons:

  • even though the Apply method itself is pure, enumerating the resulting sequence will make observable state changes (which is the point of the method). For instance, items.Apply(i => i.Count++) will change the values of the items every time it's enumerated. So applying the Pure attribute is probably misleading...

What do you think? Should I apply the attribute or not?

Thomas Levesque
  • Related: http://stackoverflow.com/questions/23997554/could-pureattribute-only-be-guaranteed-when-manipulating-primitive-types – Den Aug 20 '14 at 19:24

5 Answers

20

I disagree with both Euphoric and Robert Harvey's answers. Absolutely that is a pure function; the problem is that

It just applies an action to each item of the sequence before returning it.

is very unclear about what the first "it" means. If "it" means one of those functions, then that's not right; neither of those functions does that; the MoveNext of the enumerator of the sequence does that, and it "returns" the item via the Current property, not by returning it.

Those sequences are enumerated lazily, not eagerly, so it is certainly not the case that the action is applied before the sequence is returned by Apply. The action is applied after the sequence is returned, if MoveNext is called on an enumerator.

As you note, these functions take an action and a sequence and return a sequence; the output depends on the input, and no side effects are produced, so these are pure functions.

Now, if you create an enumerator of the resulting sequence and then call MoveNext on that iterator then the MoveNext method is not pure, because it calls the action and produces a side effect. But we already knew that MoveNext was not pure because it mutates the enumerator!
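The laziness described above can be checked with a small sketch. It assumes a simplified copy of the question's methods (the `CheckArgumentNull` helper is replaced by inline checks) and a hypothetical `calls` counter used only to observe when the action runs.

```csharp
using System;
using System.Collections.Generic;

public static class LazyApplyDemo
{
    // Simplified copy of the question's Apply (null checks inlined; an assumption).
    public static IEnumerable<T> Apply<T>(this IEnumerable<T> source, Action<T> action)
        where T : class
    {
        if (source == null) throw new ArgumentNullException("source");
        if (action == null) throw new ArgumentNullException("action");
        return ApplyIterator(source, action);
    }

    private static IEnumerable<T> ApplyIterator<T>(IEnumerable<T> source, Action<T> action)
        where T : class
    {
        foreach (var item in source)
        {
            action(item);
            yield return item;
        }
    }

    public static void Main()
    {
        var calls = 0; // hypothetical counter, only here to observe the action
        var items = new List<string> { "a", "b", "c" };

        // Calling Apply builds the sequence object; the iterator body has not run yet.
        var applied = items.Apply(s => calls++);
        Console.WriteLine(calls); // 0

        using (var enumerator = applied.GetEnumerator())
        {
            Console.WriteLine(calls); // 0: creating the enumerator runs nothing either
            enumerator.MoveNext();
            Console.WriteLine(calls); // 1: the side effect happens in MoveNext
        }
    }
}
```

The action only runs once `MoveNext` is called, which is the method that was never pure to begin with.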

Now, as for your question, should you apply the attribute: I would not apply the attribute because I would not write this method in the first place. If I want to apply an action to a sequence then I write

foreach(var item in sequence) action(item);

which is nicely clear.

Eric Lippert
  • Thanks Eric! I followed roughly the same reasoning; I do think the method is pure (even though I accepted an answer that said otherwise), but I didn't apply the attribute, because it would be misleading. I also considered not writing this method at all, but sometimes I need to mutate the items "on the fly", before applying more operators to the sequence, e.g. `items.Apply(something).Where(...).GroupBy(...)...`. I could use a loop instead, but it would be less convenient. – Thomas Levesque Jun 06 '14 at 18:05
  • 2
    I guess this method falls in the same bag as the `ForEach` extension method, which is intentionally not part of Linq because its goal is to produce side effects... – Thomas Levesque Jun 06 '14 at 18:07
  • 1
    @ThomasLevesque: My advice is to **never ever do that**. A query should *answer a question*, not *mutate a sequence*; that's why they're called *queries*. Mutating the sequence as it is queried is *extraordinarily dangerous*. Consider for example what happens if such a query is then subjected to multiple calls to `Any()` over time; the action will be performed again and again, but only on the first item! A sequence should be a sequence of *values*; if you want a sequence of *actions* then make an `IEnumerable<Action>`. – Eric Lippert Jun 06 '14 at 18:48
  • 2
    This answer muddies the waters more than it illuminates. While everything you say is unquestionably true, the principles of immutability and purity are high-level programming language principles, not low-level implementation details. Programmers working at a *functional* level are interested in how their code *behaves at the functional level,* not whether or not its inner workings are *pure.* They're almost certainly *not pure* under the hood if you go low enough. We all generally run these things on Von Neumann architecture, which is most certainly not pure. – Robert Harvey Jun 12 '14 at 23:41
  • 1
    @robertharvey no, they are low-level implementation details. Whether a method is pure from my perspective has nothing whatsoever to do with the psychological feelings that developers have about a method. It's entirely about what optimizations a compiler or runtime is allowed to make. Can it memoize the method? Can it execute it lazily? Can the method be split into little pieces and parallelized to multiple threads? And so on. – Eric Lippert Jun 13 '14 at 05:34
  • -1: I cannot believe you claim the function is pure. If `action` modifies `item` (it can), it's end of story. Now if you are able to restrict `action` to be pure, then you're onto something. (I'm no C# expert, but I don't believe its type system is powerful enough to denote such a restriction.) – Thomas Eding Aug 20 '14 at 19:28
  • 2
    @ThomasEding: The method doesn't call `action`, so the purity of `action` is irrelevant. I know it *looks* like it calls `action`, but this method is a syntactic sugar for two methods, one which returns an enumerator, and one which is the `MoveNext` of the enumerator. The former is clearly pure, and the latter clearly is not. Look at it this way: would you say that `IEnumerable ApplyIterator(whatever) { return new MyIterator(whatever); }` is pure? Because that's the function that this really is. – Eric Lippert Aug 20 '14 at 20:10
  • @EricLippert: I agree with your simplified example in the above comment. But unless I'm missing something, `ApplyIterator` calls `action` before it has a chance to yield, assuming the collection is non-empty. And since `Apply` calls `ApplyIterator`, that single call to `action` will be performed. – Thomas Eding Aug 20 '14 at 21:31
  • 3
    @ThomasEding: You are missing something; that's not how iterators work. The `ApplyIterator` method returns *immediately*. No code in the body of `ApplyIterator` is run until the first call to `MoveNext` on the returned object's enumerator. Now that you know that, you can deduce the answer to this puzzle: http://blogs.msdn.com/b/ericlippert/archive/2007/09/05/psychic-debugging-part-one.aspx The answer is here: http://blogs.msdn.com/b/ericlippert/archive/2007/09/06/psychic-debugging-part-two.aspx – Eric Lippert Aug 20 '14 at 22:13
  • @ThomasEding: Again, the questions that should be asked here regarding purity are questions like: *can I safely memoize this method?* and *can I safely call this method on multiple threads*? and so on. Clearly you *can* call `Apply` safely on multiple threads, clearly you *can* memoize the result; it's a pure function. `Apply(...).GetEnumerator().MoveNext()` is not a pure function; it mutates the enumerator. – Eric Lippert Aug 20 '14 at 22:17
  • 2
    It is pure, in a very similar way to how IO in Haskell is pure: it's a pure function that produces a sequence of actions, aka an imperative program. The difference here is that in Haskell the sequence of actions can only be executed by the runtime, whereas here you can execute it as many times as you like. So Eric is right that it is a pure function, but it is also an *unsafe* function and should therefore be avoided – jk. Aug 21 '14 at 06:17
  • 2
    @jk: Indeed -- that is the thrust of my final paragraph. Arguing about whether this method is pure or not misses the point; don't write this method in the first place. It creates a sequence which produces both side effects and values every time it is enumerated, and that's unexpected. – Eric Lippert Aug 21 '14 at 13:08
  • There’s no reason to assume that GetEnumerator is a pure function, which means that the method may not be pure. For instance, the first call could result in an iterator that was first in last out, and a subsequent call could be last in first out, ordered by a particular property, change which property is being used to order the results, etc. More commonly, there’s no guarantee that a subsequent call is allowed; an IEnumerable may throw on successive calls to GetEnumerator. – jmoreno Feb 22 '21 at 04:15
15

No, it is not pure, because it has a side effect. Concretely, it is calling action on each item. Also, it is not thread-safe.

The major property of a pure function is that it can be called any number of times and never does anything other than return the same value. That is not the case here. Also, being pure means you don't use anything other than the input parameters, so the function can be called from any thread at any time without causing unexpected behavior. Again, that is not the case for your function.

Also, you might be mistaken about one thing: function purity is not a question of pros and cons. Even a single doubt that it could have a side effect is enough to make it not pure.

Eric Lippert raises a good point. I'm going to use http://msdn.microsoft.com/en-us/library/dd264808(v=vs.110).aspx as part of my counter-argument, especially the line

A pure method is allowed to modify objects that have been created after entry into the pure method.

Let's say we create a method like this:

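int Count<T>(IEnumerable<T> e)
{
    // The enumerator is created inside this method, so mutating it is
    // permitted by the rule quoted above; "using" disposes it afterwards.
    using (var enumerator = e.GetEnumerator())
    {
        int count = 0;
        while (enumerator.MoveNext()) count++;
        return count;
    }
}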

First, this assumes that GetEnumerator is pure too (I can't really find any source on that). If it is, then according to the above rule we can annotate this method with [Pure], because it only modifies an instance that was created within the body itself. After that we can compose this with ApplyIterator, which should result in a pure function, right?

Count(ApplyIterator(source, action));

No. This composition is not pure, even when both Count and ApplyIterator are pure. But I might be building this argument on a wrong premise. I think the idea that instances created within the method are exempt from the purity rule is either wrong or at least not specific enough.
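A minimal sketch of that composition, for illustration only. The `Counter` class and the `CompositionDemo` wrapper are hypothetical names, and `ApplyIterator` is copied from the question without the `this` modifier.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable item type, used only to make the side effect visible.
public sealed class Counter
{
    public int Value;
}

public static class CompositionDemo
{
    public static IEnumerable<T> ApplyIterator<T>(IEnumerable<T> source, Action<T> action)
        where T : class
    {
        foreach (var item in source)
        {
            action(item);
            yield return item;
        }
    }

    // Only mutates the enumerator it creates itself.
    public static int Count<T>(IEnumerable<T> e)
    {
        using (var enumerator = e.GetEnumerator())
        {
            int count = 0;
            while (enumerator.MoveNext()) count++;
            return count;
        }
    }

    public static void Main()
    {
        var items = new List<Counter> { new Counter(), new Counter() };
        var applied = ApplyIterator(items, c => c.Value++);

        Console.WriteLine(Count(applied));  // 2
        Console.WriteLine(items[0].Value);  // 1: counting mutated the items
        Console.WriteLine(Count(applied));  // 2
        Console.WriteLine(items[0].Value);  // 2: and did so again
    }
}
```

The return value of `Count` is stable, but each call changes state that existed before the call, so the composed call cannot be treated as pure.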

Euphoric
  • 1
    +1 function purity is not a question of pros or cons. Function purity is a hint on usage and safety. Strangely enough the OP put in `where T : class`; however, if the OP simply put `where T : struct` it WOULD be pure. – ArTs May 27 '14 at 04:00
  • 4
    I disagree with this answer. Calling `sequence.Apply(action)` has no side effect; if it does, state the side effect that it has. Now, calling `sequence.Apply(action).GetEnumerator().MoveNext()` has a side effect, but we already knew that; it mutates the enumerator! Why should `sequence.Apply(action)` be considered to be impure because calling `MoveNext` is impure, but `sequence.Where(predicate)` be considered pure? `sequence.Where(predicate).GetEnumerator().MoveNext()` is every bit as impure. – Eric Lippert Jun 06 '14 at 16:29
  • @EricLippert You raise a good point. But, wouldn't it be enough to just call GetEnumerator? Can we consider that Pure? – Euphoric Jun 06 '14 at 17:29
  • @Euphoric: What observable side effect does calling `GetEnumerator` produce, aside from allocating an enumerator in its initial state? – Eric Lippert Jun 06 '14 at 18:50
  • Re: your update: `Count` is pure only if `e.GetEnumerator().MoveNext()` produces no visible side effect. But when you compose `Count` with `ApplyIterator`, we know that `items.ApplyIterator(action).GetEnumerator().MoveNext()` calls `action`, and therefore (unless the action is the do-nothing action) `Count` is not pure. – Eric Lippert Jun 06 '14 at 18:54
  • 1
    @EricLippert Then why is it that Enumerable.Count is considered Pure by .NET's Code Contracts? I don't have a link, but when I play with it in Visual Studio, I get a warning when I use a custom non-pure count, but the contract works just fine with Enumerable.Count. – Euphoric Jun 06 '14 at 21:04
  • @Euphoric: How should I know? I didn't annotate that method. Ask the person who did what their justification is. – Eric Lippert Aug 20 '14 at 20:12
  • 1
    The reason you couldn’t find anything on the purity of IEnumerable.GetEnumerator is that there is no guarantee of it, for two reasons. One, there are actually two interfaces involved: GetEnumerator returns an IEnumerator that is only loosely coupled with the IEnumerable, so the IEnumerable can’t prescribe what the IEnumerator does. Two, not only can the IEnumerable make no promises on behalf of the IEnumerator, it also makes no promises about successive calls to GetEnumerator. It would be possible to return **different** IEnumerators for each call to it. – jmoreno Feb 22 '21 at 03:58
3

It's not a pure function, so applying the Pure attribute is misleading.

Pure functions do not modify the original collection, and it doesn't matter whether you're passing an action that has no effect or not; it's still an impure function because its intent is to cause side effects.

If you want to make the function pure, copy the collection to a new collection, apply the changes that the Action takes to the new collection, and return the new collection, leaving the original collection unchanged.

Robert Harvey
  • Well, it doesn't modify the original collection, since it just returns a new sequence with the same items; this is why I was considering making it pure. But it might change the state of the items when you enumerate the result. – Thomas Levesque May 26 '14 at 19:41
  • If `item` is a reference type, it's modifying the original collection, even though you are returning `item` in an iterator. See http://stackoverflow.com/questions/1538301 – Robert Harvey May 26 '14 at 19:43
  • 1
    Even if he deep-copied the collection it still wouldn't be pure, as `action` may have side effects other than modifying the item passed to it. – Idan Arye May 26 '14 at 21:16
  • @IdanArye: True, the Action would also have to be pure. – Robert Harvey May 26 '14 at 22:08
  • But `Action` can't be pure since it returns nothing. – Idan Arye May 26 '14 at 23:06
  • @IdanArye: Right, it would have to be a `Func` – Robert Harvey May 26 '14 at 23:27
  • 1
    @IdanArye: `()=>{}` is convertible to Action, and it's a pure function. Its outputs depend solely on its inputs and it has no observable side effects. – Eric Lippert Jun 06 '14 at 16:21
  • @EricLippert One could also claim that it's not a function, since it doesn't have a return value. – Idan Arye Jun 06 '14 at 20:03
  • @IdanArye: It does return a value, `void` (aka the unit type and its corresponding unit value). You just cannot syntactically write it per se, but it's there, semantically. – Thomas Eding Aug 20 '14 at 19:34
0

In my opinion, the fact that it receives an Action (and not something like PureAction) makes it not pure.

And I even disagree with Eric Lippert. He wrote this: "()=>{} is convertible to Action, and it's a pure function. Its outputs depend solely on its inputs and it has no observable side effects".

Well, imagine that instead of using a delegate the ApplyIterator was invoking a method named Action.

If Action is pure, then the ApplyIterator is pure too. If Action is not pure, then the ApplyIterator can't be pure.

Considering the delegate type (not the actual given value), we don't have the guarantee that it will be pure, so the method will behave as a pure method only when the delegate is pure. So, to make it really pure, it should receive a pure delegate (and that exists, we can declare a delegate as [Pure], so we can have a PureAction).

Explaining it differently, a Pure method should always give the same result given the same inputs and should not generate observable changes. ApplyIterator may be given the same source and delegate twice but, if the delegate is changing a reference-type, the next execution will give different results. Example: The delegate does something like item.Content += " Changed";

So, using the ApplyIterator over a list of "string containers" (an object with a Content property of type string), we may have these original values:

Test

Test2

After the first execution, the list will have this:

Test Changed

Test2 Changed

And this after the second execution:

Test Changed Changed

Test2 Changed Changed

So, we are changing the contents of the list because the delegate is not pure and no optimization can be done to avoid executing the call 3 times if invoked 3 times, as each execution will generate a different result.
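The example above can be reproduced with a short sketch. `StringContainer` and `RepeatedRunDemo` are hypothetical names chosen for illustration, and `ApplyIterator` is copied from the question without the `this` modifier.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical "string container": an object with a Content field of type string.
public sealed class StringContainer
{
    public string Content;
}

public static class RepeatedRunDemo
{
    public static IEnumerable<T> ApplyIterator<T>(IEnumerable<T> source, Action<T> action)
        where T : class
    {
        foreach (var item in source)
        {
            action(item);
            yield return item;
        }
    }

    public static void Main()
    {
        var list = new List<StringContainer>
        {
            new StringContainer { Content = "Test" },
            new StringContainer { Content = "Test2" }
        };

        // Same source and same delegate for every run.
        var applied = ApplyIterator(list, item => item.Content += " Changed");

        for (var run = 1; run <= 3; run++)
        {
            foreach (var item in applied) { } // enumerate fully, discard the items
            Console.WriteLine("{0}: {1} | {2}", run, list[0].Content, list[1].Content);
        }
    }
}
```

Each full enumeration appends " Changed" once more to every item, so no run can be skipped or replaced by a cached result without changing the outcome.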

Paulo Zemek
0

Apply is potentially-pure. Its purity is dependent on the purity of the Action passed to it (and actually also on the IEnumerable<T> passed in, because you can make MoveNext, for example, produce a side effect).

For example, if the passed Action is x => {}, then the function is pure; if it is x => Console.WriteLine(x), then the function is impure.
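The parenthetical about the source sequence can be illustrated with a sketch as well. `CountingSource` is a hypothetical class invented here; it records how often enumeration starts, which stands in for any side effect a source might have. `ApplyIterator` is copied from the question without the `this` modifier.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical source whose GetEnumerator has an observable side effect.
public sealed class CountingSource : IEnumerable<string>
{
    public int EnumerationsStarted;

    public IEnumerator<string> GetEnumerator()
    {
        EnumerationsStarted++; // side effect that no action delegate is involved in
        return new List<string> { "a", "b" }.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class ImpureSourceDemo
{
    public static IEnumerable<T> ApplyIterator<T>(IEnumerable<T> source, Action<T> action)
        where T : class
    {
        foreach (var item in source)
        {
            action(item);
            yield return item;
        }
    }

    public static void Main()
    {
        var source = new CountingSource();

        // The action does nothing, so any state change comes from the source.
        var applied = ApplyIterator(source, x => { });
        Console.WriteLine(source.EnumerationsStarted); // 0: nothing enumerated yet

        foreach (var item in applied) { }
        Console.WriteLine(source.EnumerationsStarted); // 1: enumeration touched the source
    }
}
```

Even with a no-op action, enumerating the result changes observable state when the source itself is impure, while the call that builds the sequence still changes nothing.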

Yacoub Massad
  • The method can be impure even if the action is `x => {}`. – jmoreno Feb 22 '21 at 06:02
  • @jmoreno, by having an impure implementation of `IEnumerable` passed as an argument. But other than that, how? – Yacoub Massad Feb 22 '21 at 13:19
  • IEnumerable.GetEnumerator isn't necessarily pure; it doesn't necessarily return the **same** enumerator each time. Yes, this is being pedantic, but with stuff like pure that is what you need/want. Otherwise you could say any method is pure, if it has a code path that doesn't modify anything. It's not useful if it's not comprehensive. – jmoreno Feb 22 '21 at 16:36
  • @jmoreno, yes. If you pass an `IEnumerable` that is not pure (e.g. source.GetEnumerator().MoveNext is not pure), then the method is not pure. Therefore, Apply is potentially-pure. I.e., it has a pure body but uses abstract parameters that might be pure or impure. This potentially-pure concept is very helpful. If you compose 100 functions/objects that are potentially-pure, you can then use them in a test to inject pure (fake) functions to them, or you can use them in production with impure functions. – Yacoub Massad Feb 22 '21 at 17:05
  • @jmoreno, not any method is pure. A method that calls an impure method in its body is not pure. – Yacoub Massad Feb 22 '21 at 17:06