111

I'm developing a library intended for public release. It contains various methods for operating on sets of objects - generating, inspecting, partitioning and projecting the sets into new forms. In case it's relevant, it's a C# class library containing LINQ-style extensions on IEnumerable, to be released as a NuGet package.

Some of the methods in this library can be given unsatisfiable input parameters. For example, in the combinatoric methods, there is a method to generate all sets of n items that can be constructed from a source set of m items. For example, given the set:

1, 2, 3, 4, 5

and asking for combinations of 2 would produce:

1, 2
1, 3
1, 4
etc...
5, 3
5, 4

Now, it's obviously possible to ask for something that can't be done, like giving it a set of 3 items and then asking for combinations of 4 items while setting the option that says it can only use each item once.

In this scenario, each parameter is individually valid:

  • The source collection is not null, and does contain items
  • The requested size of combinations is a positive nonzero integer
  • The requested mode (use each item only once) is a valid choice

However, the state of the parameters when taken together causes problems.

In this scenario, would you expect the method to throw an exception (eg. InvalidOperationException), or return an empty collection? Either seems valid to me:

  • You can't produce combinations of n items from a set of m items where n > m if you're only allowed to use each item once, so this operation can be deemed impossible, hence InvalidOperationException.
  • The set of combinations of size n that can be produced from m items when n > m is an empty set; no combinations can be produced.

The argument for an empty set

My first concern is that an exception prevents idiomatic LINQ-style chaining of methods when you're dealing with datasets that may have unknown size. In other words, you might want to do something like this:

var result = someInputSet
    .CombinationsOf(4, CombinationsGenerationMode.Distinct)
    .Select(combo => /* do some operation to a combination */)
    .ToList();

If your input set is of variable size, this code's behaviour is unpredictable. If .CombinationsOf() throws an exception when someInputSet has fewer than 4 elements, then this code will sometimes fail at runtime without some pre-checking. In the above example this checking is trivial, but if you're calling it halfway down a longer chain of LINQ then this might get tedious. If it returns an empty set, then result will be empty, which you may be perfectly happy with.

The argument for an exception

My second concern is that returning an empty set might hide problems - if you're calling this method halfway down a chain of LINQ and it quietly returns an empty set, then you may run into issues some steps later, or find yourself with an empty result set, and it may not be obvious how that happened given that you definitely had something in the input set.

What would you expect, and what's your argument for it?

ErikE
  • 1,151
  • 1
  • 8
  • 18
anaximander
  • 2,285
  • 3
  • 19
  • 21
  • Is returning a `Nullable` a viable solution (given LINQ)? – Matthieu M. Nov 30 '16 at 14:58
  • 66
    Since the empty set is mathematically correct, chances are that when you get it it actually is what you want. Mathematical definitions and conventions are generally chosen for consistency and convenience so that things just work out with them. – asmeurer Nov 30 '16 at 17:00
  • 5
    @asmeurer They're chosen so that *theorems* are consistent and convenient. They're not chosen to make programming easier. (That is sometimes a side benefit, but sometimes they make programming harder, too.) – jpmc26 Nov 30 '16 at 17:16
  • 1
    "If your input set is of variable size, this code's behaviour is unpredictable." And if you don't, the code that *uses* `result` may be unpredictable. – jpmc26 Nov 30 '16 at 17:18
  • What do you do when `m % n != 0` (and they have to be distinct elements), i.e. you can't cleanly distribute the input across the output ranges? – Xeo Nov 30 '16 at 20:09
  • @Xeo This method doesn't partition the input set, it produces combinations. So, asking for combinations of 2 out of ` { 1, 2, 3 }` gives `{ { 1, 2 }, { 1, 3 }, { 2, 1 }, { 2, 3 }, { 3, 1 }, { 3, 2 } }`. (There is a seperate method that partitions in the way you're thinking; it returns an partitioned enumerable type which has a `Remainder` property.) – anaximander Nov 30 '16 at 20:28
  • 7
    @jpmc26 "They're chosen so that theorems are consistent and convenient" - making sure that your program always works as expected is essentially equivalent to proving a theorem. – artem Nov 30 '16 at 20:58
  • @artem No, it isn't. A very pure form of functional programming is as close to you can get as that, but then there might still be flaws in the interpreter or compiler that make your assumptions false. Testing a program is much more akin to obtaining a statistical *confidence level* than it is to proving theorems, and it isn't even statistically rigorous in practice. And even with functional code, *very, very* few people are actually paid to do the math and construct a formal proof of correctness. (I dare say it's most likely just plain not done outside of research.) – jpmc26 Nov 30 '16 at 21:41
  • 2
    @jpmc26 I don't get why you mentioned functional programming. Proving correctness for imperative programs is quite possible, always advantageous, and can be done informally as well - just think a little bit and use common mathematical sense when you write your program, and you will spend less time testing. Statistically proven on sample of one ;-) – artem Nov 30 '16 at 22:22
  • "If .CombinationsOf() throws an exception when someInputSet has fewer than 4 elements, then this code will sometimes fail at runtime without some pre-checking. In the above example this checking is trivial, but if you're calling it halfway down a longer chain of LINQ then this might get tedious." But I assume you'll still throw an exception anyway, if say you're given null for the enumerable to use, or in the other cases of individual parameters being "invalid." So I don't really think an exception due to the combination of parameter values is ruled out by that logic. – Andy Dec 01 '16 at 02:25
  • You state the combinatoric methods as an example of the methods in question. In disregard of my answer posted, I am convinced now that returning an empty set makes sense for the combinatoric methods. But how does it look like for the other methods you have in your library? – sbecker Dec 01 '16 at 06:40
  • 1
    @Andy Looking at the existing LINQ methods, there's a whole bunch of ways that a collection could become empty partway down a chain, but there are much fewer ways you can end up operating on `null` without explicitly allowing it. – anaximander Dec 01 '16 at 09:14
  • 1
    @asmeurer Infinity is also mathematically correct, yet many programming languages will throw an exception if you divide by zero. This is one of the cases where the principle of least astonishment takes over mathematical correctness. – Dmitry Grigoryev Dec 01 '16 at 10:54
  • 5
    @DmitryGrigoryev 1/0 as undefined is much more mathematically correct than 1/0 as infinity. – asmeurer Dec 01 '16 at 17:16
  • @anaximander I wasn't saying to return null; I was stating that there are times Linq extension methods will throw based on the parameters they are given (for example, if the enumerable is null). I was just pointing out that you're already planning on throwing from your extension method, so your consumers are already expecting the possibility of an exception, and throwing an unexpected exception "halfway down a longer call chain of LINQ" isn't something to be considered. – Andy Dec 01 '16 at 22:59
  • If I were developing a program that used your library, and that program took user input, I'd rather it return an empty set to spare me the needless exception handling. – Seiyria Dec 03 '16 at 00:21
  • If you are worried that an "empty set might hide problems", could add a debug/logging mode to your library that would perhaps log such potential _warnings_? – MrWhite Dec 03 '16 at 12:07
  • 1
    If the request is unsatisfiable *because of the content of the data*, then I would return the empty set. But if the request would be unsatisfiable *no matter what the data set was*, then I would raise an exception. That said, this tends to be largely a matter of culture and prevailing style for the development language/framework. – RBarryYoung Dec 04 '16 at 02:51
  • This question reminds me of a certain Entity Framework design decision. If you have a one-to-many foreign key navigation property on an entity (e.g. `public virtual IEnumerable Children`), and there are zero children present in your database that match the given foreign key, then performing a "get" on `.Children` returns, not an empty IEnumerable as one would expect, but a **null** IEnumerable... – JustAskin Dec 05 '16 at 04:48

13 Answers13

144

Return an Empty Set

I would expect an empty set because:

There are 0 combinations of 4 numbers from the set of 3 when i can only use each number once

Ewan
  • 70,664
  • 5
  • 76
  • 161
  • 5
    Mathematically, yes, but it's also very likely a source of error. If this type of input is expected, it might be better to require that the user catch the exception in a general purpose library. – Casey Kuball Nov 30 '16 at 16:24
  • 57
    I disagree that it's a "very likely" cause of error. Suppose, for example, you were implementing a naive "matchmaknig" algorithm from a largish input set. You might ask for all combinations of two items, find the single "best" match among them, and then remove both elements and start over with the new, smaller set. Eventually your set will be empty, and there will be no pairs left to consider: an exception at this point is obstructive. I think there are a lot of ways to end up in a similar situation. – amalloy Nov 30 '16 at 17:09
  • @amalloy However if you ended up with one item in your matchmaking set you'd want an exception. – user253751 Dec 01 '16 at 04:28
  • 11
    Actually you do not know if this is an error in the eyes of the user. The empty set is something the user should check for, if necessary. if(result.Any()) DoSomething(result.First()); else DoSomethingElse(); is much better than try{result.first().dosomething();}catch{DoSomethingElse();} – Guran Dec 01 '16 at 09:25
  • 1
    For a library, I'd go with not throwing an exception for this type of "edge case", since you (the library writer) can't know whether it "really matters" to the user: if they do, they should be able to wrap the call in a check-and-throw function. (Alternatively, you could provide both a "normal" and a check-and-throw version for selected methods yourself). – TripeHound Dec 01 '16 at 12:01
  • 4
    @TripeHound That's what tipped the decision in the end: requiring the developer using this method to check, and then throw, has a much smaller impact than throwing an exception that they don't want, in terms of development effort, program performance, and simplicity of program flow. – anaximander Dec 01 '16 at 14:12
  • 1
    I like this result because the less exceptions I have to deal with from lower level libraries the better. As people have mentioned the question "how many combinations of 3 numbers with no repeats" is perfectly valid. Now once the calling library gets this empty set it can decide if that's valid or not and throw its own exception if need be. – Captain Man Dec 01 '16 at 16:18
  • 8
    @Darthfett Try comparing this to an existing LINQ extension method: Where(). If I have a clause: Where(x => 1 == 2) I don't get an exception, I get an empty set. – Necoras Dec 01 '16 at 18:25
  • @Guran See my answer for a potentially better way to check than `Any()` followed by `DoSomething`, because doing it that way could begin enumeration on an enumerable twice (where the extension method in my answer caches the first item and re-emits it without restarting enumeration). – ErikE Dec 03 '16 at 01:34
  • Not saying I disagree with this answer, but it is purely opinion based and cites nothing in support. Many of the lower rated answers cite other practice. – user949300 Dec 03 '16 at 04:33
  • totally agree. But essentially the question is an opinion poll. If you write this library you need to know how users will **expect** it to work. I deliberately kept it short because it doesnt matter **why** you agree as long as you are getting what you expect. If i added my reasoning the vote would be split – Ewan Dec 03 '16 at 09:46
  • @ErikE Yeah, that's definitely better. (Mine was merely an example of how an end user might use a component, not intended as best practice :) ) – Guran Dec 05 '16 at 08:26
81

When in doubt, ask someone else.

Your example function has a very similar one in Python: itertools.combinations. Let's see how it works:

>>> import itertools
>>> input = [1, 2, 3, 4, 5]
>>> list(itertools.combinations(input, 2))
[(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)]
>>> list(itertools.combinations(input, 5))
[(1, 2, 3, 4, 5)]
>>> list(itertools.combinations(input, 6))
[]

And it feels perfectly fine to me. I was expecting a result that I could iterate over and I got one.

But, obviously, if you were to ask something stupid:

>>> list(itertools.combinations(input, -1))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: r must be non-negative

So I'd say, if all your parameters validate but the result is an empty set return an empty set, you’re not the only one doing it.


As said by @Bakuriu in the comments, this is also the same for an SQL query like SELECT <columns> FROM <table> WHERE <conditions>. As long as <columns>, <table>, <conditions> are well formed formed and refer to existing names, you can build a set of conditions that exclude each other. The resulting query would just yield no rows instead of throwing an InvalidConditionsError.

  • 4
    Downvoted because it's not idiomatic in the C#/Linq language space (see sbecker's answer for how similar out of bounds problems are handled in the language). If this were a Python question it'd be a +1 from me though. – James Snell Nov 30 '16 at 14:15
  • 33
    @JamesSnell I barely see how it relates to the out-of-bounds problem. We’re not picking elements by indexes to reorder them, we’re picking `n` elements in a collection to count the number of way we can pick them. If at some point there is no elements left while picking them, then there is no (0) way of picking `n` elements from said collection. – 301_Moved_Permanently Nov 30 '16 at 14:27
  • 1
    Still, Python is not a good oracle for C# questions: for example C# would allow `1.0 / 0`, Python wouldn't. – Dmitry Grigoryev Dec 01 '16 at 11:02
  • 2
    @MathiasEttinger Python's guidelines for how and when to use exceptions are very different to C#'s, so using behaviour from Python modules as a guideline is not a good metric for C#. – Ant P Dec 01 '16 at 13:26
  • 2
    Instead of picking Python we could just pick SQL. Does a *well-formed* `SELECT` query over a table produce an exception when the result set is empty? (spoiler: no!). The "well-formedness" of a query depends only on the query itself and some *metadata* (e.g. does the table exists and have this field (with this type)?) Once the query is well-formed and you execute it you expect either a (possibly empty) valid result or an exception because the "server" had a problem (e.g. db server went down or something). – Bakuriu Dec 05 '16 at 08:56
71

In layman's terms:

  • If there is an error, you should raise an exception. That may involve doing things in steps instead of in a single chained call in order to know exactly where the error happened.
  • If there is no error but the resulting set is empty, don't raise an exception, return the empty set. An empty set is a valid set.
Tulains Córdova
  • 39,201
  • 12
  • 97
  • 154
  • 23
    +1, 0 is a perfectly valid, and indeed extremely important number. There are so many cases where a query might legitimately return zero hits that raising an exception would be highly annoying. – Kilian Foth Nov 30 '16 at 13:07
  • 19
    good advice, but is your answer clear? – Ewan Nov 30 '16 at 13:51
  • @Ewan My advice is not to hide problems and if there are no problems, return the empty set, because an empty set is a valid set. OP assumes returning an empty set equals hiding problems, but that is because he is not raising exceptions when he should. – Tulains Córdova Nov 30 '16 at 13:57
  • 54
    I'm still none the wiser as to if this answer suggests the OP should `throw` or return an empty set... – James Snell Nov 30 '16 at 14:19
  • 7
    Your answer says I could do either, depending on whether there is an error, but you don't mention whether you would consider the example scenario to be an error. In the comment above you say I should return an empty set "if there are no problems", but again, unsatisfiable conditions could be considered a problem, and you go on to say that I am not raising exceptions when I should. So which should I do? – anaximander Nov 30 '16 at 14:21
  • @Ewan Yes, the answer is very clear. Why do you think otherwise? – BЈовић Nov 30 '16 at 14:25
  • well because you could return an empty set for an input of null or -1 as well – Ewan Nov 30 '16 at 14:30
  • @anaximander You should break the "idiomatic LINQ-style chaining of methods" and do the thing step by step in order to throw en exception in exactly the place where the error occurs. If no error happens and yet you don't find any data to populate the set, don't throw an exception just because the set is empty, but return it. – Tulains Córdova Nov 30 '16 at 14:58
  • 3
    Ah, I see - you may have misunderstood; I'm not chaining anything, I'm writing a method that I expect to be used in a chain. I'm wondering whether the hypothetical future developer who uses this method would want it to throw in the above scenario. – anaximander Nov 30 '16 at 16:03
  • 3
    @anaximander : Sounds like you believe you want two methods: `CombinationsOf()` and `CombinationsOfChecked()`. It should be clear which one throws and which one doesn't. – Eric Towers Nov 30 '16 at 17:19
  • @anaximander I think the answer still applies. – Tulains Córdova Nov 30 '16 at 17:23
  • 1
    If an empty set indicates a bug or would cause one down the chain, you should probably have the **caller** check the result and throw the exception. However, playing Devil’s advocate, there might be an application where mathematical precision is misleading and throwing the exception would simplify the control flow. – Davislor Nov 30 '16 at 21:59
53

I agree with Ewan's answer but want to add a specific reasoning.

You are dealing with mathematical operations, so it might be a good advice to stick with the same mathematical definitions. From a mathematical standpoint the number of r-sets of an n-set (i.e. nCr) is well defined for all r > n >= 0. It is zero. Therefore returning an empty set would be the expected case from a mathematical standpoint.

sigy
  • 716
  • 4
  • 8
  • 10
    This is a good point. If the library were higher level, like picking combinations of colors to make a palette -- then it makes sense to throw an error. Because you *know* a palette with no colors isn't a palette. But a set with no entries is still a set and the math does define it as equaling an empty set. – Captain Man Dec 01 '16 at 16:23
  • 1
    Much better than Ewans answer cause it cites mathematical practice. +1 – user949300 Dec 03 '16 at 04:35
24

I find a good way of determining whether to use an exception, is to imagine people being involved in the transaction.

Taking fetching the contents of a file as an example:

  1. Please fetch me the contents of file, "doesn't exist.txt"

    a. "Here's the contents: an empty collection of characters"

    b. "Erm, there's a problem, that file doesn't exist. I don't know what to do!"

  2. Please fetch me the contents of file, "exists but is empty.txt"

    a. "Here's the contents: an empty collection of characters"

    b. "Erm, there's a problem, there's nothing in this file. I don't know what to do!"

No doubt some will disagree, but to most folk, "Erm, there's a problem" makes sense when the file doesn't exist and returning "an empty collection of characters" when the file is empty.

So applying the same approach to your example:

  1. Please give me all all combinations of 4 items for {1, 2, 3}

    a. There aren't any, here's an empty set.

    b. There's a problem, I don't know what to do.

Again, "There's a problem" would make sense if eg null were offered as the set of items, but "here's an empty set" seems a sensible response to the above request.

If returning an empty value masks a problem (eg a missing file, a null), then an exception generally should be used instead (unless your chosen language supports option/maybe types, then they sometimes make more sense). Otherwise, returning an empty value will likely simplify the cost and complies better with the principle of least astonishment.

David Arno
  • 38,972
  • 9
  • 88
  • 121
  • 4
    This advice is good, but does not apply to this case. A better example: How many days after the 30:th of january?: 1 How many days after the 30:th of february?: 0 How many days after the 30:th of madeupuary?: Exception How many working hours on the 30:th of february: Exception – Guran Dec 01 '16 at 09:35
10

As it's for a general purpose library my instinct would be Let the end user choose.

Much like we have Parse() and TryParse() available to us we can have the option of which we use depending on what output we need from the function. You'd spend less time writing and maintaining a function wrapper to throw the exception than arguing over choosing a single version of the function.

James Snell
  • 3,168
  • 14
  • 17
  • 3
    +1 because I always like the idea of choice and because this pattern has a tendency to encourage programmers to validate their own input. The mere existence of a TryFoo function makes it obvious that certain input combinations can cause problems and if the documentation explains what those potential problems are, the coder can check for them and handle invalid inputs in a cleaner fashion than they could by just catching an exception or responding to an empty set. Or they can be lazy. Either way, the decision is theirs. – aleppke Nov 30 '16 at 15:45
  • 20
    No, an empty set is the mathematically correct answer. Doing something else is redefining firmly agreed-upon mathematics, which shouldn't be done on a whim. While we're at it, shall we redefine the exponentiation function to throw an error if the exponent is equal to zero, because we decide we don't like the convention that any number raised to the "zeroth" power is equal to 1? – Wildcard Nov 30 '16 at 17:51
  • @Wildcard there is a difference between "redefining" functions and just adding alternatives that behave differently. – Amani Kilumanga Dec 01 '16 at 08:59
  • 1
    I was about to answer something similar. The Linq extension methods `Single()` and `SingleOrDefault()` immediately sprung to mind. Single throws an exception if there is zero or > 1 result, while SingleOrDefault won't throw, and instead returns `default(T)`. Perhaps OP could use `CombinationsOf()` and `CombinationsOfOrThrow()`. – RubberDuck Dec 01 '16 at 10:59
  • @Wildcard - I'm not sold on that argument, though my point is more for the time spent arguing about it here and presumably in the OP's office, just write both and have done with it... it could equally be written with an optional `bool` validate in the method signature or a handful of ways, but the point is it's not worth the energy invested. – James Snell Dec 01 '16 at 19:55
  • See my previous comment. [Devil's advocate](https://en.wikipedia.org/wiki/Devil%27s_advocate) here, but try it out: *I* don't like that exponentiation defaults to 1 if the provided exponent is 0. I think it should throw an error, and we should write our exponentiation function that way. Shall we just write both `Pow()` and `TryPow()` and have done with it because it's not worth the energy invested to argue the point? – Wildcard Dec 01 '16 at 21:26
  • 1
    @Wildcard - you're talking about two different results to the same input which is not the same thing. The OP is only interested in choosing a) the result or b) throwing an exception which is *not* a result. – James Snell Dec 01 '16 at 23:08
  • This is true for queries that can (and do) fail under certain circumstances. There is no "natural" and generic fallback-behavior for parsing, especially in strongly typed languages. But returning an empty list if a list is expected is quite natural. Consider the query "What even primes greater than two exist?" The natural answer would be "none", which is equivalent to an empty list. Hence the distinction is not at all necessary, but noisy, less clean and contradicts the principle of least astonishment. – Paul Kertscher Dec 02 '16 at 11:52
  • 4
    `Single` and `SingleOrDefault` (or `First`and `FirstOrDefault`) are a completely different story, @RubberDuck. Picking up my example from my previous comment. It's perfectly fine to anser _none_ to the query _"What even primes greater than two exist?"_, since this is a mathematically sound and sensible answer. Anyway, if you are asking _"What is the first even prime greater than two?"_ there is no natural answer. You just cannot answer the question, because you are asking for a single number. It's not _none_ (it's not a single number, but a set), not 0. Hence we throw an exception. – Paul Kertscher Dec 02 '16 at 11:58
  • Your last comment is a very persuading argument @PaulK – RubberDuck Dec 02 '16 at 12:25
4

You need to validate the arguments provided when your function is called. And as a matter of fact, you want to know how to handle invalid arguments. The fact that multiple arguments depend on each other, doesn't make up for the fact that you validate the arguments.

Thus I would vote for the ArgumentException providing the necessary information for the user to understand what went wrong.

As an example, check the public static TSource ElementAt<TSource>(this IEnumerable<TSource>, Int32) function in Linq. Which throws an ArgumentOutOfRangeException if the index is less than 0 or greater than or equal to the number of elements in source. Thus the index is validated in regards to the enumerable provided by the caller.

sbecker
  • 189
  • 5
  • 3
    +1 for citing an example of a similar situation in the language. – James Snell Nov 30 '16 at 14:21
  • 13
    Oddly, your example got me thinking, and led me to something that I think makes a strong counter-example. As I noted, this method is designed to be LINQ-like, so that users can chain it together with other LINQ methods. If you do `new[] { 1, 2, 3 }.Skip(4).ToList();` you get an empty set, which makes me think that returning an empty set is perhaps the better option here. – anaximander Nov 30 '16 at 14:30
  • An interesting point. In which case it looks like both options would appear to be linguistically acceptable. – James Snell Nov 30 '16 at 14:40
  • 15
    This is not a language semantics issue, the semantics of the problem domain already provides the correct answer, that is to return the empty set. Doing anything else violates the principle of least astonishment for someone working in the combinatoric problem domain. – Ukko Nov 30 '16 at 15:48
  • 1
    I'm starting to understand the arguments for returning an empty set. Why it would make sense. At least for this particular example. – sbecker Dec 01 '16 at 06:36
  • 2
    @sbecker, to bring the point home: If the question were "How *many* combinations are there of ..." then "zero" would be a completely valid answer. By the same token, "What *are* the combinations such that ..." has the empty set as a completely valid answer. If the question were, "What is the *first* combination such that ...", then and only then would an exception be appropriate, since the question is not answerable. See also [Paul K's comment](http://softwareengineering.stackexchange.com/questions/337186/#comment721820_337201). – Wildcard Dec 02 '16 at 18:34
3

You should do one of the following (though continuing to consistently throw on basic problems such as a negative number of combinations):

  1. Provide two implementations, one that returns an empty set when the inputs together are nonsensical, and one that throws. Try calling them CombinationsOf and CombinationsOfWithInputCheck. Or whatever you like. You can reverse this so the input-checking one is the shorter name and the list one is CombinationsOfAllowInconsistentParameters.

  2. For Linq methods, return the empty IEnumerable on the exact premise you've outlined. Then, add these Linq methods to your library:

    public static class EnumerableExtensions {
       public static IEnumerable<T> ThrowIfEmpty<T>(this IEnumerable<T> input) {
          return input.IfEmpty<T>(() => {
             throw new InvalidOperationException("An enumerable was unexpectedly empty");
          });
       }
    
       public static IEnumerable<T> IfEmpty<T>(
          this IEnumerable<T> input,
          Action callbackIfEmpty
       ) {
          var enumerator = input.GetEnumerator();
          if (!enumerator.MoveNext()) {
             // Safe because if this throws, we'll never run the return statement below
             callbackIfEmpty();
          }
          return EnumeratePrimedEnumerator(enumerator);
       }
    
       private static IEnumerable<T> EnumeratePrimedEnumerator<T>(
          IEnumerator<T> primedEnumerator
       ) {
          yield return primedEnumerator.Current;
          while (primedEnumerator.MoveNext()) {
             yield return primedEnumerator.Current;
          }
       }
    }
    

    Finally, use that like so:

    var result = someInputSet
       .CombinationsOf(4, CombinationsGenerationMode.Distinct)
       .ThrowIfEmpty()
       .Select(combo => /* do some operation to a combination */)
       .ToList();
    

    or like this:

    var result = someInputSet
       .CombinationsOf(4, CombinationsGenerationMode.Distinct)
       .IfEmpty(() => _log.Warning(
          $@"Unexpectedly received no results when creating combinations for {
             nameof(someInputSet)}"
       ))
       .Select(combo => /* do some operation to a combination */)
       .ToList();
    

    Please note that the private method being different from the public ones is required for the throwing or action behavior to occur when the linq chain is created instead of some time later when it is enumerated. You want it to throw right away.

    Note, however, that of course it has to enumerate at least the first item in order to determine if there are any items. This is a potential drawback that I think is mostly mitigated by the fact that any future viewers can quite easily reason that a ThrowIfEmpty method has to enumerate at least one item, so should not be surprised by it doing so. But you never know. You could make this more explicit ThrowIfEmptyByEnumeratingAndReEmittingFirstItem. But that seems like gigantic overkill.

I think #2 is quite, well, awesome! Now the power is in the calling code, and the next reader of the code will understand exactly what it's doing, and won't have to deal with unexpected exceptions.

ErikE
  • 1,151
  • 1
  • 8
  • 18
  • Please explain what you mean. How is something in my post "not behaving like a set" and why is that a bad thing? – ErikE Dec 03 '16 at 02:34
  • You are basically doin the same of my answer, instead of wrapping the query you happend to query a If Condition, the result does not change u.u – CoffeDeveloper Dec 03 '16 at 02:35
  • I can't see how my answer is "basically doin the same of" your answer. They seem completely different, to me. – ErikE Dec 03 '16 at 02:36
  • Basically you are just enforcing the same condition of "my bad class". But you are not telling that ^^. Do you agree, tht you are enforcing the existence of 1 element in the query result? – CoffeDeveloper Dec 03 '16 at 02:38
  • You're going to have to speak more clearly for the evidently stupid programmers who still don't get what you're talking about. What class is bad? Why is it bad? I'm not enforcing anything. I'm allowing the USER of the class to decide the final behavior instead of the CREATOR of the class. This allows one to use a consistent coding paradigm where Linq methods don't randomly throw because of a computational aspect that isn't really a violation of normal rules such as if you asked for -1 items at a time. – ErikE Dec 03 '16 at 02:39
  • You criticized my answer (in comments that are now moved to chat, but still accessible), that my class is a bad example because it enforce existence of at least 1 element in the result. However that comes by design because it's implicit in OP question, now you provided again a answer and your answer enforce again existence of at least 1 element (thing that you criticized in my answer). So what are you exactly trying do to? – CoffeDeveloper Dec 03 '16 at 02:41
  • If you can't see the difference between the **utility class** throwing, and the **consumer** throwing, you correspondingly won't be able to understand why my answer and your answer are completely different. – ErikE Dec 03 '16 at 02:44
  • Different answers that have in common one detail you criticized, you are free to put the answer you want, but my doubt is" why you criticize something and then do what you criticized"? Regardless in which context you put that. – CoffeDeveloper Dec 03 '16 at 03:03
  • I'm done discussing. We'll let the court of stackoverflow viewers decide by their votes. Check back in a year and we'll see what has happened, and whether with more experience under your belt (as I'm sure you'll think about this in the coming months, as will I) you may change your mind. – ErikE Dec 03 '16 at 03:04
  • Note I'm not criticizing you answer in any way because infact I think it is good. What **you don't understand** instead, and I'll not change mind about that, never, Is that I'm criticizing your comments to my answer, they are unproper comments. – CoffeDeveloper Dec 03 '16 at 03:15
  • I'm allowed to be a hypocrite. It's the internet. People comment on your posts. You can defend yourself or leave it alone. I understand what you're saying, and it's not important to me because I don't agree, but I understand why you think the way you do. Thank you for the feedback. Have a good weekend! – ErikE Dec 03 '16 at 03:16
  • +1 for the answer, but -100 for trolling in comments then ^^ – CoffeDeveloper Dec 03 '16 at 03:18
  • Thanks for the upvote. Calling an action trolling is sometimes, itself, trolling. I honestly hope that my comments on your answer help you and others write better code. – ErikE Dec 03 '16 at 03:20
2

I can see arguments for both use cases - an exception is great if downstream code expects sets which contain data. On the other hand, simply an empty set is great if if this is expected.

I think it depends on the expectations of the caller if this is an error, or an acceptable result - so I would transfer the choice to the caller. Maybe introduce an option?

.CombinationsOf(4, CombinationsGenerationMode.Distinct, Options.AllowEmptySets)

Christian Sauer
  • 1,269
  • 1
  • 9
  • 16
  • 10
    While I get where you're going with this, I do try to avoid methods having lots of options passed in that modify their behaviour. I'm still not 100% happy with the mode enum that's there already; I really don't like the idea of an option parameter that changes a method's exception behaviour. I feel like if a method needs to throw an exception, it needs to throw; if you want to hide that exception that's your decision as calling code, and not something you can make my code do with the right input option. – anaximander Nov 30 '16 at 14:25
  • If the caller expects a set that isn't empty, then it's up to the caller to either handle an empty set or throw an exception. As far as the callee is concerned, an empty set is a perfectly fine answer. And you can have more than one caller, with different opinions about whether empty sets are fine or not. – gnasher729 Dec 03 '16 at 20:11
2

There are two approaches to decide if there's no obvious answer:

  • Write code assuming first one option, then the other. Consider which one would work best in practice.

  • Add a "strict" boolean parameter to indicate whether you want the parameters to be strictly verified or not. For example, Java's SimpleDateFormat has a setLenient method to attempt parsing inputs that don't fully match the format. Of course, you'd have to decide what the default is.

JollyJoker
  • 186
  • 5
2

Based on your own analysis, returning the empty set seems clearly right — you've even identified it as something some users may actually want and have not fallen into the trap of forbidding some usage because you can't imagine users ever wanting to use it that way.

If you really feel that some users may want to force nonempty returns, then give them a way to ask for that behavior rather than forcing it on everyone. For example, you might:

  • Make it a configuration option on whatever object is performing the action for the user.
  • Make it a flag the user can optionally pass into the function.
  • Provide an AssertNonempty check they can put into their chains.
  • Make two functions, one that asserts nonempty and one that does not.
  • this doesn't seem to offer anything substantial over points made and explained in prior 12 answers – gnat Dec 01 '16 at 06:35
1

It really depends on what your users expect to get. For (a somewhat unrelated) example if your code performs division, you may either throw an exception or return Inf or NaN when you divide by zero. Neither is right or wrong, however:

  • if you return Inf in a Python library, people will assault you for hiding errors
  • if you raise an error in a Matlab library, people will assault you for failing to process data with missing values

In your case, I'd pick the solution which will be least astonishing for end users. Since you're developing a library dealing with sets, an empty set seems like something your users would expect to deal with, so returning it sounds like a sensible thing to do. But I may be mistaken: you have a much better understanding of the context than anyone else here, so if you expect your users to rely on the set always being not empty, you should throw an exception right away.

Solutions which let the user choose (like adding a "strict" parameter) aren't definitive, since they replace the original question with a new equivalent one: "Which value of strict should be the default?"

Dmitry Grigoryev
  • 494
  • 2
  • 13
  • 2
    I thinked about using the NaN argument in my answer and I share your opinion. We cannot tell more without knowing the domain of the OP – CoffeDeveloper Dec 03 '16 at 03:26
0

It is common conception (in mathematics) that when you select elements over a set you could find no element and hence you obtain an empty set. Of course you have to be consistent with mathematics if you go this way:

Common set rules:

  • Set.Foreach(predicate); // always returns true for empty sets
  • Set.Exists(predicate); // always returns false for empty sets

Your question is very subtle:

  • It could be that input of your function has to respect a contract: In that case any invalid input should raise a exception, that's it, the function is not working under regular parameters.

  • It could be that input of your function has to behave exactly like a set, and therefore should be able to return an empty set.

Now if I were in you I would go the "Set" way, but with a big "BUT".

Assume you have a collection that "by hypotesis" should have only female students:

class FemaleClass{

    FemaleStudent GetAnyFemale(){
        var femaleSet= mySet.Select( x=> x.IsFemale());
        if(femaleSet.IsEmpty())
            throw new Exception("No female students");
        else
            return femaleSet.Any();
    }
}

Now your collection is no longer a "pure set", because you have a contract on it, and hence you should enforce your contract with a exception.

When you use your "set" functions in a pure way you should not throw exceptions in case of empty set, but if you have collections that are no more "pure sets" then you should throw exceptions where proper.

You should always do what feels more natural and consistent: to me a set should adhere to set rules, while things that are not sets should have their properly thought rules.

In your case it seems a good idea to do:

List SomeMethod( Set someInputSet){
    var result = someInputSet
        .CombinationsOf(4, CombinationsGenerationMode.Distinct)
        .Select(combo => /* do some operation to a combination */)
        .ToList();

    // the only information here is that set is empty => there are no combinations

    // BEWARE! if 0 here it may be invalid input, but also a empty set
    if(result.Count == 0)  //Add: "&&someInputSet.NotEmpty()"

    // we go a step further, our API require combinations, so
    // this method cannot satisfy the API request, then we throw.
         throw new Exception("you requsted impossible combinations");

    return result;
}

But it is not really a good idea, we have now a invalid state that can occurr at runtime at random moments, however that's implicit in the problem so we cannot remove it, sure we can move the exception inside some utility method (it is exactly the same code, moved in different places), but that's wrong and basically the best thing you can do is stick to regular set rules.

Infact adding new complexity just to show you can write linq queries methods seems not worth for your problem, I'm pretty sure that if OP can tell us more about it's domain, probably we could find the spot where the exception is really needed (if at all, it is possible the problem does not require any exception at all).

CoffeDeveloper
  • 551
  • 1
  • 4
  • 10
  • The problem is that you are using mathematically defined objects to resolve a problem, but the problem itself may not be requiring a mathematically defined object as answer (infact here returning a list). It all depends on the higher-level of the problem, regardless of how you solve it, that's called encapsulation/information hiding. – CoffeDeveloper Dec 01 '16 at 12:23
  • The reason for which your "SomeMethod" should no longer behave like a set is that you are asking a piece of information "out of Sets",specifically look at the commented part "&&someInputSet.NotEmpty()" – CoffeDeveloper Dec 01 '16 at 12:30
  • 1
    NO! You shouldn't throw an exception in your female class. You should use the Builder pattern which enforces that you can't even GET an instance of the `FemaleClass` unless it has only females (it's not publicly creatable, only the Builder class can hand out an instance), and possibly has at least one female in it. Then your code all over the place that takes `FemaleClass` as an argument doesn't have to handle exceptions because the object is in the wrong state. – ErikE Dec 03 '16 at 01:37
  • You are assuming that class is so simple that you can enforce its preconditions simply by constructor, in reality. You argument is flawed. If you forbid to no longer have females, then or you cannot have a method for removing studends from the class or your method has to throw a exception when there's 1 student left. You just have moved my problem elsewhere. The point is that there will be always some precondition/function state that cannot be maintained "by design" and that the user is going to break:that why Exceptions exist! This is pretty theoretical stuff,I hope the downvote is not yours. – CoffeDeveloper Dec 03 '16 at 02:07
  • The downvote is not mine, but if it was, I'd be *proud* of it. I downvote carefully but with conviction. If the rule of your class is that it MUST have an item in it, and all the items must meet a specific condition, then allowing the class to be in an inconsistent state is a bad idea. Moving the problem elsewhere is exactly my intention! I **want** the problem to occur at building time, not at use time. The point of using a Builder class instead of a constructor throwing is that you can build up a very complex state in many pieces and parts through inconsistencies. – ErikE Dec 03 '16 at 02:11
  • Nope!!!! WTF . When someone design a class he can put how many rules he want, your program your rules. I just showed 1 rule (Even stupid) that shows when and why exception is usefull. It does not necessarily needs to have a sense from an abstract point of view. My point is to show that you have to use exceptions only when preconditions you want are not the same as Sets – CoffeDeveloper Dec 03 '16 at 02:13
  • Exceptions definitely have their place, of course! I'm not slamming them as the right tool in some cases. Now, while this may be "pretty theoretical stuff" I have been programming for 30 years, admittedly only 13 or so in enterprise production-level environments, and my opinion is based on repeated, concrete non-theoretical experience over time. Whenever possible, letting your compiler enforce things for you instead of finding out problems at runtime is far superior. How can you not agree? Exceptions are a horrible way to ensure program correctness when compilation errors are possible. – ErikE Dec 03 '16 at 02:15
  • This has nothing to do with OP. If you have programmed for 30 years you know for sure than that not all problems can be catched at compile time. The OP question cannot be catched at compile time because is about the bheaviour of a method. The compiler do not know content of the set as long as the code is executed so this have nothing to do with OP and you are out of topic. – CoffeDeveloper Dec 03 '16 at 02:17
  • Let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/49511/discussion-between-dariooo-and-erike). – CoffeDeveloper Dec 03 '16 at 02:18
  • See my answer on this page for how I would handle this situation. Notice at the top that I state what exceptions to continue throwing, but that the OP's situation calls for no exception. – ErikE Dec 03 '16 at 02:18