81

As noted by in the comments by @benjamin-gruenbaum this is called the Boolean trap:

Say I have a function like this

UpdateRow(var item, bool externalCall);

and in my controller, that value for externalCall will always be TRUE. What is the best way to call this function? I usually write

UpdateRow(item, true);

But I ask my self, should I declare a boolean, just to indicate what that 'true' value stands for? You can know that by looking the declaration of the function, but it's obviously faster and clearer if you just saw something like

bool externalCall = true;
UpdateRow(item, externalCall);

PD: Not sure if this question really fits here, if it doesn't, where could I get more info about this?

PD2: I didn't tag any language 'cause I thought it was a very generic problem. Anyway, I work with c# and the accepted answer works for c#

Andres F.
  • 5,119
  • 2
  • 29
  • 41
Mario Garcia
  • 813
  • 1
  • 6
  • 9
  • 11
    This pattern is also called [the boolean trap](https://ariya.io/2011/08/hall-of-api-shame-boolean-trap) – Benjamin Gruenbaum May 16 '18 at 21:15
  • 11
    If you're using a language with support for algebraic data types, I'd very much recommend a new adt instead of a boolean. `data CallType = ExternalCall | InternalCall` in haskell for example. – Filip Haglund May 17 '18 at 07:30
  • 6
    I guess Enums would fill the exact same purpose; getting names for the booleans, and a bit of type safety. – Filip Haglund May 17 '18 at 07:41
  • 2
    I disagree that declaring a boolean to indicate the meaning is "obviously clearer". With the first option, it's obvious that you will always pass true. With the second, you have to check (where is the variable defined? does its value change?). Sure, that's not a problem if the two lines are together... but somebody might decide that the perfect place to add their code is exactly between the two lines. It happens! – AJPerez May 17 '18 at 15:54
  • Declaring a boolean is not cleaner, clearer it's just jumping over a step which is, looking at the documentation of the method in question. If you know the signature, it's less clear to add a new constant. – insidesin May 18 '18 at 06:12
  • Is UpdateRow a method that you have access to? – Basilevs May 18 '18 at 10:12
  • TIP: If `externalCall` will be most of the times a known value, you could do something like this: `private void UpdateRow(var item, bool externalCall = true)`, this way you don't need to specify the `externalCall` argument everytime you use the method, and if you need to pass `false` can do `UpdateRow(item, false);` – auhmaan May 18 '18 at 11:10

9 Answers9

155

There isn't always a perfect solution, but you have many alternatives to choose from:

  • Use named arguments, if available in your language. This works very well and has no particular drawbacks. In some languages, any argument can be passed as a named argument, e.g. updateRow(item, externalCall: true) (C#) or update_row(item, external_call=True) (Python).

    Your suggestion to use a separate variable is one way to simulate named arguments, but does not have the associated safety benefits (there's no guarantee that you used the correct variable name for that argument).

  • Use distinct functions for your public interface, with better names. This is another way of simulating named parameters, by putting the paremeter values in the name.

    This is very readable, but leads to a lot of boilerplate for you, who is writing these functions. It also can't deal well with combinatorial explosion when there are multiple boolean arguments. A significant drawback is that clients can't set this value dynamically, but must use if/else to call the correct function.

  • Use an enum. The problem with booleans is that they are called "true" and "false". So, instead introduce a type with better names (e.g. enum CallType { INTERNAL, EXTERNAL }). As an added benefit, this increases the type safety of your program (if your language implements enums as distinct types). The drawback of enums is that they add a type to your publicly visible API. For purely internal functions, this doesn't matter and enums have no significant drawbacks.

    In languages without enums, short strings are sometimes used instead. This works, and may even be better than raw booleans, but is very susceptible to typos. The function should then immediately assert that the argument matches a set of possible values.

None of these solutions has a prohibitive performance impact. Named parameters and enums can be resolved completely at compile time (for a compiled language). Using strings may involve a string comparison, but the cost of that is negligible for small string literals and most kinds of applications.

amon
  • 132,749
  • 27
  • 279
  • 375
  • 7
    I just learned that "named arguments" exist. I think this is by far the best suggestion. If you can ellaborate a bit on that option (so that other persons don't need too google about it) I'll accept this answer. Also any hints on performance if you can...Thanks – Mario Garcia May 16 '18 at 14:16
  • 6
    One thing that can help alleviate the problems with short strings is to use constants with the string values. @MarioGarcia There's not much to elaborate on. Aside from what's mentioned here, most of the details will depend on the particular language. Was there something particular you wanted to see mentioned here? – jpmc26 May 16 '18 at 16:45
  • 8
    In languages where we're talking about a method _and_ the enum can be scoped to a class I generally use an enum. It is completely self documenting (in the single line of code that declares the enum type, so it is lightweight too), completely prevents the method from being called with a naked boolean, and the impact on the public API is negligible (since the identifiers are scoped to the class). – davidbak May 16 '18 at 17:38
  • 3
    In regards to using distinct methods, those methods can be implemented *internally* by calling a private method that takes a bool. This reduces the boiler plate code while keeping the expressiveness of the named methods. – RubberDuck May 16 '18 at 18:23
  • 4
    Another shortcoming of using strings in this role is that you can't check their validity at compile time, unlike enums. – Ruslan May 16 '18 at 18:40
  • @RubberDuck That internal method could also be public, which reduces boilerplate on both ends (no duplicated code and no extraneous `if`/`else` just to pass in a boolean by calling different functions) while allowing more expressive method names where it makes sense. – Nic May 16 '18 at 20:31
  • Named arguments have some burden when refactoring function bodies out. In addition, most current languages have first-class functions so the second point you make doesn't always necessitate if-else chains. – mike3996 May 17 '18 at 05:52
  • 1
    @jpmc26 If you're using constants, using an integral type instead of a string seems like it would make more sense. The biggest benefit of a string here is to make the code easier to read - that's irrelevant if you're just using the name of a constant instead. – Bernhard Barker May 17 '18 at 07:13
  • 2
    I almost always use enums. If you chose the right name you have no question on what the parameter stands for. – Toon Krijthe May 17 '18 at 07:52
  • 33
    **enum** is the winner. Just because a parameter can only have one of two possible states, that doesn't mean that it should automatically be a boolean. – Pete May 17 '18 at 09:12
  • 1
    It also depends on what happens with the boolean value inside the function - if there is no or little shared funcionality anyway, the distinct functions become more attractive. They also make static analysis of the call-graph more useful, making it possible to answer "which caller can ever pass true to this function" without investigating each call site. – Hulk May 17 '18 at 09:31
  • 1
    Great answer. I'd be tempted to add one more suggestion which whilst not the best solution in this case is pretty useful in most cases where clarity is importent - comment your code –  May 17 '18 at 09:34
  • 1
    @Dukeling Strings are much better for debugging. – jpmc26 May 17 '18 at 10:15
  • Since this isn't tagged with a language, event though Mario later mentioned he was using C#, I feel like it's worth adding in a bit about partially applied/curried functions, as in the add3 example here: https://stackoverflow.com/questions/36314/what-is-currying?utm_medium=organic&utm_source=google_rich_qa&utm_campaign=google_rich_qa – Ethan May 17 '18 at 21:55
  • Even if the language doesn't have named arguments, many languages allow an assignment to be used as an expression. So you can write `externalCall = true` and the assignment makes it self-documenting. – Barmar May 18 '18 at 03:30
  • 1
    If your language does not support named parameters, you can use a single object parameter, which comes close to named parameters. For instance the `OAuthBearerAuthenticationOptions` class in C# only exists to be the sole parameter to the `app.UseOAuthBearerAuthentication()` function. The `ExtJS` framework mainly uses object parameters. – Alexander May 18 '18 at 10:05
  • In a language such as PHP which doesn't have enums, if you use short strings you can define them as class constants on the class with the method being called. This allows use of auto completion for their names in an IDE. – bdsl May 20 '18 at 14:42
39

The correct solution is to do what you suggest, but package it into a mini-facade:

void updateRowExternally() {
  bool externalCall = true;
  UpdateRow(item, externalCall);
}

Readability trumps micro-optimization. You can afford the extra function call, certainly better than you can afford the developer effort of having to look up the semantics of the Boolean flag even once.

Kilian Foth
  • 107,706
  • 45
  • 295
  • 310
  • 9
    Is this encapsulation really worth it when I only call that funcion once in my controller? – Mario Garcia May 16 '18 at 13:17
  • 14
    Yes. Yes, it is. The reason, as I wrote above, is that the cost (a function call) is smaller than the benefit (you save a nontrivial amount of developer time because nobody ever has to look up what the `true`does.) It would be worthwhile even if it cost as much CPU time as it saves developer time, since CPU time is way cheaper. – Kilian Foth May 16 '18 at 13:21
  • 8
    Also any competent optimizer should be able to inline that for you so you shouldn't lose anything in the end. – firedraco May 16 '18 at 16:54
  • 5
    @MarioGarcia For a single call, using the `bool externalCall` variable or a comment should be enough, but the wrapper doesn't hurt a lot either. – Bergi May 16 '18 at 18:03
  • @firedraco Absolutely - even if the compiler couldn't in-line it, if it was used enough to be a performance impact, a JIT would. And even if it wouldn't, one extra layer of function calls is unlikely to be the biggest performance hit in your application... probably by a few orders of magnitude. And even IF it was, it's probably cheaper to throw another server at the problem than it is to pay a developer to deal with the fallout of making it harder to parse... – corsiKa May 16 '18 at 18:59
  • 33
    @KilianFoth Except that's a false assumption. A maintainer *does* need to know what the second parameter does, and why it's true, and almost always this relates to the detail of what you're trying to do. Hiding functionality in death-by-a-thousand-cuts tiny functions will decrease maintainability, not increase it. I've seen it happen myself, sad to say. Excessive partitioning into functions can actually be worse than a huge God Function. – Graham May 16 '18 at 19:41
  • 6
    I think `UpdateRow(item, true /*external call*/);` would be cleaner, in languages that allow that comment syntax. Bloating your code with an extra function just to avoid writing a comment doesn't seem worth it for a simple case. Maybe if there were lots of other args to this function, and / or some cluttered + tricky surrounding code, it would start to appeal more. But I think if you're debugging you're going to end up checking the wrapper function to see what it does, and have to remember which functions are thin wrappers around a library API, and which actually have some logic. – Peter Cordes May 17 '18 at 06:15
  • 1
    @PeterCordes The problem is that it doesn't help you spot the problem when something changes (I'm working on a huge legacy codebase where some methods only differ by their argument types even though they do something completely different). Enums and named parameters at least *help*. And yes, it's full of comments like in your example that haven't been true in years :D – Luaan May 19 '18 at 08:16
  • The equation "developer time vs CPU time" leans to one side for custom or in-house software, but to the other for side for FOSS, IMHO. – Pablo H May 20 '18 at 00:07
  • Robert Martin lists "Flag Arguments" as a Code Smell in his book _Clean Code_: "Boolean arguments loudly declare that the function does more than one thing. They are confusing and should be eliminated." and "Flag arguments are ugly. Passing a boolean into a function is a truly terrible practice. It immediately complicates the signature of the method, loudly proclaiming that this function does more than one thing. It does one thing if the flag is true and another if the flag is false!" Martin would separate these into `updateRowExternally` and `updateRowInternally`, as you have done. – CJ Dennis Jan 20 '19 at 00:13
25

Say I have a function like UpdateRow(var item, bool externalCall);

Why do you have a function like this?

Under what circumstances would you call it with the externalCall argument set to different values?

If one is from, say, an external, client application and the other is within the same program (i.e. different code modules), then I'd argue that you should have two different methods, one for each case, possibly even defined on distinct Interfaces.

If, however, you were making the call based on some datum taken from an non-program source (say, a configuration file or database read), then the Boolean-passing method would make more sense.

Phill W.
  • 11,891
  • 4
  • 21
  • 36
  • 6
    I agree. And just to hammer the point being made for other readers, an analysis can be made of all callers, who will pass either true, false, or a variable. If none of the latter is found the I would argue for two separate methods (w/o boolean) over one with the boolean -- it is a YAGNI argument that the boolean introduces unused/unecessary complexity. *Even if there are some callers are passing variables, I might still provide the (boolean) parameter-free versions to use for callers passing a constant -- it is simpler, which is good.* – Erik Eidt May 16 '18 at 14:40
17

While I agree it's ideal to use a language feature to enforce both readability and value-safety, you can also opt for a practical approach: call-time comments. Like:

UpdateRow(item, true /* row is an external call */);

or:

UpdateRow(item, true); // true means call is external

or (rightly, as suggested by Frax):

UpdateRow(item, /* externalCall */true);
bishop
  • 730
  • 6
  • 12
  • 1
    No reason to downvote. This is a perfectly valid suggestion. – Suncat2000 May 16 '18 at 20:21
  • Appreciate the <3 @Suncat2000! – bishop May 16 '18 at 20:28
  • 11
    A (likely preferable) variant is just putting the plain argument name there, like `UpdateRow(item, /* externalCall */ true )`. Full-sentence comment is much harder to parse, it's actually mostly noise (especially the second variant, which is also very loosely coupled with the argument). – Frax May 16 '18 at 20:51
  • 1
    This is by far the most appropriate answer. This is precisely what comments are intended for. – Sentinel May 16 '18 at 21:43
  • 9
    Not a big fan of comments like this. Comments are prone to becoming stale if left untouched when refactoring or making other changes. Not to mention as soon as you have more than one bool param this gets confusing fast. – John V. May 17 '18 at 04:47
  • @JohnV. Not to mention the fact that now you are forced to repeat the same comment wherever you are using this function. While this may be the simplest solution, it's far from the best one. – Eternal21 May 24 '18 at 14:20
11
  1. You can 'name' your bools. Below is an example for an OO language (where it can be expressed in the class providing UpdateRow()), however the concept itself can be applied in any language:

    class Table
    {
    public:
        static const bool WithExternalCall = true;
        static const bool WithoutExternalCall = false;
    

    and in the call site:

    UpdateRow(item, Table::WithExternalCall);
    
  2. I believe Item #1 is better but it doesn't force the user use new variables when using the function. If type safety is important for you, you can create an enum type and make UpdateRow() accept this instead of a bool:

    UpdateRow(var item, ExternalCallAvailability ec);

  3. You can modify the name of the function somehow that it reflects the meaning of the bool parameter better. Not very sure but maybe:

    UpdateRowWithExternalCall(var item, bool externalCall)

simurg
  • 322
  • 1
  • 4
11

Another option I haven’t read here yet is: Use a modern IDE.

For example IntelliJ IDEA prints the variable name of the variables in the method you’re calling if you’re passing a literal such as true or null or username + “@company.com. This is done in a small font so it doesn’t take up too much space on your screen and looks very different from actual code.

I'm still not saying it's a good idea to throw in booleans everywhere. The argument saying you read code far more often than you write is often very strong, but for this particular case it does heavily depend on the technology you (and your colleagues!) use to support your job. With an IDE it's far less of a problem than with for example vim.

Modified example of part of my test setup: enter image description here

  • 5
    This would only be true if everyone on your team only ever used an IDE to read your code. Notwithstanding the prevalence of non-IDE editors, you also have code review tools, source control tools, Stack Overflow questions, etc., and it's vitally important that code remain readable even in those contexts. – Daniel Pryden May 17 '18 at 17:56
  • @DanielPryden sure, hence my ‘and your colleagues’ comment. It’s common to have a standard IDE in companies. As for other tools, I would argue it’s entirely possible those kinds of tools also support this or will in the future, or that those arguments simply don’t apply (like for a 2 person pair programming team) – Sebastiaan van den Broek May 18 '18 at 00:35
  • 2
    @Sebastiaan: I don't think I agree. Even as a solo developer, I regularly read my own code in Git diffs. Git is never going to be able to do the same kind of context-aware visualization that an IDE can do, nor should it. I'm all for the use of a good IDE to help you write code, but you should never use an IDE as an excuse to write code that you wouldn't have written without it. – Daniel Pryden May 18 '18 at 11:50
  • 1
    @DanielPryden I’m not advocating writing extremely sloppy code. It’s not a black and white situation. There may be cases where in the past for a specific situation you would be 40% in favor of writing a function with a boolean and 60% not and you ended up not doing it. Maybe with good IDE support now the balance is 55% writing it like that and 45% not. And yes you may still need to read it outside of the IDE sometimes so it’s not perfect. But it’s still a valid trade-off against an alternative, like adding another method or enumeration to clarify the code otherwise. – Sebastiaan van den Broek May 18 '18 at 11:59
6

2 days and no ones mentioned polymorphism?

target.UpdateRow(item);

When I'm a client wanting to update a row with an item I don't want to think about the name of the database, the URL of the micro-service, the protocol used to communicate, or whether or not an external call is going to need to be used to make it happen. Stop pushing these details on me. Just take my item and go update a row somewhere already.

Do that and it becomes part of the construction problem. That can be solved many ways. Here's one:

Target xyzTargetFactory(TargetBuilder targetBuilder) {
    return targetBuilder
        .connectionString("some connection string")
        .table("table_name")
        .external()
        .build()
    ; 
}

If you're looking at that and thinking "But my language has named arguments, I don't need this". Fine, great, go use them. Just keep this nonsense out of the calling client that shouldn't even know what it's talking to.

It should be noted that this is more than a semantic problem. It's also a design problem. Pass in a boolean and you are forced to use branching to deal with it. Not very object oriented. Have two or more booleans to pass in? Wishing your language had multiple dispatch? Look into what collections can do when you nest them. The right data structure can make your life so much easier.

candied_orange
  • 102,279
  • 24
  • 197
  • 315
2

If you can modify the updateRow function, perhaps you can refactor it into two functions. Given that the function takes a boolean parameter, I suspect it looks like this:

function updateRow(var item, bool externalCall) {
  Database.update(item);

  if (externalCall) {
    Service.call();
  }
}

That can be a bit of a code smell. The function might have dramatically different behavior depending on what the externalCall variable is set to, in which case it has two different responsibilities. Refactoring it into two functions that only have one responsibility can improve readability:

function updateRowAndService(var item) {
  updateRow(item);
  Service.call();
}

function updateRow(var item) {
  Database.update(item);
}

Now, anywhere you call these functions, you can tell at a glance whether the external service is being called.

This isn't always the case, of course. It is situational and a matter of taste. Refactoring a function that takes a boolean parameter into two functions is usually worth considering, but isn't always the best choice.

Kevin
  • 731
  • 1
  • 4
  • 15
0

If UpdateRow is in a controlled codebase, I would consider strategy pattern:

public delegate void Request(string query);    
public void UpdateRow(Item item, Request request);

Where Request represents some kind of DAO (a callback in trivial case).

True case:

UpdateRow(item, query =>  queryDatabase(query) ); // perform remote call

False case:

UpdateRow(item, query => readLocalCache(query) ); // skip remote call

Usually, endpoint implementation would be configured on a higher level of abstraction and I would just use it here. As a useful free side-effect, this gives me an option to implement another way to access remote data, without any change to the code:

UpdateRow(item, query => {
  var data = readLocalCache(query);
  if (data == null) {
    data = queryDatabase(query);
  }
  return data;
} );

In general, such inversion of control ensures lower coupling between your data storage and model.

Basilevs
  • 1,221
  • 9
  • 11