124

This question may sound dumb, but why does 0 evaluate to false and any other [integer] value to true in most programming languages?

String comparison

Since the question seems a little too simple, I will explain myself a little more: first of all, it may seem evident to any programmer, but why wouldn't there be a programming language (there may actually be one, but none that I have used) where 0 evaluates to true and all the other [integer] values to false? That remark may seem random, but I have a few examples where it might have been a good idea. First of all, let's take the example of three-way string comparison; I will take C's strcmp as an example: any programmer trying C as their first language may be tempted to write the following code:

if (strcmp(str1, str2)) {
    // Do something...
}

Since strcmp returns 0, which evaluates to false, when the strings are equal, what the beginning programmer tried to do fails miserably, and they generally do not understand why at first. Had 0 evaluated to true instead, this function could have been used in its simplest form (the one above) when comparing for equality, and the proper checks for -1 and 1 would have been done only when needed. We would have considered the return type as bool (in our minds, I mean) most of the time.

Moreover, let's introduce a new type, sign, that takes only the values -1, 0 and 1. That can be pretty handy. Imagine there were a spaceship operator in C++ and we wanted it for std::string (well, there is already the compare member function, but the spaceship operator is more fun). The declaration would currently be the following:

sign operator<=>(const std::string& lhs, const std::string& rhs);

Had 0 evaluated to true, the spaceship operator wouldn't even need to exist, and we could have declared operator== this way:

sign operator==(const std::string& lhs, const std::string& rhs);

This operator== would have handled three-way comparison at once, and could still be used to perform the following check, while still allowing us to find out which string is lexicographically greater when needed:

if (str1 == str2) {
    // Do something...
}

Old error handling

We now have exceptions, so this part only applies to older languages where no such thing exists (C for example). If we look at C's standard library (and the POSIX one too), we can see for sure that many, many functions return 0 when successful and a non-zero integer otherwise. I have sadly seen some people do this kind of thing:

#define TRUE 0
// ...
if (some_function() == TRUE)
{
    // Here, TRUE would mean success...
    // Do something
}

If we think about how we reason when programming, we often have the following pattern:

Do something
Did it work?
Yes ->
    That's ok, one case to handle
No ->
    Why? Many cases to handle

If we think about it again, it would have made sense to map the only neutral value, 0, to yes (and that's how C's functions work), while all the other values can be there to handle the many cases of no. However, in all the programming languages I know (except maybe some experimental esoteric languages), that yes evaluates to false in an if condition, while all the no cases evaluate to true. There are many situations where "it works" represents one case while "it does not work" represents many probable causes. If we think about it that way, having 0 evaluate to true and the rest to false would have made much more sense.

Conclusion

My conclusion is essentially my original question: why did we design languages where 0 is false and the other values are true, taking into account my few examples above and maybe some more I did not think of?

Follow-up: It's nice to see there are many answers with many ideas and as many possible reasons for it to be like that. I love how passionate you seem to be about it. I originally asked this question out of boredom, but since you seem so passionate, I decided to go a little further and ask about the rationale behind the Boolean choice for 0 and 1 on Math.SE :)

Deduplicator
Morwenn
  • 34
    `strcmp()` is not a good example for true or false, as it returns 3 different values. And you will be surprised when you start using a shell, where 0 means true and anything else means false. – ott-- May 15 '13 at 20:27
  • @ott-- Well, just asking about boolean values would be meaningless; my question is exactly about why things are converted that way to boolean values :) Nice that you cited Bash, I had almost forgotten about that one. – Morwenn May 15 '13 at 20:28
  • 54
    @ott--: In Unix shells, 0 means *success* and non-zero means *failure* -- not quite the same thing as "true" and "false". – Keith Thompson May 15 '13 at 21:27
  • 1
    @KeithThompson Depends whether the truth tables work with *success* and *failure* in Bash, right? – Morwenn May 15 '13 at 21:30
  • 1
    Just to mention that Lua does not imply that "0" corresponds to false. http://www.luafaq.org/gotchas.html#T2 – meisterluk May 15 '13 at 22:30
  • @MasonWheeler I could probably list the (reasonably popular) languages where `0 != FALSE` on the fingers of 1 hand -- Haskell, Ada and Java off the bat, maybe a couple of others. – TC1 May 15 '13 at 22:30
  • What are those languages you are talking about? In my personal experience, in most languages, `0` in a Boolean context is either **truthy** or a `TypeError`. The languages in which `0` is considered **falsy** seem to be a tiny minority, and the ones in which `0` actually *is* `false` can be counted on the fingers of one hand. – Jörg W Mittag May 16 '13 at 00:10
  • 16
    @KeithThompson: In Bash (and other shells), "success" and "failure" really are the same as "true" and "false". Consider, for example, the statement `if true ; then ... ; fi`, where `true` is a command that returns zero and this tells `if` to run `...`. – ruakh May 16 '13 at 05:12
  • 1
    @TC1: You can add Common Lisp to the list (no pun intended): `false` is represented by `nil` (the empty list). When considered as boolean values, both `0` and `1` are equivalent to `T` (`true`), as all other numbers. – Giorgio May 16 '13 at 08:31
  • 13
    There are no booleans in hardware whatsoever, only binary numbers, and in most historical ISAs a non-zero number is considered as "true" in all the conditional branching instructions (unless they're using flags instead). So, the low level languages are by all means obliged to follow the underlying hardware properties. – SK-logic May 16 '13 at 08:44
  • @Giorgio Not just Common, afaik, LISP in general, at least I'm sure of Scheme, Clojure and Elisp. They're all fairly similar on those kinds of things. Still, that's 4 then. – TC1 May 16 '13 at 09:17
  • @TC1: Right, in Scheme numbers are also equivalent to `true` = `#t`. Additionally, you have a special symbol for `false`, namely `#f`, which is not the same as the empty list `'()`. – Giorgio May 16 '13 at 09:47
  • 2
    @MasonWheeler Having a boolean type doesn't imply anything. For example python *does* have a `bool` type but comparisons/if conditions etc. can have any return value. – Bakuriu May 16 '13 at 12:10
  • On a logic perspective, `false` (the strongest of conditions) is stronger than `true` (the weakest of conditions). Representing a (binary) zero could be expressed as *there are no ones*, which is stronger, hence harder to satisfy, than *there could be ones*. – afsantos May 16 '13 at 14:47
  • I would think, and likely wrong :), that 0 is false and 1 is true (not case in c++, with any positive int true), comes from the days of computers being programmed electronically. That is to say electrical signal gets sent, or doesn't to a pin. Kind of like a lightswitch on is I, off is O. So 0 is no electric, 1 is electric flow – user60812 May 17 '13 at 12:47
  • You shouldn't evaluate integer values to booleans at all. It's like casting a type with a lot of possible values to a type with only two possible values. – Pieter B Sep 05 '13 at 07:02
  • 2
    `if (some_function() == TRUE)` I always cringe when I read code like this. That's just plain awkward. – JensG Nov 12 '13 at 20:03
  • Is it? In C, 0 is false, all other values are true. However, in sh, the Unix shell, 0 is true, all other values are false. (C and sh were written around the same time, as part of Unix. Unix is mostly consistent, but not here.) – ctrl-alt-delor Jul 28 '14 at 14:15
  • Exit codes != `true/false`... _you_ may have designed your exit codes to react that way, but some of us use exit codes to determine all manner of things. Exit `0` _is_ the _standard_ return for a _successful exit_ of a program, but that doesn't somehow mean `0` in a shell somehow universally becomes synonymous with `true`. – kayleeFrye_onDeck Jun 07 '17 at 18:51
  • I find it odd how PayPal RESULT values use 1(or any non-zero) for not approved and 0 for approved which i related to 1(or non-zero) being false and 0 being true in this case(whether a transaction was successful without error or not): https://www.paypalobjects.com/en_US/vhelp/paypalmanager_help/result_values_for_transaction_declines_or_errors.htm – james Aug 02 '17 at 19:32
  • 1
    I lament my opinion that non-primes are false and primes are true has never really been taken seriously. – HostileFork says dont trust SE Jan 13 '20 at 02:36

15 Answers

105

0 is false because they’re both zero elements in common semirings. Even though they are distinct data types, it makes intuitive sense to convert between them because they belong to isomorphic algebraic structures.

  • 0 is the identity for addition and zero for multiplication. This is true for integers and rationals, but not IEEE-754 floating-point numbers: 0.0 * NaN = NaN and 0.0 * Infinity = NaN.

  • false is the identity for Boolean xor (⊻) and zero for Boolean and (∧). If Booleans are represented as {0, 1}—the set of integers modulo 2—you can think of ⊻ as addition without carry and ∧ as multiplication.

  • "" and [] are identity for concatenation, but there are several operations for which they make sense as zero. Repetition is one, but repetition and concatenation do not distribute, so these operations don’t form a semiring.

Such implicit conversions are helpful in small programs, but in the large can make programs more difficult to reason about. Just one of the many tradeoffs in language design.

Jon Purdy
  • 2
    Nice that you mentioned lists. (BTW, `nil` is both the empty list `[]` and the `false` value in Common Lisp; is there a tendency to merge identities from different data types?) You still have to explain why it is natural to consider false as an additive identity and true as a multiplicative identity and not the other way around. Isn't it possible to consider `true` as the identity for `AND` and zero for `OR`? – Giorgio May 16 '13 at 07:58
  • 3
    +1 for referring to similar identities. Finally an answer which doesn't just boil down to "convention, deal with it". – l0b0 May 16 '13 at 09:20
  • 5
    +1 for giving details of a concrete and very old maths in which this has been followed and long made sense – Jimmy Hoffa May 16 '13 at 15:22
  • Boolean OR has no inverse. How does that form a ring? – Siyuan Ren Sep 05 '13 at 03:09
  • I'm unclear whether the fact that they are isomorphic algebras is being proved or used as an assumption here. – Andyz Smith Oct 05 '13 at 03:48
  • 1
    This answer doesn't make sense. `true` is also the identity and the zero of semirings (Boolean and/or). There is no reason, apart from convention, to consider `false` closer to 0 than `true`. – TonioElGringo Nov 17 '15 at 16:08
  • 1
    @TonioElGringo: The difference between true and false is the difference between XOR and XNOR. One can form isomorphic rings using AND/XOR, where true is the multiplicative identity and false the additive one, or with OR and XNOR, where false is the multiplicative identity and true is the additive one, but XNOR is not usually regarded as a common fundamental operation the way XOR is. – supercat May 09 '17 at 17:10
  • @supercat: On the other hand, one could argue that the main reason why XNOR is not regarded as fundamental is because it corresponds to the notion of coincidence, which can already be rendered as ‘==’. And equality /is/ undoubtedly a fundamental operator… – Rémi Peyre Mar 24 '18 at 00:13
77

Because the math works.

FALSE OR TRUE is TRUE, because 0 | 1 is 1.

... insert many other examples here.

Traditionally, C programs have conditions like

if (someFunctionReturningANumber())

rather than

if (someFunctionReturningANumber() != 0)

because the concept of zero being equivalent to false is well-understood.

Deduplicator
Robert Harvey
  • 23
    The languages are designed like that because the math makes sense. That came first. – Robert Harvey May 15 '13 at 19:57
  • 28
    @Morwenn, it goes back to the 19th century and George Boole. People have been representing False as 0 and True as !0 for longer than there have been computers. – Charles E. Grant May 15 '13 at 20:03
  • 1
    @Morwenn, I would hope the people who designed the original languages understood the math. In fact, I would bet on it, since computer science came from the field of mathematics. People like this designed the original languages (note her mathematics experience): http://en.wikipedia.org/wiki/Grace_Hopper – HLGEM May 15 '13 at 21:19
  • 12
    I don't see why the math doesn't work the other way if you merely change all of the definitions so that AND is + and OR is *. – Neil G May 15 '13 at 21:49
  • 1
    @NeilG: I can call spaghetti macaroni, but it's still spaghetti, even though both are pasta. – Robert Harvey May 15 '13 at 21:49
  • 7
    Exactly: the math works both ways and the answer to this question seems to be that it is purely conventional. – Neil G May 15 '13 at 21:51
  • @NeilG: The rules aren't abitrary; there's a mathematical underpinning to them. – Robert Harvey May 15 '13 at 21:52
  • Why couldn't you define OR so that it corresponds to multiplication and AND so that it is a saturating addition? – Neil G May 15 '13 at 21:56
  • 6
    @Robert It'd be great if you could spell out the "mathematical underpinnings" in your post. – phant0m May 15 '13 at 21:59
  • @Neil G When OR is associated with addition, 0 corresponds to empty set. Choosing otherwise just does not seem natural. – artem May 15 '13 at 23:34
  • 2
    @artem: "does not seem natural" …because of convention. :) – Neil G May 16 '13 at 00:50
  • @Morwenn: Maybe not by all programmers, but by programming language designers. – Giorgio May 16 '13 at 01:20
  • @Neil G. No, not because of convention. How many elements are in empty set? – artem May 16 '13 at 01:56
  • @artem: Why not associate 1 with the empty set, and with False? – Neil G May 16 '13 at 06:13
  • @NeilG How does your reversed definition extend to the OP's question, which considers *any* nonzero value to represent true? If 0 is true and anything else is false (which seems to me a more accurate mirror image of the original question), am I missing something when I write `1 + (-1) == 0`, therefore `false and false == true`? Or is it just a case of having to choose operators with different semantics? And if it is, does that not get closer to answering why 0 is historically "false" - because the other way around requires "special" operators? – shambulator May 16 '13 at 08:24
  • 3
    @shambulator: In your example, you would have had the same problem with the regular correspondence; does True or True = False? The regular definition is 1: True, 0: False, Or: Saturating addition, And: Multiplication. In the reversed definition, we flip both pairs of labels. – Neil G May 16 '13 at 08:30
  • 3
    @NeilG Good point; I suspected I'd missed something important :P – shambulator May 16 '13 at 08:32
  • 2
    "FALSE OR TRUE is TRUE, because 0 | 1 is 1": I do not quite understand this: `0 | 1` is set to be equal `1` once we decide that `0` is `FALSE`, `1` is `TRUE`, and `|` is `OR`. That's why they are the same. Or how do you define `|` otherwise? – Giorgio May 16 '13 at 12:29
  • This argument just about works for abusing bitwise `|` as a Boolean operator but it breaks down for `&` and `^` because of the existence of distinct non-zero values. – Peter Taylor May 16 '13 at 12:51
  • 3
    @Giorgio: `0 | 1 = 1` is bitwise math. It's the way every processor on the planet calculates it. It doesn't require an assumption that 0 is false and 1 is true, although in practice that is the usual interpretation. – Robert Harvey May 16 '13 at 15:52
  • @Robert Harvey: Maybe we mean the same: Whatever names you use, the moment you have a set with two elements and the operations from boolean algebra (OR, AND, NOT) you will always get the same structure. This is why the bits in bitwise operations and the truth values have the same behaviour. – Giorgio May 16 '13 at 16:21
  • @NeilG because it's convenient to associate set with a number that equals to the number of elements contained in the set. Makes remembering rules much easier :-) – artem May 16 '13 at 21:33
40

As others have said, the math came first. This is why 0 is false and 1 is true.

Which math are we talking about? Boolean algebras which date from the mid 1800s, long before digital computers came along.

You could also say that the convention came out of propositional logic, which is even older than Boolean algebra. This is the formalization of a lot of the logical results that programmers know and love (false || x equals x, true && x equals x, and so on).

Basically we're talking about arithmetic on a set with two elements. Think about counting in binary. Boolean algebras are the origin of this concept and its theoretical underpinning. The conventions of languages like C are just a straightforward application.

joshin4colours
  • Why couldn't you do the same Boolean algebra defining the operators so that they give the same results with 0 being true and 1 being false? – Neil G May 15 '13 at 21:40
  • 2
    You could, for sure. But keeping it the "standard" way fits in well with general arithmetic (0 + 1 = 1, not 0 + 1 = 0). – joshin4colours May 15 '13 at 21:46
  • 2
    Yes, but you would presumably write AND with + and OR with * if you reversed the definitions too. – Neil G May 15 '13 at 21:47
  • 3
    The math didn't come first. Math recognized that 0 and 1 form a field, in which AND is like multiplication and OR is like addition. – Kaz May 16 '13 at 00:57
  • 1
    @Kaz: But {0, 1} with OR and AND does not form a field. – Giorgio May 16 '13 at 11:20
  • 2
    It bothers me a little that more answers and comments say that `true = 1`. That's not quite exact, because `true != 0` which is not exactly the same. One reason (not the only one) why one should avoid comparisons like `if(something == true) { ... }`. – JensG Nov 12 '13 at 20:07
  • Actually I think that purely mathematical considerations should make it _more_ tempting to use the “reverse” convention 0 = TRUE and 1 = FALSE, because then AND plays the role of an addition, and that addition is rendered by the word “and” in everyday language… – Rémi Peyre Mar 24 '18 at 00:20
  • 1
    @Nancy-N - most ways of expressing logic that mathematicians have used over the years end up with the "and" operator having a higher binding precedence than the "or" operator (i.e. for whatever reason, mathematics has tended towards using what we generally call [sum of products](https://www.dyclassroom.com/boolean-algebra/sum-of-products-and-product-of-sums) form these days). These leads to a natural tendency to equate "and" with multiplication and "or" with addition. You can do it the other way around, and (due to de morgan's law) it's actually equivalent, but this is the conventional way. – Jules Jul 21 '18 at 01:27
  • @Jules This is a good point! :-) And, indeed, it seems easier to me to grasp the distributivity of AND w.r.t. OR than the distributivity of OR w.r.t. AND, which would justify taking AND as pseudo-multiplication and OR as pseudo-addition. (Even though this does not settle definitely the debate of whether it would be more relevant to render TRUE by 0 or by 1, as one could give other arguments in favour of the shell-like convention… ;-)). – Rémi Peyre Sep 19 '18 at 21:25
27

I thought this had to do with the "inheritance" from electronics, and also boolean algebra, where

  • 0 = off, negative, no, false
  • 1 = on, positive, yes, true

That strcmp returns 0 when the strings are equal has to do with its implementation: what it actually does is calculate the "distance" between the two strings. That 0 also happens to be considered false is just a coincidence.

Returning 0 on success makes sense because 0 in this case is used to mean no error, and any other number would be an error code. Using any other number for success would make less sense, since you only have a single success code while you can have several error codes. Using "Did it work?" as the if-statement expression and saying 0 = yes would make more sense, but the expression is more accurately "Did anything go wrong?", and then you see that 0 = no makes a lot of sense. Thinking of false/true doesn't really make sense here, as it's actually no error code/error code.

Svish
  • Haha, you are the first one to state the return-error question explicitly. I already knew I interpreted it my own way and it could be asked the other way, but you're the first to explicitly express it (out of the many answers and comments). Actually, I wouldn't say that one or the other way makes no sense, but more that both make sense in different ways :) – Morwenn May 16 '13 at 11:11
  • 1
    Actually I'd say `0` for `success/no error` is the only thing that makes sense when other integers represent error codes. That `0` also happens to represent `false` in other cases doesn't really matter, since we aren't talking about true or false here at all ;) – Svish May 16 '13 at 12:09
  • I had the same idea so i upped – user60812 May 17 '13 at 12:49
  • 1
    Your point about `strcmp()` calculating the distance is quite good. If it had been called `strdiff()` then `if (!strdiff())` would be very logical. – Kevin Cox Jul 10 '14 at 21:53
  • "electronics [...] where 0 = [...] false, 1 = [...] true" - even in electronics, this is only a *convention*, and isn't the only one. We call this positive logic, but you can also use negative logic, where a positive voltage indicates false and negative indicates true. Then, the circuit you'd use for AND becomes OR, OR becomes AND, and so on. Due to De Morgan's law, it all ends up being equivalent. Sometimes, you'll find part of an electronic circuit implemented in negative logic for convenience, at which point the names of the signals in that part are noted with a bar above them. – Jules Jul 21 '18 at 01:34
  • @Jules I’m not even convinced it is true. I mean, how would you know? How would I know how my phone’s processor represents 0 and 1, and whether it does that consistently? Or a quad-level SSD drive? There’s something representing four bits, but heaven knows what. – gnasher729 Jul 09 '23 at 20:35
18

As explained in this article, the values false and true should not be confused with the integers 0 and 1, but may be identified with the elements of the Galois field (finite field) of two elements (see here).

A field is a set with two operations that satisfy certain axioms.

The symbols 0 and 1 are conventionally used to denote the additive and multiplicative identities of a field because the real numbers are also a field (but not a finite one) whose identities are the numbers 0 and 1.

The additive identity is the element 0 of the field, such that for all x:

x + 0 = 0 + x = x

and the multiplicative identity is the element 1 of the field, such that for all x:

x * 1 = 1 * x = x

The finite field of two elements has only these two elements, namely the additive identity 0 (or false), and the multiplicative identity 1 (or true). The two operations of this field are the logical XOR (+) and the logical AND (*).

Note. If you flip the operations (XOR is the multiplication and AND is the addition) then the multiplication is not distributive over addition and you do not have a field any more. In such a case you have no reason to call the two elements 0 and 1 (in any order). Note also that you cannot choose the operation OR instead of XOR: no matter how you interpret OR / AND as addition / multiplication, the resulting structure is not a field (not all inverse elements exist as required by the field axioms).

Regarding the C functions:

  • Many functions return an integer that is an error code. 0 means NO ERROR.
  • Intuitively, the function strcmp computes the difference between two strings. 0 means that there is no difference between two strings, i.e. that two strings are equal.

The above intuitive explanations can help to remember the interpretation of the return values, but it is even easier to just check the library documentation.

Giorgio
  • 1
    +1 for showing that if you arbitrarily swap these, the maths no longer work out. – Jimmy Hoffa May 16 '13 at 15:25
  • Original: Given the field with two elements and operations * and +, we identify True with 1 and False with 0. We identify AND with * and XOR with +. – Neil G May 17 '13 at 05:48
  • 2
    Flipped: Given a field with two elements and operations * and +, we identify True with 0 and False with 1. We identify OR with * and XOR with +. – Neil G May 17 '13 at 05:48
  • 1
    You will find that both of these identifications are done over the same field and both are consistent with the rules of Boolean logic. Your note is unfortunately incorrect :) – Neil G May 17 '13 at 05:49
  • @Neil G: I understand your construction now and I am trying to verify it. What I am missing is how True can be the identity of XOR since True XOR True = False. We both assume that True, False, and the operations AND, OR, NOT, XOR on these values are given, right? – Giorgio May 17 '13 at 07:18
  • Ah, sorry, I made a mistake. In the flipped field XNOR is identified with +. Good catch! – Neil G May 17 '13 at 07:29
  • 1
    If you assume that True = 0, and XOR is +, then True must be the identity for XOR. But it is not because True XOR True = False. Unless you redefine the operation XOR on True so that True XOR True = True. Then of course your construction works because you have just renamed things (in any mathematical structure you can always successfully make a name permutation and get an isomorphic structure). On the other hand, if you let True, False and XOR have their usual meaning, then True XOR True = False and True cannot be the additive identity, i.e. True cannot be 0. – Giorgio May 17 '13 at 07:30
  • 1
    @Giorgio: I corrected my construction per your comment in my last comment… – Neil G May 17 '13 at 07:30
  • 1
    That is exactly my answer I prepared to explain like this wow. – Abby Chau Yu Hoi May 20 '13 at 10:16
  • Shorter: - `XOR is equivalent to + %2` - `true XOR true = false` - Because `(1+1)%2 = 0`, but `(0+0)%2 = 0` => `true = 1` and `false = 0` – ROMANIA_engineer Jan 08 '15 at 09:49
16

You should consider that alternative systems can also be acceptable design decisions.

Shells: 0 exit status is true, non-zero is false

The example of shells treating a 0 exit status as true has already been mentioned.

$ ( exit 0 ) && echo "0 is true" || echo "0 is false"
0 is true
$ ( exit 1 ) && echo "1 is true" || echo "1 is false"
1 is false

The rationale there is that there is one way to succeed, but many ways to fail, so using 0 as the special value meaning "no errors" is pragmatic.

Ruby: 0 is just like any other number

Among "normal" programming languages, there are some outliers, such as Ruby, that treat 0 as a true value.

$ irb
irb(main):001:0> 0 ? '0 is true' : '0 is false'
=> "0 is true"

The rationale is that only false and nil should be false. For many Ruby novices, it's a gotcha. However, in some cases, it's nice that 0 is treated just like any other number.

irb(main):002:0> (pos = 'axe' =~ /x/) ? "Found x at position #{pos}" : "x not found"
=> "Found x at position 1"
irb(main):003:0> (pos = 'xyz' =~ /x/) ? "Found x at position #{pos}" : "x not found"
=> "Found x at position 0"
irb(main):004:0> (pos = 'abc' =~ /x/) ? "Found x at position #{pos}" : "x not found"
=> "x not found"

However, such a system only works in a language that is able to distinguish booleans as a separate type from numbers. In the earlier days of computing, programmers working with assembly language or raw machine language had no such luxuries. It is probably just natural to treat 0 as the "blank" state, and set a bit to 1 as a flag when the code detected that something happened. By extension, the convention developed that zero was treated as false, and non-zero values came to be treated as true. However, it doesn't have to be that way.

Java: Numbers cannot be treated as booleans at all

In Java, true and false are the only boolean values. Numbers are not booleans, and cannot even be cast into booleans (Java Language Specification, Sec 4.2.2):

There are no casts between integral types and the type boolean.

That rule just avoids the question altogether — all boolean expressions have to be explicitly written in the code.

Deduplicator
200_success
  • 1
    [Rebol and Red](http://chat.stackoverflow.com/rooms/291/rebol-and-red) both treat 0-valued INTEGER! values as true, and have a separate NONE! type (with only one value, NONE) treated as conditional false in addition to LOGIC! false. I've found significant frustration in trying to write JavaScript code that treats 0 as false; it is an incredibly clunky decision for a dynamically-typed language. If you want to test something that can be null or 0 you wind up having to write `if (thing === 0)`, that is just not cool. – HostileFork says dont trust SE Apr 10 '14 at 20:52
  • @HostileFork I don't know. I find that it makes sense that `0` is `true` (as every other integer) in a dynamic language. I sometimes happened to catch a `0` when trying to catch `None` in Python, and that can sometimes be pretty hard to spot. – Morwenn Apr 10 '14 at 21:22
  • 2
    Ruby is not an outlier. Ruby takes this from Lisp (Ruby is even secretly called "MatzLisp"). Lisp is a mainstream language in computer science. Zero is also just a true value in the POSIX shell, because it's a piece of text: `if [ 0 ] ; then echo this executes ; fi`. The false data value is an empty string, and a testable falsehood is a failed termination status of a command, which is represented by a *non*-zero. – Kaz Jan 22 '15 at 19:42
8

Before addressing the general case, we can discuss your counterexamples.

String comparisons

The same holds for many sorts of comparisons, actually. Such comparisons compute a distance between two objects. When the objects are equal, the distance is minimal, so when the "comparison succeeds", the value is 0. But really, the return value of strcmp is not a boolean, it is a distance, and that is what traps unaware programmers into writing if (strcmp(...)) do_when_equal(); else do_when_not_equal();.

In C++ we could redesign strcmp to return a Distance object that defines operator bool() to return true when the distance is 0 (but you would then be bitten by a different set of problems). Or, in plain C, just have a streq function that returns 1 when the strings are equal and 0 otherwise.

API calls/program exit code

Here you care about the reason something went wrong, because that reason will drive the decisions taken upon error. When things succeed, you don't want to know anything in particular: your intent is realized. The return value must therefore convey this information. It is not a boolean, it is an error code. The special value 0 means "no error". The rest of the range represents locally meaningful errors you have to deal with (including 1, which often means "unspecified error").

General case

This leaves us with the question: why are boolean values True and False commonly represented with 1 and 0, respectively?

Well, besides the subjective "it feels better this way" argument, here are a few reasons (subjective as well) I can think of:

  • electrical circuit analogy. The current is ON for 1s, and OFF for 0s. I like having (1,Yes,True,On) together, and (0,No,False,Off), rather than another mix

  • memory initializations. When I memset(0) a bunch of variables (be they ints, floats, or bools) I want their value to match the most conservative assumptions. E.g. my sum is initially 0, the predicate is False, etc.

Maybe all these reasons are tied to my education - if I had been taught to associate 0 with True from the beginning, I would go for the other way around.

  • 2
    Actually there is at least one programming language that treats 0 as true. The unix shell. – Jan Hudec May 16 '13 at 10:55
  • +1 for addressing the real issue: Most of Morwenn's question isn't about `bool` at all. – dan04 May 17 '13 at 14:02
  • @dan04 It is. The whole post is about the rationale behind the choice of the cast from `int` to `bool` in many programming languages. The comparison and error gestion stuff are just examples of places where casting it another way than the one it's currently done would have make sense. – Morwenn May 21 '13 at 06:40
6

From a high-level perspective, you're talking about three quite different data types:

  1. A boolean. The mathematical convention in Boolean algebra is to use 0 for false and 1 for true, so it makes sense to follow that convention. I think this way also makes more sense intuitively.

  2. The result of comparison. This has three values: <, = and > (notice that none of them is true). For them it makes sense to use the values of -1, 0 and 1, respectively (or, more generally, a negative value, zero and a positive value).

    If you want to check for equality and you only have a function that performs general comparison, I think you should make it explicit by using something like strcmp(str1, str2) == 0. I find using ! in this situation confusing, because it treats a non-boolean value as if it was a boolean.

    Also, keep in mind that comparison and equality don't have to be the same thing. For example, if you order people by their date of birth, Compare(me, myTwin) should return 0, but Equals(me, myTwin) should return false.

  3. The success or failure of a function, possibly also with details about that success or failure. If you're talking about Windows, then this type is called HRESULT and a non-zero value doesn't necessarily indicate failure. In fact, a negative value indicates failure and non-negative success. The success value is very often S_OK = 0, but it can also be for example S_FALSE = 1, or other values.

The confusion comes from the fact that three logically quite different data types are actually represented as a single data type (an integer) in C and some other languages and that you can use an integer in a condition. But I don't think it would make sense to redefine boolean to make using some non-boolean types in conditions simpler.

Also, consider another type that's often used in a condition in C: a pointer. There, it's natural to treat a NULL-pointer (which is represented as 0) as false. So following your suggestion would also make working with pointers more difficult. (Though, personally, I prefer explicitly comparing pointers with NULL, instead of treating them as booleans.)

svick
  • 9,999
  • 1
  • 37
  • 51
5

There are a lot of answers that suggest that the correspondence between 1 and true is necessitated by some mathematical property. I can't find any such property and suggest it is purely historical convention.

Given a field with two elements, we have two operations: addition and multiplication. We can map Boolean operations on this field in two ways:

Traditionally, we identify True with 1 and False with 0. We identify AND with * and XOR with +. Thus OR is saturating addition.

However, we could just as easily identify True with 0 and False with 1. Then we identify OR with * and XNOR with +. Thus AND is saturating addition.

Neil G
  • 438
  • 2
  • 14
  • 4
    If you had followed the link on wikipedia you could have found that the concept of a boolean algebra is closed related with that of a Galois field of two elements (http://en.wikipedia.org/wiki/GF%282%29). The symbols 0 and 1 are conventionally used to denote the additive and multiplicative identities, respectively, because the real numbers are also a field whose identities are the numbers 0 and 1. – Giorgio May 15 '13 at 22:08
  • @Giorgio: Great, add an answer indicating that it's historical convention and I'll upvote you. – Neil G May 15 '13 at 22:11
  • 1
    @NeilG I think Giorgio is trying to say it's more than just a convention. 0 and 1 in boolean algebra are basically the same as 0 and 1 in GF(2), which behave almost the same as 0 and 1 in real numbers with regards to addition and multiplication. – svick May 15 '13 at 23:47
  • 1
    @svick: No, because you can simply rename multiplication and saturating addition to be OR and AND and then flip the labels so that 0 is True and 1 is False. Giorgio is saying that it was a convention of Boolean logic, which was adopted as a convention of computer science. – Neil G May 16 '13 at 00:49
  • 1
    @Neil G: No, you cannot flip + and * and 0 and 1 because a field requires distributivity of multiplication over addition (see http://en.wikipedia.org/wiki/Field_%28mathematics%29), but if you set + := AND and * := XOR, you get T XOR (T AND F) = T XOR F = T, whereas (T XOR T) AND (T XOR F) = F AND T = F. Therefore by flipping the operations and the identities you do not have a field any more. So IMO defining 0 and 1 as the identities of an appropriate field seems to capture false and true pretty faithfully. – Giorgio May 16 '13 at 01:08
  • @Giorgio: I set saturating addition=And, and *=Or (not XOR). Xor is just addition. Then you will find that your relations work and all the regular properties are preserved. You can see the flipping as an application of De Morgan's identity if that is clearer for you. – Neil G May 16 '13 at 08:33
  • @Neil G: Is {true, false} with the operations OR, AND a field? Unless I overlooked something, it is not: the value `true` has no additive inverse if the addition is OR, because the additive identity is `false` and there is no inverse element `-true` of `true` such that `-true` OR `true` = `false`. That's why I was suggesting XOR: because {false, true} with the operations {XOR, AND} give a field. – Giorgio May 16 '13 at 09:42
  • @Giorgio: A field has operations + and * with members 0 and 1. You then identify True and False with 1 and 0 (or 0 and 1) and + and * with XOR and AND (or XOR and OR). If you like you can define saturating addition for the Boolean operator OR (or AND). Parentheses indicate the reversed case. – Neil G May 16 '13 at 09:56
  • @Giorgio: In other words, fields need not have operations like conjunction and disjunction, but rather must have addition and multiplication. We identify the former with the latter. – Neil G May 16 '13 at 09:58
  • @Neil G: I agree, you do not need conjunction and disjunction. As the wikipedia article suggests, you can interpret XOR as addition and AND as multiplication. It is not clear to me which alternative operations you are proposing. OR as addition and AND as multiplication? Or the other way round? Or other operations? – Giorgio May 16 '13 at 10:14
  • @Giorgio: No, XOR is always addition, but OR becomes multiplication when 0 becomes true and 1 false. – Neil G May 16 '13 at 10:29
  • @Neil G: But {F, T} with the operations OR / AND is not a field, because T has no inverse with respect to OR and F has no inverse with respect to AND. Therefore, no matter how you choose OR / AND to be addition / multiplication, the resulting structure is not a field. – Giorgio May 16 '13 at 10:46
  • @Girgio: That was also true in the original field. You do not identify OR and AND with addition multiplication. With the usual definition 0 corresponding to false and 1 to true, you identify XOR with addition and AND with multiplication. With the reversed definition you identify XOR with addition and OR with multiplication. – Neil G May 16 '13 at 14:24
  • let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/8792/discussion-between-neil-g-and-giorgio) – Neil G May 16 '13 at 14:24
  • 1
    @giorgio: I have edited the answer to make it obvious what is going on. – Neil G May 17 '13 at 05:55
  • Heh. This is the only answer in the thread that actually understands the question. – thb Jul 04 '23 at 12:03
4

Zero can be false because most CPUs have a ZERO flag that can be used to branch. It saves a compare operation.

Let's see why.

Some pseudocode, as the audience probably doesn't read assembly:

C source: a simple loop that calls wibble 10 times

for (int foo =10; foo>0; foo-- ) /* down count loop is shorter */
{  
   wibble();
}

some pretend assembly for that

0x1000 ld a 0x0a      'foo=10
0x1002 call 0x1234    'call wibble()
0x1005 dec a          'foo--
0x1006 jrnz -0x04      'jump back to 0x1002 if not zero
0x1008  

C source: another simple loop that calls wibble 10 times

for (int foo =0; foo<10; foo++ ) /* up count loop is longer  */
{  
   wibble();
}

some pretend assembly for this case

0x1000 ld a 0x00      'foo=0
0x1002 call 0x1234    'call wibble()
0x1005 inc a          'foo++
0x1006 cmp 0x0a       'compare foo to 10 ( like a subtract but we throw the result away)
0x1008 jrs -0x06      'jump back to 0x1002 if compare was negative (foo < 10)
0x100a  

some more C source

int foo=10;
if ( foo ) wibble()

and the assembly

0x1000 ld a 0x0a      'foo=10
0x1002 jz 0x3
0x1004 call 0x1234
0x1007  

see how short that is ?

some more C source

int foo=10;
if ( foo==0 ) wibble()

and the assembly (let's assume a marginally smart compiler that can replace ==0 with no compare)

0x1000 ld a 0x0a
0x1002 jnz 0x3
0x1004 call 0x1234
0x1007  

Now let's try a convention of true=1

some more C source

#define TRUE 1
int foo=TRUE;
if ( foo==TRUE ) wibble()

and the assembly

0x1000 ld a 0x1
0x1002 cmp a 0x01
0x1004 jnz 0x3
0x1006 call 0x1234
0x1009 

see how the true=1 convention costs an extra compare instruction?

Really early CPUs had small sets of flags attached to the accumulator.

To check whether a > b or a == b generally takes a compare instruction.

  • Unless b is ZERO - in which case the ZERO flag is set, implemented as a simple logical NOR of all bits in the accumulator.
  • Or NEGATIVE - in which case you just use the "sign bit", i.e. the most significant bit of the accumulator, if you are using two's complement arithmetic. (Mostly we do.)

Let's restate this. On some older CPUs you did not have to use a compare instruction to test the accumulator for equal to zero, or for less than zero.

Now do you see why zero might be false?

Please note this is pseudocode and no real instruction set looks quite like this. If you know assembly, you know I'm simplifying things a lot here. If you know anything about compiler design, you didn't need to read this answer. Anyone who knows anything about loop unrolling or branch prediction, the advanced class is down the hall in room 203.

Deduplicator
  • 8,591
  • 5
  • 31
  • 50
Tim Williscroft
  • 3,563
  • 1
  • 21
  • 26
  • 2
    Your point is not well made here because for one thing `if (foo)` and `if (foo != 0)` should generate the same code, and secondly, you're showing that the assembly language you're using in fact has explicit boolean operands and tests for them. For instance `jz` means `jump if zero`. In other words `if (a == 0) goto target;`. And the quantity is not even being tested directly; the condition is converted its a boolean flag which is stored in a special machine word. It's actually more like `cpu.flags.zero = (a == 0); if (cpu.flags.zero) goto target;` – Kaz May 16 '13 at 17:02
  • No Kaz, the older CPU's did not work like that. The jz/jnz can be performed without doing a comparison instruction. Which was kind of the point of my whole post really. – Tim Williscroft May 19 '13 at 02:04
  • 2
    I didn't write anything about a comparison instruction. – Kaz May 19 '13 at 02:35
  • Can you cite a processor that has a `jz` instruction but no `jnz`? (or any other asymmetric set of conditional instructions) – Toby Speight Nov 09 '16 at 16:58
4

Strangely, zero is not always false.

In particular, the Unix and Posix convention is to define EXIT_SUCCESS as 0 (and EXIT_FAILURE as 1). Actually it is even a standard C convention!

So for Posix shells and exit(2) syscalls, 0 means "successful" which intuitively is more true than false.

In particular, the shell's if wants a process to return EXIT_SUCCESS (that is 0) in order to follow its "then" branch!

In Scheme (but not in Common Lisp or in MELT), 0 and nil (i.e. () in Scheme) are true, since the only false value is #f.

I agree, I am nitpicking!

Basile Starynkevitch
  • 32,434
  • 6
  • 84
  • 125
3

C is used for low-level programming close to hardware, an area in which you sometimes need to shift between bitwise and logical operations, on the same data. Being required to convert a numeric expression to boolean just to perform a test would clutter up the code.

You can write things like:

if (modemctrl & MCTRL_CD) {
   /* carrier detect is on */
}

rather than

if ((modemctrl & MCTRL_CD) != 0) {
    /* carrier detect is on */
}

In one isolated example it's not so bad, but having to do that will get irksome.

Likewise, converse operations. It's useful for the result of a boolean operation, like a comparison, to just produce a 0 or 1: Suppose we want to set the third bit of some word based on whether modemctrl has the carrier detect bit:

flags |= ((modemctrl & MCTRL_CD) != 0) << 2;

Here we have to have the != 0, to reduce the result of the bitwise & expression to 0 or 1, but because the result is just an integer, we are spared from having to add some annoying cast to further convert boolean to integer.

Even though modern C now has a bool type, it still preserves the validity of code like this, both because it's a good thing, and because of the massive breakage with backward compatibility that would be caused otherwise.

Another example where C is slick: testing two boolean conditions as a four-way switch:

switch (foo << 1 | bar) {  /* foo and bar booleans are 0 or 1 */
case 0: /* !foo && !bar */
   break;
case 1: /* !foo && bar */
   break;
case 2: /* foo && !bar */
   break;
case 3: /* foo && bar */
   break;
}

You could not take this away from the C programmer without a fight!

Lastly, C sometimes serves as a kind of high level assembly language. In assembly languages, we also do not have boolean types. A boolean value is just a bit or a zero versus nonzero value in a memory location or register. An integer zero, boolean zero and the address zero are all tested the same way in assembly language instruction sets (and perhaps even floating point zero). Resemblance between C and assembly language is useful, for instance when C is used as the target language for compiling another language (even one which has strongly typed booleans!)

Deduplicator
  • 8,591
  • 5
  • 31
  • 50
Kaz
  • 3,572
  • 1
  • 19
  • 30
2

I think the real answer, which others have alluded to, is simple, pragmatic, and very old:

Because that's how you do it in assembly language.

Testing for 0 vs non-0 is done for you by almost all computer hardware (either by way of direct flag bits that track accumulator status or by condition-code registers that remember the result of a previous operation), and when combined with branches conditional on these bits results in smaller/faster programs. (Critically important back when memory and disks were small, and clock rates were low.) Counting and convergence loops both need this kind of decision for termination, and programs are usually filled with these. They need to be fast and efficient to be effective against the competition, so that's how general-purpose CPUs were built. By everybody.

Languages designed for systems programming tend to be lower level, less abstract (or capable of that, anyway), and have constructs that map fairly directly to their underlying assembly-language implementations. This encourages adoption of the language by those who might just as well have chosen to write in assembly language, but who are enticed by the numerous advantages of a (slightly?) higher-level language:

  • Code portability;
  • Storage allocation managed for you;
  • Register allocation and lifetime, if applicable, managed for you;
  • Branching and labels coded and managed for you;
  • Slightly higher abstraction, but still recognizable as 'the machine'.

Code written in these languages (BCPL, B, and early C, for example) is very 'friendly' for experienced assembly-language programmers. They're comfortable with the code that they know will be generated for them, and thankful that they didn't have to do it themselves. (And debug the inevitable mistakes they'd have made doing it.) Early adopters of said languages would have been poring over the code generated by the prospective compiler, until they became more comfortable just trusting it to do what they would otherwise have had to do the hard way. They would never have adopted the language if it did too many stupid things they didn't expect, during the language's probationary period with them. Basic decision making would have been high on their list of things that needed to be 'done right' if they were going to adopt the language.

All of BCPL, B, and C use the:

if (non-zero) then-its-True

construct, however it's spelled. This results in a single conditional-branch instruction after the evaluation of the condition expression; you really can't do it in less, so it would have had programmer approval. It's an unlikely target machine that would not have a BZ (or equivalent) instruction. The next crop of programmers, those for whom assembly-language was not the rock upon which all else was built, were just using the languages that had effectively been chosen for them by their predecessors, and perhaps did not understand and appreciate all the reasons their languages had the features they did.

I will submit that the rest of the languages (probably developed by these programmers) that treat (only) zero as false simply took it from C.

jimc
  • 119
  • 4
0

A boolean or truth value has only 2 values: true and false.

These should not be represented as integers, but as bits (0 and 1).

Saying that any integer besides 0 or 1 is "not false" is a confusing statement. Truth tables deal with truth values, not integers.

From a truth-value perspective, -1 or 2 would break all truth tables and any boolean logic associated with them.

  • 0 AND -1 == ?!
  • 0 OR 2 == ?!

Most languages usually have a boolean type which, when cast to a number type such as integer, reveals false to be cast as an integer value of 0.

Jon Raynor
  • 10,905
  • 29
  • 47
  • 1
    0 AND -1 == whatever boolean value you cast them to. That's what my question is about, why casting them to `TRUE` or `FALSE`. Never did I say - maybe I did, but it was not intended - integers were true or false, I asked about why they do evaluate to whichever when casted to boolean. – Morwenn May 15 '13 at 21:25
-8

Ultimately, you are talking about breaking the core language because some APIs are crappy. Crappy APIs are not new, and you can't fix them by breaking the language. It is a mathematical fact that 0 is false and 1 is true, and any language which does not respect this is fundamentally broken. The three-way comparison is niche and has no business having its result implicitly convert to bool since it returns three possible results. The old C APIs simply have terrible error handling, and are also hamstrung because C does not have the necessary language features to not have terrible interfaces.

Note that none of this applies to languages which have no implicit integer-to-boolean conversion.

DeadMG
  • 36,794
  • 8
  • 70
  • 139
  • 11
    "It is a mathematical fact that 0 is false and 1 is true" Erm. – R. Martinho Fernandes May 15 '13 at 19:55
  • 11
    Can you cite a reference for your "mathematical fact that 0 is false and 1 is true"? Your answer sounds dangerously like a rant. – Dan Pichelman May 15 '13 at 19:57
  • 14
    It's not a mathematical fact, but it's been a mathematical convention since the 19th century. – Charles E. Grant May 15 '13 at 20:06
  • 1
    Boolean algebra is represented by a finite field in which 0 and 1 are the identity elements for operations that resemble additon and multiplication. Those operations are, respectively, OR and AND. In fact, boolean algebra is written much like normal algebra where juxtaposition denotes AND, and the `+` symbol denotes OR. So for instance `abc + a'b'c` means `(a and b and c) or (a and (not b) and (not c))`. – Kaz May 16 '13 at 01:04