0

If you want the first element of a list or array, you reference it as 0 in many languages (like C or Clojure). Are there some really good reasons why programming languages were designed this way?

In the old days, in assembly languages, it made perfect sense because all possible values needed to be used.

But what are the reasons to keep it this way nowadays? There is a small advantage for modular arithmetic and ranges (see the Wikipedia article), but not much more.
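To make the modular arithmetic point concrete, here is a minimal sketch in C. The function names `slot_zero_based` and `slot_one_based` are hypothetical, chosen only for illustration. It maps a running counter onto a fixed number of slots under each convention; the 1-based version needs an extra shift before and after taking the remainder.

```c
#include <assert.h>
#include <stddef.h>

/* Counter k = 0, 1, 2, ... mapped onto slots 0 .. size-1. */
size_t slot_zero_based(size_t k, size_t size) {
    return k % size;
}

/* Counter k = 1, 2, 3, ... mapped onto slots 1 .. size.
   The remainder is 0-based, so shift down before and up after. */
size_t slot_one_based(size_t k, size_t size) {
    return (k - 1) % size + 1;
}
```

With five slots, the eighth item lands in slot 2 when counting from 0 (`slot_zero_based(7, 5)`) and in slot 3 when counting from 1 (`slot_one_based(8, 5)`), which is the same physical slot under both conventions.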

On the disadvantages side: it causes confusion because in human language the first item is associated with 1 ("1st", and not only in English). It causes confusion even in XPath (W3Schools: "Note: In IE 5,6,7,8,9 first node is[0], but according to W3C, it is [1]."). There are also problems when moving between languages that use 1-based and 0-based systems.

I want to know what the good reasons are to use zero-based numbering, and why even creators of new languages (like Clojure) choose this way.

boucekv
  • 6
    Consistency would be a *major* factor in this. Jokes about devs starting their count from 0 rather than 1 have existed for decades. Another would be that *it's actually more natural*, since the location of an element in memory is the location of the list/array + n elements, where n is the element position. There would be quite a bit of confusion there... – Ordous Jul 18 '14 at 10:39
  • 1
    Already answered [here](http://stackoverflow.com/questions/7320686/why-does-the-indexing-start-with-zero-in-c), I think. Not sure if it is a duplicate, but at least a long discussion of the reasoning behind the 0 index. Also [here](http://programmers.stackexchange.com/questions/110804/why-are-zero-based-arrays-the-norm), the same thing. – thorsten müller Jul 18 '14 at 10:41
  • "It makes confusion because it human language the first is connected with 1 (1st and it is not in english only)". It did for me when I started, but I rarely make such mistakes now. – Abhinav Gauniyal Jul 18 '14 at 10:53
  • 3
    Note that in much of the world, U.S. excepted, building floor numbering starts with 0 and it is quite natural for humans in those countries. – zaph Jul 18 '14 at 11:45
  • @Zaph - not really. In those languages you have two different words for ground floor and non-ground floor. So the first floor would be "ground", and then the count of non-ground floor would regularly start from 1. – Davor Ždralo Jul 18 '14 at 12:50
  • In an elevator the ground floor is denoted by a button with the numeral 0. Then both + and - numerals for floors above and below. So the first basement is -1 and this makes sense. In the U.S., the ground floor is denoted in many different ways (L, M, G, etc.), sometimes making it difficult to know which button is the ground floor. – zaph Jul 18 '14 at 12:56

1 Answer

3

Yes, there is a reason:

If you have 1-based addressing, the starting (first) element of the array still lies at offset zero from the array's address. That is a contradiction, which disappears if you use 0-based addressing. In addition, you need one operation less when computing the address of an element. So C's array address computation is more efficient than that of the much older, 1-based FORTRAN IV.
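As a rough illustration of the "one operation less" point, here is a minimal sketch in C of the byte offset of element `i` under each convention. The function names `offset_zero_based` and `offset_one_based` are hypothetical, and the sketch assumes the index is only known at run time; with a constant index, or when the compiler can fold the subtraction into the base address, the difference may disappear.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of element i when the first element has index 0. */
size_t offset_zero_based(size_t i, size_t elem_size) {
    return i * elem_size;
}

/* Byte offset of element i when the first element has index 1:
   one extra subtraction before scaling by the element size. */
size_t offset_one_based(size_t i, size_t elem_size) {
    return (i - 1) * elem_size;
}
```

In both conventions the first element ends up at offset 0: `offset_zero_based(0, 4)` and `offset_one_based(1, 4)` both give 0.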

Of course, nowadays the reasons should be based on human convenience. The counting sequence, the natural numbers, starts at 1, and every one of us started counting from 1 in childhood.

But people who create new languages also think about two more reasons: the grammatical simplicity of the language and the habits of its probable future users. If you want to attract C users, you will introduce 0-based addressing.

And, of course, there is one more, purely subjective reason: language authors can simply like or dislike one of these schemes.

Gangnus
  • 1
    -1. Arrays could've been defined such that `a == &a[1]` is true and `a[0]` or `*(a + 0)` is undefined behavior; the "zero address" is a non-issue. [Dijkstra's explanation](http://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html) continues to be the most compelling: for an interval with `N` integers, starting at `1` makes the upper end of your interval `N + 1` while starting at `0` makes it `N`. – Doval Jul 18 '14 at 12:06
  • As for intervals, yes, but for segments, it would end at the Nth element with 1-based addressing. Why are intervals so important? As for your definitions, they are irrelevant. With 1-based addressing you'll still have to subtract size(element) in machine code. – Gangnus Jul 18 '14 at 12:46
  • Think of setting up a C-style for loop: if you include the upper bound in the interval, e.g. `for (int i = min; i <= max; i++)` then to define an empty interval `max` must be `min - 1`. This poses a problem when `min = 0`: the interval is `[0, -1]` which means you can't use an unsigned index. Additionally, to figure out the number of iterations the formula becomes `upper bound - lower bound + 1` instead of just `upper bound - lower bound`. So you want to exclude the upper bound, which brings us back to the upper bound being `N + 1` if you start at 1 as opposed to `N` if you start at 0. – Doval Jul 18 '14 at 13:00
  • In short, defining an interval for N indices as `[0, N)` has the best properties: the upper bound is the number of indices/iterations, and you don't have to go negative when you want to iterate 0 times. Less mental gymnastics or risk of an off-by-one error. – Doval Jul 18 '14 at 13:03
  • @Doval You are really funny. 1. You are proving the qualities of C thinking based on C thinking. Base it on the machine code point of view. 2. What are you arguing against? Where is your point? I don't see it. – Gangnus Jul 24 '14 at 08:43
  • I'm not basing anything on C thinking. I'm basing it on Dijkstra's arguments for why `[0, N)` is the best representation for an interval of `N` integers. It then follows that the indexing operation for *any* data structure should start at 0. You can take up your objections with Dijkstra, but given that [arrogance in computer science is measured in nano-Dijkstras](http://en.wikiquote.org/wiki/Talk:Edsger_W._Dijkstra), I suspect your debate with his ghost won't go well. Machine code is irrelevant here because no one thinks in machine code and the compiler can trivially do the offsets. – Doval Jul 24 '14 at 12:21
  • @Doval But the offset will be done. That is the difference. And again, where are the points you are arguing for and against? I can't notice them. – Gangnus Jul 28 '14 at 13:36
  • Why do you care if the compiler does some math at compile time? There would be no performance hit from the compiler subtracting one from the array index when translating from source code to machine code. You'll still get an offset of 0 regardless of which number you decide to use as the first index for arrays. – Doval Jul 28 '14 at 21:41
  • @Doval 2. What compile time? Many languages are interpreted. 1. What are you arguing for and against? I am asking this for the third time. You are merely continuing an absolutely senseless discussion, jumping from one pointless argument to another. – Gangnus Jul 29 '14 at 10:51
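
The half-open interval argument from the comments above can be sketched in C. This is only an illustration; the function names `count_half_open` and `sum_range` are hypothetical. With `[lower, upper)`, the number of iterations is `upper - lower`, and an empty range is written `lower == upper`, so an unsigned index never has to go below zero.

```c
#include <assert.h>
#include <stddef.h>

/* Number of indices in the half-open interval [lower, upper). */
size_t count_half_open(size_t lower, size_t upper) {
    return upper - lower;
}

/* Sum of a[lower] .. a[upper - 1]; when lower == upper the loop
   body never runs and the result is 0. */
long sum_range(const int *a, size_t lower, size_t upper) {
    long total = 0;
    for (size_t i = lower; i < upper; i++) {
        total += a[i];
    }
    return total;
}
```

For an array of `N` elements indexed from 0, the whole array is the range `[0, N)`, so the upper bound is also the element count.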