8

Java has the following syntax for different bases:

int x1 = 0b0101; //binary
int x2 = 06;     //octal
int x3 = 0xff;   //hexadecimal

Is there a reason why octal uses a bare leading 0 rather than a two-character prefix such as 0o, in line with 0b for binary and 0x for hexadecimal? Is there any documentation on why they went this route?

Robert Harvey
Danny
  • 5
    Consistency with older languages. Same reason perl, python, javascript, C++, clojure, etc... do it too. –  Dec 18 '13 at 20:52
  • 1
    @MichaelT: make that the answer, I think. Maybe add a link to K&R and say "C, there". – Móż Dec 18 '13 at 20:58
  • 4
    @Ӎσᶎ tad busy irl at the moment to do a good answer, and it likely goes back much further than C (that C did it for the same reason too). StackOverflow's got a reasonable answer http://stackoverflow.com/questions/11483216/why-are-leading-zeroes-used-to-represent-octal-numbers –  Dec 18 '13 at 21:01
  • 2
    @MichaelT By the way, Python ditched that syntax in favor of OP's proposal (`0o12345670`) as one of the backwards-incompatible changes of 3.0. –  Dec 18 '13 at 21:09
  • @MichaelT thanks. I tried searching on stackoverflow for an answer to this too but couldn't find anything. Seems you are a better searcher. – Danny Dec 18 '13 at 21:28
  • 3
    @Danny the key to searching well is knowing what you are looking for before searching. This isn't always practical if you don't know where to start from. The key wording is 'leading zero' - searching google for 'leading zero octal' gives that SO question. Another search I did (didn't find the answer I was looking for) was 'octal literal algol' which took me to http://rosettacode.org/wiki/Literals/Integer which is a most interesting read. –  Dec 18 '13 at 21:39

1 Answer

9

Java's syntax was designed to be close to that of C; see, e.g., page 20 of James Gosling's keynote "How The JVM Spec Came To Be" from the JVM Languages Summit 2008 (20_Gosling_keynote.pdf):

  • C syntax to make developers comfortable

In turn, this is how octal constants are defined in the C language:

If an integer constant begins with 0x or 0X, it is hexadecimal. If it begins with the digit 0, it is octal. Otherwise, it is assumed to be decimal...

Given the above, it is natural that the Java language designers decided to use the same syntax as C.
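
To make the inherited rule concrete, here is a minimal Java sketch. The class and method names are made up for illustration. It shows that a literal with a leading 0 is read as octal, and that `Integer.parseInt` and `Integer.decode` treat the same text differently.

```java
public class OctalLiterals {
    // A leading 0 makes the literal octal: 010 is 1*8 + 0 = 8, not ten.
    static int leadingZeroLiteral() {
        return 010;
    }

    // Integer.parseInt(String) always reads base 10, so the leading zero is ignored.
    static int parsedAsDecimal() {
        return Integer.parseInt("010");
    }

    // Integer.decode follows the literal convention: a leading 0 means octal.
    static int decodedAsOctal() {
        return Integer.decode("010");
    }

    public static void main(String[] args) {
        System.out.println(leadingZeroLiteral()); // 8
        System.out.println(parsedAsDecimal());    // 10
        System.out.println(decodedAsOctal());     // 8
        // The three literals from the question: binary, octal, hexadecimal.
        System.out.println(0b0101 + " " + 06 + " " + 0xff); // 5 6 255
    }
}
```

The practical pitfall is zero-padding a decimal constant for alignment: doing so silently changes its value.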


As pointed out in this comment, Stack Overflow has a reasonable answer here:

All modern languages import this convention from C, which imported it from B, which imported it from BCPL.

Except BCPL used #1234 for octal and #x1234 for hexadecimal. B departed from this convention because # was a unary operator in B (integer to floating point conversion), so #1234 could not be used, and # as a base indicator was replaced with 0.

The designers of B tried to make the syntax very compact. I guess this is the reason they did not use a two-character prefix.

gnat
  • 1
    You can even go back further. Since C was derived from B, you will find an explanation in [Thompson's B Manual](http://cm.bell-labs.com/cm/cs/who/dmr/kbman.html) section 4.1 Primary Expressions: Quote: "An octal constant is the same as a decimal constant except that it begins with a zero. It is then interpreted in base 8. Note that 09 (base 8) is legal and equal to 011. A character constant is represented by ' followed by one or two characters (possibly escaped) followed by another '. It has an rvalue equal to the value of the characters packed and right adjusted. " – Jérôme Dec 18 '13 at 22:19
  • @Jérôme yup I didn't go deeper intentionally after getting original problem answered. As for the history of how C got that octal syntax, that would be a **[different question](http://meta.stackexchange.com/questions/43478/exit-strategies-for-chameleon-questions "at Stack Exchange sites, “chameleon questions” are not quite welcome")**, that probably would better be asked and answered separately (and BTW it already [was](http://programmers.stackexchange.com/questions/221797/reasoning-behind-the-syntax-of-octal-notation-in-java/221800#comment440989_221797 "as pointed in comments")). – gnat Dec 18 '13 at 22:45
  • @Jérôme "Note that 09 (base 8) is legal": that makes no sense to me. Why would `09` be legal if the leading `0` is meant to mark an octal constant? – Danny Dec 20 '13 at 13:36
  • @Danny offering pragmatic convenience and small syntactic perks for programmers like legalizing `09` is probably what made C language win over pure, strict, logical and oh-so-friggin' scientific crap like Pascal and other Modulas – gnat Dec 20 '13 at 13:41
  • No, C did not import octal from B. C was developed on the PDP-11, and the natural form of binary values on that machine was octal. Assembly language was full of it. Octal shows up again in Unix permissions. Nobody even considered hex in the DEC world at that time. – david.pfx Feb 19 '14 at 09:53
  • I wonder what fraction of C programmers are "comfortable" with leading-zero notation for octal, and what fraction would like to see it deprecated in favor of `0q123` style [require all integer numeric literals which start with a leading zero to include a base specifier except when compiling in "compatibility" mode]? Gosling's decision to include that horrible notation in Java while failing to include useful things like an unsigned byte type (unsigned values only cause trouble if they're as big as `int`) is boggling. – supercat Mar 11 '14 at 23:36
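
The B manual passage quoted in the comments above ("09 (base 8) is legal and equal to 011") can be read as plain positional arithmetic with weight 8 that does not reject the digits 8 and 9. The sketch below is one plausible reading of that sentence, not Thompson's actual implementation; `BStyleOctal` and `parse` are hypothetical names. Java itself rejects the literal `09` at compile time.

```java
public class BStyleOctal {
    // Hypothetical helper: accumulate digits with weight 8, accepting 0-9,
    // which is how the quoted B manual sentence can be interpreted.
    static int parse(String digits) {
        int value = 0;
        for (char c : digits.toCharArray()) {
            if (c < '0' || c > '9') {
                throw new NumberFormatException("not a digit string: " + digits);
            }
            value = value * 8 + (c - '0');
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parse("09"));  // 0*8 + 9 = 9
        System.out.println(parse("011")); // 1*8 + 1 = 9
    }
}
```

Under this reading, 09 and 011 both evaluate to nine, which matches the manual's claim that they are equal.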