100

The definition of "C-style language" can practically be simplified down to "uses curly braces ({})." Why do we use that particular character (and why not something more reasonable, like [], which doesn't require the Shift key, at least on US keyboards)?

Is there any actual benefit to programmer productivity that comes from these braces, or should new language designers look for alternatives (as, e.g., the designers of Python did)?

Wikipedia tells us that C uses said braces, but not why. A statement in the Wikipedia article "List of C-based programming languages" suggests that this syntax element is somewhat special:

Broadly speaking, C-family languages are those that use C-like block syntax (including curly braces to begin and end the block)...

gnat
SomeKittens
  • 36
    The only person who can answer this is Dennis Ritchie and he's dead. A reasonable guess is that [] were already taken for arrays. – Dirk Holsopple Feb 26 '13 at 15:02
  • 2
    @DirkHolsopple So he left no reasoning behind? Drat. Also: two downvotes on something I'm genuinely curious about? Thanks guys.... – SomeKittens Feb 26 '13 at 15:03
  • 1
    Please continue the discussion about this question [in this Meta question](http://meta.programmers.stackexchange.com/questions/5628/content-dispute-for-why-do-programming-languages-especially-c-use-curly-brace). – Thomas Owens Feb 26 '13 at 19:46
  • 2
    I have unlocked this post. Please keep any comments about the question and discussion about appropriateness on [the Meta question](http://meta.programmers.stackexchange.com/questions/5628/content-dispute-for-why-do-programming-languages-especially-c-use-curly-brace). – Thomas Owens Feb 28 '13 at 13:58
  • 5
    It probably also has something to do with the fact that curly braces are used in set notation in mathematics, making them somewhat awkward to use for array element access, rather than things like declaring "set"-ish things like structs, arrays, etc. Even modern languages like Python use curly braces to declare sets and dictionaries. The question then, is why did C also use curly braces to declare scope? Probably because the designers just didn't like the known alternatives, like BEGIN/END, and overloading array access notation ([]) was deemed less aesthetically sound than set notation. – Charles Salvia Sep 25 '13 at 07:03

3 Answers

103

Two of the major influences on C were the Algol family of languages (Algol 60 and Algol 68) and BCPL (from which C takes its name).

BCPL was the first curly bracket programming language, and the curly brackets survived the syntactical changes and have become a common means of denoting program source code statements. In practice, on limited keyboards of the day, source programs often used the sequences $( and $) in place of the symbols { and }. The single-line '//' comments of BCPL, which were not taken up in C, reappeared in C++, and later in C99.

From http://www.princeton.edu/~achaney/tmve/wiki100k/docs/BCPL.html

BCPL introduced and implemented several innovations which became quite common elements in the design of later languages. Thus, it was the first curly bracket programming language (one using { } as block delimiters), and it was the first language to use // to mark inline comments.

From http://progopedia.com/language/bcpl/

Within BCPL, one often, but not always, sees curly braces. This was due to a limitation of the keyboards of the time: the sequences $( and $) were lexically equivalent to { and }. C kept the idea with its own digraphs and trigraphs (though with a different set of replacements for the curly braces: ??< and ??>).
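As a small illustration of alternative spellings for braces and brackets in C, here is a sketch written with digraphs (<% %> for { } and <: :> for [ ]). The function name sum_first_three is made up for this example. Trigraphs are left out on purpose: they were removed in C23 and some compilers only accept them with an extra option.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, written with digraphs instead of braces and brackets:
   <% is {,  %> is },  <: is [,  :> is ].
   Adds up the first three elements of an int array. */
int sum_first_three(const int *values)
<%
    assert(values != NULL);
    int total = 0;
    for (int i = 0; i < 3; i++)
    <%
        total += values<:i:>;
    %>
    return total;
%>
```

Calling sum_first_three on an array holding 1, 2, 3 returns 6, exactly as it would if the function were spelled with { } and [ ]. The compiler treats the two spellings as the same tokens.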

The use of curly braces was further refined in B (which preceded C).

From Users' Reference to B by Ken Thompson:

/* The following function will print a non-negative number, n, to
  the base b, where 2<=b<=10,  This routine uses the fact that
  in the ASCII character set, the digits 0 to 9 have sequential
  code values.  */

printn(n,b) {
        extern putchar;
        auto a;

        if(a=n/b) /* assignment, not test for equality */
                printn(a, b); /* recursive */
        putchar(n%b + '0');
}

There are indications that curly braces were used as shorthand for begin and end within Algol.

I remember that you also included them in the 256-character card code that you published in CACM, because I found it interesting that you proposed that they could be used in place of the Algol 'begin' and 'end' keywords, which is exactly how they were later used in the C language.

From http://www.bobbemer.com/BRACES.HTM


The use of square brackets (suggested as a replacement in the question) goes back even further. As mentioned, the Algol family influenced C. Within Algol 60 and Algol 68 (for reference, BCPL dates from 1966 and C from 1972), the square bracket was used to designate an index into an array or matrix.

BEGIN
  FILE F(KIND=REMOTE);
  EBCDIC ARRAY E[0:11];
  REPLACE E BY "HELLO WORLD!";
  WRITE(F, *, E);
END.

As programmers were already familiar with square brackets for arrays in Algol and BCPL, and curly braces for blocks in BCPL, there was little need or desire to change this when making another language.


The updated question adds a point about the productivity of curly brace usage and mentions Python. Some other resources look into this, though the answer boils down to "It's anecdotal, and what you are used to is what you are most productive with." Because programming skill and familiarity with different languages vary so widely, these effects are difficult to account for.

See also, on Stack Overflow: "Are there statistical studies that indicates that Python is “more productive”?"

Much of any gain depends on the IDE (or lack of one) in use. In vi-based editors, putting the cursor on one character of a matching open/close pair and pressing % moves the cursor to the other matching character. This was very efficient with C-based languages back in the old days; it matters less now.
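To make the % jump concrete, here is a minimal sketch of brace matching by depth counting. It is not how vi implements it; the function name find_matching_brace and its signature are assumptions for illustration. It only handles { and }, and it does not skip braces inside string literals or comments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the index of the brace matching src[pos],
   or -1 if src[pos] is not a brace or has no partner.
   Simplification: braces inside string literals, character constants,
   and comments are counted like any others. */
long find_matching_brace(const char *src, size_t len, size_t pos)
{
    assert(src != NULL);
    if (pos >= len)
        return -1;

    int step;                      /* +1 scans forward, -1 scans backward */
    if (src[pos] == '{')
        step = 1;
    else if (src[pos] == '}')
        step = -1;
    else
        return -1;

    int depth = 0;                 /* open pairs seen in the scan direction */
    for (long i = (long)pos; i >= 0 && (size_t)i < len; i += step) {
        if (src[i] == '{')
            depth += step;
        else if (src[i] == '}')
            depth -= step;
        if (depth == 0)            /* back at the starting nesting level */
            return i;
    }
    return -1;                     /* unbalanced input */
}
```

With the text a{b{c}d}e, starting on the brace at index 1 yields index 7, and starting at index 7 yields index 1. A single delimiter character per side is what keeps this scan so simple; begin/end keywords would need tokenizing first.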

A better comparison would be between {} and begin/end, which were the options of the day (horizontal space was precious). Many Wirth languages were based on a begin/end style: Algol (mentioned above), Pascal (which many are familiar with), and the Modula family.

I have difficulty finding any studies that isolate this specific language feature. The best I can do is point out that curly brace languages are much more popular than begin/end languages and that the brace is a common construct. As mentioned in the Bob Bemer link above, the curly brace was proposed as a shorthand to make programming easier.

From Why Pascal is Not My Favorite Programming Language

C and Ratfor programmers find 'begin' and 'end' bulky compared to { and }.

Which is about all that can be said: it's familiarity and preference.

  • 15
    Now everybody here is learning BCPL instead of working :) – Denys Séguret Feb 26 '13 at 15:36
  • The trigraphs (introduced in the 1989 ISO C standard) for `{` and `}` are `??<` and `??>`. The digraphs (introduced by the 1995 amendment) are `<%` and `%>`. Trigraphs are expanded in all contexts, in a very early translation phase. Digraphs are tokens, and are not expanded in string literals, character constants, or comments. – Keith Thompson Feb 26 '13 at 16:45
  • There existed something prior to 1989 for this in C (I'd have to dig out my first edition book to get a date on that). Not all EBCDIC code pages had a curly brace (or square brackets) in them, and there were provisions for this in the earliest C compilers. –  Feb 26 '13 at 18:31
  • @NevilleDNZ BCPL used curly braces in 1966. Where Algol68 got its notion from would be something to explore - but BCPL didn't get it from Algol68. The ternary operator is something I've been interested in and have tracked it back to CPL (1963) (the predecessor of BCPL) which borrowed the notion from Lisp (1958). –  Feb 27 '13 at 00:19
  • 1968: Algol68 permits _round_ brackets ( ~ ) as a shorthand of **begin** ~ **end** _bold_ symbol blocks. These are called _brief_ symbols, c.f. [wp:Algol68 Bold symbols](http://en.wikipedia.org/wiki/ALGOL_68#Bold_symbols_and_reserved_words), this allows blocks of code to be treated just like _expressions_. A68 also has _brief_ shorthands like C's [?: ternary operator](http://en.wikipedia.org/wiki/%3F:#ALGOL_68) eg `x:=(c|s1|s2)` instead of C's `x=c?s1:s2`. Similarly this applies to **if** & **case** statements. ¢ BTW: A68 is from where the shell got its **esac** & **fi** ¢ – NevilleDNZ Feb 27 '13 at 00:22
  • The "Are there statistical studies" question you linked has been closed and deleted. Most of the answers were "it's all anecdotal", but there was a paper linked which is a detailed statistical analysis of language productivity metrics based on Rosetta Code snippets: https://arxiv.org/pdf/1409.0252.pdf – Joey Adams Sep 06 '17 at 18:25
25

Square brackets [] have been easier to type ever since the IBM 2741 terminal, which was "widely used on Multics", an OS whose development team included Dennis Ritchie, one of the creators of the C language.

http://upload.wikimedia.org/wikipedia/commons/thumb/9/9f/APL-keybd2.svg/600px-APL-keybd2.svg.png

Note the absence of curly braces in the IBM 2741 layout!

In C, square brackets are "taken", as they are used for arrays and pointers. If the language designers expected arrays and pointers to be more important or more frequently used than code blocks (which sounds like a reasonable assumption on their part; more on the historical context of coding style below), that would mean curly braces went to the "less important" syntax.

The importance of arrays is pretty apparent in the article The Development of the C Language by Ritchie. There's even an explicitly stated assumption of the "prevalence of pointers in C programs".

...new language retained a coherent and workable (if unusual) explanation of the semantics of arrays... Two ideas are most characteristic of C among languages of its class: the relationship between arrays and pointers... The other characteristic feature of C, its treatment of arrays... has real virtues. Although the relationship between pointers and arrays is unusual, it can be learned. Moreover, the language shows considerable power to describe important concepts, for example, vectors whose length varies at run time, with only a few basic rules and conventions...


For further understanding of the historical context and coding style of the time when the C language was created, one needs to take into account that the "origin of C is closely tied to the development of the Unix" and, specifically, that porting the OS to a PDP-11 "led to the development of an early version of C" (quotes source). According to Wikipedia, "in 1972, Unix was rewritten in the C programming language".

The source code of various old versions of Unix is available online, e.g. at The Unix Tree site. Of the versions presented there, the most relevant seems to be Second Edition Unix, dated 1972-06:

The second edition of Unix was developed for the PDP-11 at Bell Labs by Ken Thompson, Dennis Ritchie and others. It extended the First Edition with more system calls and more commands. This edition also saw the beginning of the C language, which was used to write some of the commands...

You can browse and study the C source code from the Second Edition Unix (V2) page to get an idea of the typical coding style of the time.

A prominent example supporting the idea that back then it was rather important for a programmer to be able to type square brackets with ease can be found in the V2/c/ncc.c source code:

/* C command */

main(argc, argv)
char argv[][]; {
    extern callsys, printf, unlink, link, nodup;
    extern getsuf, setsuf, copy;
    extern tsp;
    extern tmp0, tmp1, tmp2, tmp3;
    char tmp0[], tmp1[], tmp2[], tmp3[];
    char glotch[100][], clist[50][], llist[50][], ts[500];
    char tsp[], av[50][], t[];
    auto nc, nl, cflag, i, j, c;

    tmp0 = tmp1 = tmp2 = tmp3 = "//";
    tsp = ts;
    i = nc = nl = cflag = 0;
    while(++i < argc) {
        if(*argv[i] == '-' & argv[i][1]=='c')
            cflag++;
        else {
            t = copy(argv[i]);
            if((c=getsuf(t))=='c') {
                clist[nc++] = t;
                llist[nl++] = setsuf(copy(t));
            } else {
            if (nodup(llist, t))
                llist[nl++] = t;
            }
        }
    }
    if(nc==0)
        goto nocom;
    tmp0 = copy("/tmp/ctm0a");
    while((c=open(tmp0, 0))>=0) {
        close(c);
        tmp0[9]++;
    }
    while((creat(tmp0, 012))<0)
        tmp0[9]++;
    intr(delfil);
    (tmp1 = copy(tmp0))[8] = '1';
    (tmp2 = copy(tmp0))[8] = '2';
    (tmp3 = copy(tmp0))[8] = '3';
    i = 0;
    while(i<nc) {
        if (nc>1)
            printf("%s:\n", clist[i]);
        av[0] = "c0";
        av[1] = clist[i];
        av[2] = tmp1;
        av[3] = tmp2;
        av[4] = 0;
        if (callsys("/usr/lib/c0", av)) {
            cflag++;
            goto loop;
        }
        av[0] = "c1";
        av[1] = tmp1;
        av[2] = tmp2;
        av[3] = tmp3;
        av[4] = 0;
        if(callsys("/usr/lib/c1", av)) {
            cflag++;
            goto loop;
        }
        av[0] = "as";
        av[1] = "-";
        av[2] = tmp3;
        av[3] = 0;
        callsys("/bin/as", av);
        t = setsuf(clist[i]);
        unlink(t);
        if(link("a.out", t) | unlink("a.out")) {
            printf("move failed: %s\n", t);
            cflag++;
        }
loop:;
        i++;
    }
nocom:
    if (cflag==0 & nl!=0) {
        i = 0;
        av[0] = "ld";
        av[1] = "/usr/lib/crt0.o";
        j = 2;
        while(i<nl)
            av[j++] = llist[i++];
        av[j++] = "-lc";
        av[j++] = "-l";
        av[j++] = 0;
        callsys("/bin/ld", av);
    }
delfil:
    dexit();
}
dexit()
{
    extern tmp0, tmp1, tmp2, tmp3;

    unlink(tmp1);
    unlink(tmp2);
    unlink(tmp3);
    unlink(tmp0);
    exit();
}

getsuf(s)
char s[];
{
    extern exit, printf;
    auto c;
    char t, os[];

    c = 0;
    os = s;
    while(t = *s++)
        if (t=='/')
            c = 0;
        else
            c++;
    s =- 3;
    if (c<=8 & c>2 & *s++=='.' & *s=='c')
        return('c');
    return(0);
}

setsuf(s)
char s[];
{
    char os[];

    os = s;
    while(*s++);
    s[-2] = 'o';
    return(os);
}

callsys(f, v)
char f[], v[][]; {

    extern fork, execv, wait, printf;
    auto t, status;

    if ((t=fork())==0) {
        execv(f, v);
        printf("Can't find %s\n", f);
        exit(1);
    } else
        if (t == -1) {
            printf("Try again\n");
            return(1);
        }
    while(t!=wait(&status));
    if ((t=(status&0377)) != 0) {
        if (t!=9)       /* interrupt */
            printf("Fatal error in %s\n", f);
        dexit();
    }
    return((status>>8) & 0377);
}

copy(s)
char s[]; {
    extern tsp;
    char tsp[], otsp[];

    otsp = tsp;
    while(*tsp++ = *s++);
    return(otsp);
}

nodup(l, s)
char l[][], s[]; {

    char t[], os[], c;

    os = s;
    while(t = *l++) {
        s = os;
        while(c = *s++)
            if (c != *t++) goto ll;
        if (*t++ == '\0') return (0);
ll:;
    }
    return(1);
}

tsp;
tmp0;
tmp1;
tmp2;
tmp3;
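For anyone who wants to check the frequency argument against a listing like the one above rather than eyeball it, here is a minimal counting sketch. The struct and function names are made up for this example. It counts every occurrence, including those inside string literals and comments, and it makes no claim about what the tally for the listing above actually is.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tally of the two delimiter families under discussion. */
struct delimiter_counts {
    size_t braces;    /* '{' and '}' */
    size_t brackets;  /* '[' and ']' */
};

/* Counts curly braces and square brackets in a NUL-terminated source text.
   Simplification: characters in strings and comments are counted too. */
struct delimiter_counts count_delimiters(const char *src)
{
    assert(src != NULL);
    struct delimiter_counts counts = {0, 0};
    for (; *src != '\0'; src++) {
        if (*src == '{' || *src == '}')
            counts.braces++;
        else if (*src == '[' || *src == ']')
            counts.brackets++;
    }
    return counts;
}
```

For the fragment char argv[][]; { x[1] = 0; } this reports 2 braces and 6 brackets. Whether a whole program tilts the same way depends heavily on the code base and its style.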

It is interesting to note how the pragmatic motivation of picking characters to denote language syntax elements based on their use in targeted practical applications resembles Zipf's Law as explained in this terrific answer...

observed relationship between frequency and length is called Zipf's Law

...with the only difference that length in the above statement is substituted by / generalized as speed of typing.

gnat
  • 5
    Anything in support of this "apparent" expectation by the language designers? It doesn't take much programming in C to notice that curly braces are much more common than array declarations. This hasn't really changed much since the olden days -- have a look at K&R. –  Feb 26 '13 at 15:09
  • 1
    I somehow doubt this explanation. We don't know what the expected and they could have easily chosen it the other way around since they were the people to decide about array notation too. We do not even know if they thought curly braces to be the "less important" option, maybe they liked curly braces more. – thorsten müller Feb 26 '13 at 15:09
  • @delnan sure, Zipf's Law – gnat Feb 26 '13 at 15:12
  • I've had to look it up, so I may be missing a core point... but I don't really see how it helps your case. You don't need to convince me that more frequently used constructs should be terser, I agree with that (though the theoretical underpinnings are interesting themselves). What I'm critical of is your assumption that arrays are more common than (everything C uses curly braces for) at all. Or that this is an assumption someone who knows C very well and uses it a lot would consider. –  Feb 26 '13 at 15:16
  • 3
    @gnat: Square braces are easier to type on modern keyboards, does this apply to the keyboards that were around when unix and c were first being implemented? I have no reason to suspect that they were using the same keyboard, or that they would assume that other keyboards would be like their keyboards, or that they would have thought typing speed would be worth optimizing by one character. – Michael Shaw Feb 26 '13 at 15:22
  • @delnan: I thought that the language designers of C were mostly dead. – DeadMG Feb 26 '13 at 15:25
  • 1
    Also, Zipf's law is a generalization on what ends up happening in natural languages. C was artificially constructed, so there is no reason to think it would apply here unless the designers of C consciously decided to deliberately apply it. If it did apply, there's no reason to assume it would simplify something already as short as a single character. – Michael Shaw Feb 26 '13 at 15:28
  • @DeadMG Somebody should ask Kernighan... – Denys Séguret Feb 26 '13 at 15:30
  • @dystroy **Ritchie** speaks pretty clearly about importance of arrays for language designers (see article referenced and quoted in the answer) – gnat Feb 26 '13 at 15:33
  • Important for language design(ers) doesn't mean it is used more frequently than *code blocks*. Please, do you have **anything** actually indicating arrays (not the concept, but only the applications for which C uses the square bracket concept, which excludes dynamic arrays and such) are more frequent than statements grouped together by curly braces? Your entire answer appears to be based on a very questionable assumption (as well as other assumptions which others have pointed out) and I fail to understand all those upvotes. –  Feb 26 '13 at 15:44
  • @delnan upvotes are likely coming from guys who spent several years coding C and don't need stinkin' references to recall how often `[]` is used. Anyway, the answer ([rev 6](http://programmers.stackexchange.com/revisions/188458/6)) has been updated with references reflecting this: "prevalence of pointers in C programs" etc – gnat Feb 26 '13 at 17:40
  • @gnat I ask for references because I know C and have written and read a fair amount of code in it, and the square brackets were significantly less common than curly brackets (at least that's my impression; hence I'd appreciate a study). Subscripting via `ptr[idx]` (a pointer operation rather than an array operation) is more common than array declarations, but still not very common in comparison to all the uses of curly brackets (recall that this includes many kinds of declarations and definitions, as well as any condition, loop, `switch`, etc.) in my experience. –  Feb 26 '13 at 17:53
  • 1
    @gnat FWIW, `grep -Fo` tells me the `*.c` files of the CPython source code (rev. 4b42d7f288c5 because that's what I have at hand), which includes libffi, contains 39511 `{` (39508 `}`, dunno why two braces aren't closed), but only 13718 `[` (13702 `]`). That's counting occurrences in strings and in contexts unrelated to this question, so this is not really accurate, even if we ignore that the code base may not be representative (note that this bias could go in either direction). Still, a factor of 2.8? –  Feb 26 '13 at 18:02
  • @delnan well my exposure to C was more with [code like that](http://www.clear.rice.edu/elec301/Projects01/dope_fft/default6b.html "Code for DFT/FFT Algorithms") where square brackets can win over curly with **[score like 48:8](http://i.stack.imgur.com/Rc8o5.png "count braces at DFTTukey4 function")** – gnat Feb 26 '13 at 18:20
  • @MichaelShaw square brackets were easier to type back then as well (answer expanded with note on IBM 2741 terminal) – gnat Feb 26 '13 at 18:29
  • @delnan consider taking a look at _historically relevant_ code **[references added in rev 9](http://programmers.stackexchange.com/posts/188458/revisions)** - Unix sources, written by Ritchie, Thompson and others in their team in times when C was created. As for CPython reference, its code was written so much later that one would better drop it because much faster computers allowed 1) to worry less about the cost of function calls and 2) to use more sophisticated compilers capable of dropping that cost by inlining functions - creating a substantially different context and coding style – gnat Feb 28 '13 at 13:49
  • @thorstenmüller "what the expected" can be seen from the _Unix Second Edition_ code (answer expanded with explanation and details on that) – gnat Feb 28 '13 at 13:50
  • @MichaelShaw I wouldn't say "C was artificially constructed" because as far as I can tell, it has been designed with strongly pragmatic purpose, that is to simplify porting of specific real working code (Unix V2, answer revised with details on that). Taking this into account it more looks like the language _grew organically_, following the needs of developers using it in a realistic context – gnat Feb 28 '13 at 13:51
  • 1
    @gnat Fair enough, I removed my downvote. –  Feb 28 '13 at 13:57
  • @gnat: I would still say C was artificially constructed, since it was designed. I'm not comparing it to other programming languages (all of which I would consider artificially constructed) but with natural languages, like English and French. If Zipf's law applies here at all, I think it is more likely it applies to curly braces, since they are a significant reduction compared to begin/end. I think square braces were simply how people referred to arrays already anyway, so there was nothing to simplify. – Michael Shaw Feb 28 '13 at 15:06
  • @MichaelShaw your note about square brackets contradicts what I see in the article referred to in my answer ("The Development of the C Language" by Ritchie). Article says that original notation (in BCPL) was different, `V!i` and that it was changed to `V[i]` only in B language - to me that's not quite "how people referred to arrays already anyway" – gnat Feb 28 '13 at 15:51
  • @gnat: Algol 60 and Algol 68 used square brackets, so did Pascal (which came out about the same time as C), and so (as you note) did B. PL/1 and FORTRAN didn't, using parentheses instead, but they also use parentheses for other things, with obvious problems for humans and compilers reading the code. As you noted, B immediately preceded C and used square brackets; I don't see any contradiction. – Michael Shaw Mar 01 '13 at 01:12
  • On many older teletype/terminal keyboards which didn't use a character-translation ROM, braces were regarded as "lowercase" brackets [often, the capslock key would map characters 0x60-0x7F to 0x40-0x5F; the shift key would map 0x20-0x2F to 0x30-0x3F, 0x30-0x3F to 0x20-0x2F, and 0x60-0x7F to 0x40-0x5F]. – supercat Mar 26 '15 at 15:14
1

C (and subsequently C++ and C#) inherited its bracing style from its predecessor B, which was written by Ken Thompson (with contributions from Dennis Ritchie) in 1969.

This example is from the Users' Reference to B by Ken Thompson (via Wikipedia):

/* The following function will print a non-negative number, n, to
   the base b, where 2<=b<=10,  This routine uses the fact that
   in the ASCII character set, the digits 0 to 9 have sequential
   code values.  */

printn(n,b) {
        extern putchar;
        auto a;

        if(a=n/b) /* assignment, not test for equality */
                printn(a, b); /* recursive */
        putchar(n%b + '0');
}

B itself was again based on BCPL, a language written by Martin Richards in 1966 for the Multics Operating system. BCPL's bracing system used only round braces, modified by additional characters (Print factorials example by Martin Richards, via Wikipedia):

GET "LIBHDR"

LET START() = VALOF $(
        FOR I = 1 TO 5 DO
                WRITEF("%N! = %I4*N", I, FACT(I))
        RESULTIS 0
$)

AND FACT(N) = N = 0 -> 1, N * FACT(N - 1)

The curly braces "{...}" used in B and subsequent languages are an improvement Ken Thompson made over the original compound brace style "$(...$)" in BCPL.

8bittree
ProphetV
  • 1
    No. Seems that Bob Bemer (http://en.wikipedia.org/wiki/Bob_Bemer) is responsible for this - "...you proposed that they could be used in place of the Algol 'begin' and 'end' keywords, which is exactly how they were later used in the C language." (from http://www.bobbemer.com/BRACES.HTM) – SChepurin Feb 26 '13 at 15:49
  • 1
    The `$( ... $)` format is equivalent to `{ ... }` in the lexer in BCPL, just as `??< ... ??>` is equivalent to `{ ... }` in C. The improvement between the two styles is in the keyboard hardware - not the language. –  Feb 26 '13 at 16:30