14

The C11 standard says that the size expression of an array, whether fixed-size or variable-length, "shall have a value greater than zero." What is the justification for not allowing a length of 0?

Especially for variable-length arrays, it makes perfect sense to have a size of zero every once in a while. It is also useful for static arrays when their size comes from a macro or build configuration option.

Interestingly GCC (and clang) provide extensions that allow zero length arrays. Java also allows arrays of length zero.

Kevin Cox
  • 7
    http://stackoverflow.com/q/8625572 ... *"An array of zero length would be tricky and confusing to reconcile with the requirement that each object have a unique address."* – Robert Harvey Jul 10 '14 at 22:05
  • 4
    @RobertHarvey: Given `struct { int p[1],q[1]; } foo; int *pp = p+1;`, `pp` would be a legitimate pointer, but `*pp` would not have a unique address. Why could the same logic not hold with a zero-length array? Say that given `int q[0];` *within a structure*, `q` would refer to an address whose validity would be like that of the `p+1` example above. – supercat Jul 10 '14 at 23:35
  • @DocBrown From the C11 standard 6.7.6.2.5 talking about the expression used to determine the size of a VLA "…each time it is evaluated it shall have a value greater than zero." I don't know about C99 (and it seems weird that they would change it) but it sounds like you can't have a length of zero. – Kevin Cox Jul 11 '14 at 19:01
  • @KevinCox: is there a free online version of the C11 standard (or the part in question) available? – Doc Brown Jul 11 '14 at 19:42
  • The final version is not available for free (what a shame) but you can download drafts. The last available draft is http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf. – Kevin Cox Jul 11 '14 at 20:24
  • @supercat It's a legitimate pointer but it doesn't point to any object. – user253751 Jun 06 '18 at 05:11
  • @immibis: I'm not sure what `it` you're referring to. Under the present Standard, an array of length 0 is not allowed to exist, so no legitimate pointer to one could exist. Are you agreeing with me that zero-length arrays should be allowed to exist, and that their addresses should be pointers that are legitimate but don't point to an object of the element type? – supercat Jun 06 '18 at 14:48

6 Answers

11

The issue, I would wager, is that C arrays are just pointers to the beginning of an allocated chunk of memory. Having a size of 0 would mean that you have a pointer to... nothing? You can't have nothing, so some arbitrary thing would have had to be chosen. You can't use null, because then your 0-length arrays would look like null pointers. And at that point every implementation is going to pick different arbitrary behaviors, leading to chaos.

Telastyn
  • 22
    [**Arrays aren't pointers**](http://c-faq.com/aryptr/). –  Jul 10 '14 at 22:15
  • 9
    @delnan: Well, if you want to be pedantic about it, array and pointer arithmetic is defined such that a pointer can be conveniently used to access an array or to simulate an array. In other words, it's pointer arithmetic and array indexing that are equivalent in C. But the result is the same anyway... if the length of the array is zero, you're still pointing to nothing. – Robert Harvey Jul 10 '14 at 22:20
  • 4
    @RobertHarvey All true, but your closing words (and the whole answer in retrospect) just seems like a confused and confusing way to explain that such an array (I *think* that's what this answer calls "an allocated chunk of memory"?) would have `sizeof` 0, and how that would cause trouble. All that can be explained using the proper concepts and terminology with no loss of brevity or clarity. Mixing up arrays and pointers only risks spreading the arrays = pointers misconception (which is more important in other contexts) for no benefit. –  Jul 10 '14 at 22:32
  • You could also argue that any pointer (other than NULL) is a valid pointer for a zero-length array. – Kevin Cox Jul 10 '14 at 23:07
  • 1
    @KevinCox: Not only that, but a pointer to the element past the end of e.g. an array of `int` would be valid *only* in that sense. – supercat Jul 10 '14 at 23:36
  • 2
    "*You can't use null, because then your 0 length arrays would look like null pointers*" - actually that's exactly what Delphi does. Empty dynarrays and empty longstrings are technically null pointers. – JensG Jul 11 '14 at 00:35
  • @JensG: Because C essentially [defines null as 0](http://c-faq.com/null/macro.html), that won't work in C, since the first element in a C array is indexed at 0. – Robert Harvey Jul 11 '14 at 02:55
  • 1
    If you have any pointer at all and it points to something of nonzero size, that wouldn't make the pointer point at nothing. It would point at the _wrong_ thing because the thing at that location becomes an array of size one. – Blrfl Jul 11 '14 at 03:01
  • @RobertHarvey: Same as in Delphi :-) `nil = pointer(0)` – JensG Jul 11 '14 at 08:19
  • 3
    -1, I am fully with @delnan here. This explains nothing, especially in the context of what the OP wrote about some major compilers supporting the concept of zero length arrays. I am pretty sure zero length arrays could be provided in C in an implementation-independent way, not "leading to chaos". – Doc Brown Jul 11 '14 at 13:14
  • @RobertHarvey: What does the fact that a null-pointer is represented by an integral constant 0 used in a pointer context (or cast to `void*`) have to do with anything here? – Deduplicator Dec 16 '15 at 19:05
  • Why does the C standard allow `malloc(0)` while disallowing `int[0]`? Both operations result in objects of size 0. – tstanisl Feb 07 '21 at 20:58
6

Let's look at how an array is typically laid out in memory:

         +----+
arr[0] : |    |
         +----+
arr[1] : |    |
         +----+
arr[2] : |    |
         +----+
          ...
         +----+
arr[n] : |    |
         +----+

Note that there isn't a separate object named arr that stores the address of the first element; when an array appears in an expression, C computes the address of the first element as needed.
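To make the point above concrete, here is a minimal sketch (the function name is hypothetical, chosen only for illustration). It assumes nothing beyond standard C: the array object is exactly as large as its elements, and the pointer you get from the array expression is computed on demand rather than read from a stored slot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the array expression converts to the
   address of the first element and the array starts at that same address. */
int array_is_its_own_storage(void)
{
    int arr[4] = {10, 20, 30, 40};
    int *first = arr;   /* array-to-pointer conversion, computed on demand */

    /* The array object is exactly its elements: no hidden pointer slot. */
    assert(sizeof arr == 4 * sizeof arr[0]);

    /* The whole array and its first element begin at the same address. */
    return (void *)&arr == (void *)first && first == &arr[0];
}
```

Calling `array_is_its_own_storage()` should yield 1 on a conforming implementation.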

So, let's think about this: a 0-element array would have no storage set aside for it, meaning there's nothing to compute the array address from (put another way, there's no object mapping for the identifier). It's like saying, "I want to create an int variable that takes up no memory." It's a nonsensical operation.

Edit

Java arrays are completely different animals from C and C++ arrays; they're not a primitive type, but a reference type derived from Object.

Edit2

A point brought up in the comments below: the "greater than 0" constraint only applies to arrays whose size is specified through a constant expression. Declaring a VLA with a 0-valued non-constant expression is not a constraint violation, but it does invoke undefined behavior.

It's clear that VLAs are different animals from regular arrays, and their implementation can allow for a 0 size. They cannot be declared static or at file scope, because the size of such objects must be known before the program starts.

It's also worth noting that as of C11, implementations are not required to support VLAs.
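Given that a zero-valued VLA size is undefined behavior, one defensive pattern is to return early before the declaration is ever reached. The following is only a sketch: `sum_of_squares` is a hypothetical helper invented for illustration, and the `#error` simply refuses to build where the implementation defines `__STDC_NO_VLA__`.

```c
#include <stddef.h>

#ifdef __STDC_NO_VLA__
#error "this sketch needs VLA support, which is optional as of C11"
#endif

/* Hypothetical helper: returns 1*1 + 2*2 + ... + n*n via a VLA scratch buffer. */
unsigned long sum_of_squares(size_t n)
{
    if (n == 0)
        return 0;               /* never reach the VLA declaration with size 0 */

    unsigned long scratch[n];   /* n > 0 here, so the size expression is valid */
    for (size_t i = 0; i < n; i++)
        scratch[i] = (unsigned long)(i + 1) * (i + 1);

    unsigned long total = 0;
    for (size_t i = 0; i < n; i++)
        total += scratch[i];
    return total;
}
```

The early return is exactly the special case that a permitted zero length would make unnecessary.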

John Bode
  • 3
    Sorry, but IMHO you are missing the point, just like Telastyn. Zero length arrays can make a lot of sense, and existing implementations like the ones the OP told us about show that it can be done. – Doc Brown Jul 11 '14 at 15:19
  • @DocBrown: First, I was addressing why the language standard most likely disallows them. Secondly, I would like an example of where a 0-length array makes sense, because I honestly can't think of one. The most likely implementation is to treat `T a[0]` as `T *a`, but then why not just use `T *a`? – John Bode Jul 11 '14 at 15:31
  • Sorry, but I don't buy the "theoretical reasoning" for why the standard forbids this. Read my answer how the address could be actually computed easily. And I suggest you follow the link in Robert Harveys first comment under the question and read the second answer, there is a useful example. – Doc Brown Jul 11 '14 at 15:37
  • @DocBrown: Ah. The `struct` hack. I've never used it personally; never worked on a problem that needed a variably-sized `struct` type. – John Bode Jul 11 '14 at 15:41
  • 2
    And not to forget AFAIK since C99, C allows variable-length arrays. And when the array size is a parameter, not having to treat a value of 0 as a special case can make a lot of programs simpler. – Doc Brown Jul 11 '14 at 15:50
  • @DocBrown: The standard already allows for 0-length VLAs (at least, a 0-length VLA isn't a constraint violation); the "greater than 0" constraint only applies if the array size is a constant expression. I guess the question is, how is `sizeof` supposed to handle a 0-length array that isn't a VLA? – John Bode Jul 11 '14 at 16:14
  • To repost what I said above. The C11 standard does **not** allow VLAs with a length of zero. In item 6.7.6.2.5 when talking about the expression used to determine the size of a VLA it says that "…each time it is evaluated it shall have a value greater than zero." – Kevin Cox Jul 11 '14 at 19:07
  • @KevinCox: it's not a constraint violation, though, so the behavior is undefined. I'll amend my answer. – John Bode Jul 11 '14 at 19:30
  • @JohnBode: after looking the standard again: C99 has the restriction in 6.7.5.2 for VLA not having zero length under "semantics", while in C11 6.7.6.2 almost the same restriction is now described under "constraints". So it became indeed a constraint violation. – Doc Brown Jul 12 '14 at 09:31
  • @DocBrown: Read 6.7.6.2/1 and 6.7.6.2/5 again. It's only a constraint violation if the size is a *constant expression*. – John Bode Jul 14 '14 at 20:14
  • A zero-length array should allocate no storage. And it doesn't matter what its pointer value is, because you aren't allowed to use the pointer. – user253751 Jun 06 '18 at 05:11
  • 1
    Can you clean this answer up a bit? The EDIT monikers and strikeouts make this look like a rough draft. – Robert Harvey Jun 06 '18 at 23:49
2

If the declaration type name[count] is written inside a function, you are telling the C compiler to reserve sizeof(type)*count bytes in the stack frame and to compute the address of the first element of the array.

If the declaration type name[count] is written outside all function and struct definitions, you are telling the C compiler to reserve sizeof(type)*count bytes in the data segment and to compute the address of the first element of the array.

In most expressions, name is converted to the constant address of the first element of the array, and anything that holds the address of some memory is called a pointer, which is why name is treated as a pointer rather than as an array. Note that array elements in C are accessed through pointers.

If count is a constant expression that evaluates to zero, you are telling the C compiler to reserve zero bytes, in either the stack frame or the data segment, and to return the address of the first element of the array. The problem is that the first element of a zero-length array doesn't exist, and you cannot compute the address of something that doesn't exist.

It stands to reason that element number count+1 doesn't exist in an array of length count, and this is why the C compiler forbids defining a zero-length array as a variable, inside or outside a function: what would name contain then? What address would it hold?

If p is a pointer, then the expression p[n] is equivalent to *(p + n).

Here the asterisk * is the pointer dereference operator: it accesses the memory whose address is p + n. In the pointer expression p + n, the address stored in p is increased by n multiplied by the size of the type that p points to.

Is it possible to add an address and a number?

Yes, it is possible, because on typical platforms an address is an unsigned integer, commonly written in hexadecimal notation.
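The equivalence and the scaling by element size can be checked with a small sketch. Both function names below are hypothetical, and the byte distance is measured through `char` pointers, which is the portable way to observe it.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if p[n] and *(p + n) designate the same object. */
int index_matches_pointer_arithmetic(const int *p, size_t n)
{
    return &p[n] == p + n && p[n] == *(p + n);
}

/* Hypothetical helper: byte distance between p and p + n,
   expected to be n * sizeof *p. */
size_t byte_offset(const int *p, size_t n)
{
    return (size_t)((const char *)(p + n) - (const char *)p);
}
```

For an array `int a[3]`, `byte_offset(a, 2)` should equal `2 * sizeof(int)`, and `byte_offset(a, 3)` is still valid to compute because a pointer one past the last element may be formed, only not dereferenced.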

user307542
  • Many compilers used to allow zero-sized array declarations before the Standard forbade it, and many continue to allow such declarations as an extension. Such declarations will cause no problem if one recognizes that an object of size `N` has `N+1` associated addresses, the first `N` of which identify unique bytes and the last `N` of which each point just past one of those bytes. Such a definition would work just fine even in the degenerate case where `N` is 0. – supercat Jun 20 '18 at 22:52
1

If you want a pointer to a memory address, declare one. An array actually points at a chunk of memory you have reserved. Arrays decay to pointers when passed to functions, but if the memory they are pointing at is on the heap, no problem. There is no reason to declare an array of size zero.

ncmathsadist
  • 2
    Generally you wouldn't do this directly but as a result of a macro or when declaring a variable length array with dynamic data. – Kevin Cox Jul 11 '14 at 02:48
  • An array doesn't point, ever. It can contain pointers, and in most contexts you actually use a pointer to the first element, but that's a different story. – Deduplicator Dec 16 '15 at 19:14
  • 1
    The array name IS a constant pointer to the memory contained in the array. – ncmathsadist Dec 17 '15 at 01:57
  • 1
    No, the array name *decays to* a pointer to the first element, in most contexts. The difference is often crucial. – Deduplicator Apr 18 '18 at 19:35
1

You would usually want your zero (in fact variable) size array to know its size at run time. In that case, pack the size in a struct and use a flexible array member, for example:

struct my_st {
   unsigned len;
   double flexarray[]; // of size len
};

Obviously the flexible array member has to be the last member of its struct, and the struct needs at least one other member before it. Often that member holds something related to the actual runtime-occupied length of the flexible array member.

Of course you would allocate:

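 // inside a function; needs <stdio.h>, <stdlib.h> and <math.h>
 unsigned len = some_length_computation();
 struct my_st *p = malloc(sizeof(struct my_st) + len * sizeof(double));
 if (!p) { perror("malloc my_st"); exit(EXIT_FAILURE); }
 p->len = len;
 // fill the flexible array member; the loop does nothing when len is 0
 for (unsigned ix = 0; ix < len; ix++)
    p->flexarray[ix] = log(3.0 + (double)ix);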

AFAIK, this was already possible in C99, and it is very useful.
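As a self-contained sketch of the same idea, the helpers below (`my_st_new` and `my_st_sum` are hypothetical names, not part of any library) show that a length of zero needs no special casing: the allocation still succeeds for the header alone, and loops over the array simply run zero times.

```c
#include <stdlib.h>

struct my_st {
    unsigned len;
    double flexarray[];   /* of size len */
};

/* Hypothetical helper: allocates a my_st with room for len doubles, all zeroed.
   Returns NULL on allocation failure; the caller releases it with free(). */
struct my_st *my_st_new(unsigned len)
{
    struct my_st *p = malloc(sizeof *p + len * sizeof p->flexarray[0]);
    if (p == NULL)
        return NULL;
    p->len = len;
    for (unsigned ix = 0; ix < len; ix++)
        p->flexarray[ix] = 0.0;
    return p;
}

/* Hypothetical helper: sums the elements; with len == 0 the loop never runs. */
double my_st_sum(const struct my_st *p)
{
    double total = 0.0;
    for (unsigned ix = 0; ix < p->len; ix++)
        total += p->flexarray[ix];
    return total;
}
```

With `my_st_new(0)`, the object contains only the `len` header, which is the standard-conforming way to get the effect of a zero-length trailing array.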

BTW, flexible array members don't exist in C++ (because it would be difficult to define when and how they should be constructed and destructed). See, however, std::dynarray, which was proposed for C++14 but was not adopted into the standard.

Basile Starynkevitch
1

From the days of the original C89, when a C Standard specified that something had Undefined Behavior, what that meant was "Do whatever would make an implementation on a particular target platform most suitable for its intended purpose". The authors of the Standard didn't want to try to guess what behaviors might be most suitable for any particular purpose. Existing C89 implementations with VLA extensions might have had different, but logical, behaviors when given a size of zero (e.g. some might have treated the array as an address expression yielding NULL, while others might have treated it as an address expression which might equal the address of another arbitrary variable, but could safely have zero added to it without trapping). If any code might have relied upon such different behavior, the authors of the Standard wouldn't want to forbid compilers from continuing to support such code.

Rather than trying to guess what implementations might do, or suggesting that any behavior should be considered superior to any other, the authors of the Standard simply allowed implementers to use judgment in handling that case as best they saw fit. Implementations that use malloc() behind the scenes might treat the array's address as NULL (if size-zero malloc yields null), those that use stack-address computations might yield a pointer which matches some other variable's address, and some other implementations might do other things. I don't think they expected that compiler writers would go out of their way to make the zero-size corner case behave in deliberately-useless fashion.

supercat