In C you cannot assign arrays directly.

int array[4] = {1, 2, 3, 4};
int array_prime[4] = array; // Error

At first I thought this might be because C's facilities were supposed to be implementable with one or a few instructions, with more complicated functionality offloaded to standard library functions. After all, using memcpy() is not that hard. However, one can directly assign structures/unions, which can be arbitrarily large and can even contain arrays.

union ArrayInt4 {
    int elements[4];
};
union ArrayInt4 union_array = {.elements = {1, 2, 3, 4}};
union ArrayInt4 union_array_prime = union_array; // Works

What is/was the reasoning for allowing structures/unions to be directly assigned but not arrays, if the former can contain arrays anyway? We can only assign arrays directly if they are wrapped in a structure/union, so my aforementioned hypothesis as to why this is the case is out.

user16217248
  • Because language features are [unimplemented by default](https://coding.abel.nu/2012/02/features-are-unimplemented-by-default/). – Doc Brown May 04 '23 at 05:59

2 Answers

Why does C not support direct array assignment?

It is arguably a shortcoming, due to missing features around arrays, but one the original designers chose not to resolve.  Add to that C's (over)emphasis on pointers and pointer arithmetic, and on keeping track of these things yourself as you would in assembly.

Arrays in C don't really carry their sizes around.  Yes, you can write int A[100]; ... sizeof(A) ... for a declared array, but once A is used in almost any other expression, it is converted to a pointer to its first element (here int*) and the original array's size is lost.  This is particularly evident with parameters, where passing an array really passes a pointer.

There's not even a convenient built-in for length (say, lengthof) that would work for arrays of any element size; instead we have to write something like (sizeof(A)/sizeof(*A)).

A lot of code in C relies on use of pointers where length is kept by the program using other variables or knowledge rather than having the language keep track.

Often, declaring an array as the last member of a struct is an idiom for variable length (since C99 this is a flexible array member, whose elements are not counted in the struct's sizeof).  See for example:

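#include <stdio.h>

int A [100];        // complete array type: sizeof A is 100 * sizeof(int)

extern int B [];    // incomplete array type: sizeof B would not compile

struct foo {
    int E, F;
    int C [];       // flexible array member: not counted in sizeof (struct foo)
};


int main(void)
{
    // sizeof yields a size_t, which is printed with %zu rather than %ld
    printf("Hello, %zu, %zu, %zu\n", sizeof A, sizeof *B, sizeof (struct foo) );

    return 0;
}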

I can't say for sure, but apparently resolving the shortcomings only for declared arrays and not for pointers did not seem all that useful.  Some larger built-in length mechanism would have been required for general use with pointers.

Still, something could have been done, say with some syntax that lets the user specify a number of elements, like A = B[0:n] or A[0:n] = B or some other variant, but they simply stopped short of that, leaving it to memcpy and memmove.  Note also that when you choose between the two, with memcpy you're saying that the source and destination do not overlap, whereas with memmove you're saying they might, so the copy has to behave as if it went through a temporary buffer (implementations typically check at run time and copy backwards when that is needed).

So, the various problems:

  • pointer oriented rather than array oriented
  • no implicit size information on pointers
  • aliasing issues
  • arrays can be created without even using any array declaration (e.g. using malloc), so by their nature they have a length the program has to keep track of on its own
    • fixed sized arrays are only useful in narrow contexts

I think punting to the standard library was pretty reasonable here; plus, as you note, fixed-size arrays can be wrapped in a struct/union.

Erik Eidt

C originally came from a predecessor language called B, made by Ken Thompson. In B, there were no types. Everything was a "word" (basically, an int).

In B, arrays were just pointers to the first element. If you declared an array:

auto arr[10];

This would allocate 10 words on the stack (to be freed automatically when the function returned, thus auto), and arr would be a pointer to the first one.

When the type system was added, Dennis Ritchie didn't want any existing code to break. This is why, for example, in early versions of C, a declaration that omitted the type would default to int. It's also where pointer decay (array arguments to functions being just pointers) came from.

For this reason, arrays ended up being second-class citizens in C for a long time. Because of the stability of the language, even the more modern C standards (like C23, which is supposed to come out this year) have to try to fix this without breaking anything, and this particular issue (copying arrays) is not really a priority, because you rarely actually want to copy an array (especially a large one).

Source: The Development of the C language

Min4Builder