6

I'm wondering if there is a convention or rule about a pointer's data type versus the data type of the variable it points to.

If x is an integer stored in a uC's memory, should the pointer that points to x be an int* type pointer?
Would the compiler give an error otherwise?

Mike
user1999
  • 1
    Yeah, you have to typecast the pointer to match the LHS. – Mitu Raj May 19 '21 at 12:56
  • 1
    Highly weird to find this kind of question here instead of Stack Overflow – Ayberk Özgür May 20 '21 at 11:45
  • 1
    @AyberkÖzgür I see your point. I asked here because I wanted to restrict the question to embedded C only. And Stack Overflow folk are very harsh; they block you if they don't like your questions, especially if you are a beginner. Bunch of arrogant fascists. – user1999 May 20 '21 at 14:01
  • @user1999 Well in that case you should learn how to frame your Q's in the way they are expecting it there. I asked Q's there as a newbie and got on just fine. – dezkev May 20 '21 at 16:46
  • Note that the current highest-voted answer was posted by a user with most reputation at StackOverflow. So those "arrogant fascists", as you call them, do participate here too. – Ruslan May 20 '21 at 22:22

7 Answers

17

This is a somewhat complex topic. Generally, unless you are a C veteran, my advice is to never convert a pointer to a different type. Even conversions to/from void pointers are very often questionable.

If we are to restrict the topic to object pointers (and ignore function pointers), then first there's the mentioned void pointers - every object pointer type in C can be implicitly converted to a void pointer and vice versa. That is, the conversion itself is safe; what happens when you de-reference the data is another story.
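For illustration, here's a minimal sketch of those implicit conversions (the function and variable names are made up for this example):

#include <stdint.h>

void void_ptr_demo (void)
{
  uint32_t value = 42;
  void* vp = &value;    // object pointer -> void*: implicit, no cast needed
  uint32_t* up = vp;    // void* -> object pointer: implicit, no cast needed
  *up = 43;             // fine - vp really does point at a uint32_t
}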

Other than void pointers, you generally get a compiler error when trying to assign between pointers to different types. C has a much stronger type system for pointers than for, say, integers.

C also allows all manner of wild pointer conversions by means of a cast. The conversion itself is almost always fine - what might not be fine is what happens when you de-reference the pointed-at data. The C standard says this (C17 6.3.2.3/7):

A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer.

So if you have for example a uint8_t* pointer pointing at an aligned address, increase that one by 1 byte, then convert to uint16_t*, you may get a misaligned access. Depending on the MCU core used, this may or may not be a problem. Generally, 8-bit MCUs don't care about alignment. Some 16-bit ones do, some don't. Pretty much all 32-bit ones do. There are also CPUs which can raise instruction traps for misalignment at the point of conversion, even before de-referencing.
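As a minimal sketch of that scenario (not from the original answer, function name made up):

#include <stdint.h>

void misalign_demo (void)
{
  uint16_t words[2] = {0};              // 2-byte aligned storage
  uint8_t* p8 = (uint8_t*)words + 1;    // now points at an odd address
  uint16_t* p16 = (uint16_t*)p8;        // the conversion alone is undefined behavior
                                        // if the alignment is wrong for uint16_t
  uint16_t val = *p16;                  // misaligned access: may bus fault on a
                                        // 32-bit core, often works silently on 8-bit
  (void)val;
}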

And then if we continue to read the same text quoted above, it continues:

When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.

This means that we can use a character type pointer, such as unsigned char*, to inspect individual bytes of a larger object. (uint8_t* almost certainly counts as a character type on any mainstream system.) This is useful for serialization of data, if you for example want to send a 32 bit integer over some serial bus, one byte at a time.
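A minimal sketch of such serialization (uart_send_byte() is a hypothetical driver function, not part of the answer):

#include <stdint.h>
#include <stddef.h>

extern void uart_send_byte (uint8_t b);  // hypothetical UART driver call

void send_u32 (uint32_t value)
{
  // Character-type view of the object - well-defined per the rule quoted above.
  const unsigned char* bytes = (const unsigned char*)&value;

  for (size_t i = 0; i < sizeof value; i++)
  {
    uart_send_byte(bytes[i]);  // byte order follows the CPU's endianness
  }
}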

However, we cannot grab a chunk of raw bytes and access that through a pointer to a larger type. There is no special rule like the one above for such scenarios; rather, it is something called a "strict aliasing violation" (What is the strict aliasing rule?):

uint8_t array [n] = { ... };
uint16_t* ptr = (uint16_t*)array; // C allows this conversion in itself, but...
*ptr = something; // this is BAD, undefined behavior - a strict aliasing violation

In order to dodge this dangerous part of C, we would rather invent custom union types for "type punning" purposes like the one above:

typedef union
{
  uint8_t  array8  [n];
  uint16_t array16 [n/2];
} array_t;

This allows us to access the data as different types without using dangerous pointer conversions.
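As a usage sketch of such a union (with the size fixed to 8 bytes just for illustration):

#include <stdint.h>

typedef union
{
  uint8_t  array8  [8];
  uint16_t array16 [4];
} array_t;

void punning_demo (void)
{
  array_t buf = { .array8 = {1, 2, 3, 4, 5, 6, 7, 8} };

  // Reading through a different member than the one last written is
  // well-defined in C (C17 6.5.2.3/3); the value depends on endianness.
  uint16_t first_word = buf.array16[0];
  (void)first_word;
}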

Other special exception scenarios do exist; for example, we are allowed to convert between a struct pointer and a pointer to that struct's first member. The special rule for this is (C17 6.7.2.1/15):

A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.

This is only safe for the first member of the struct! (Unless that member is an array, or a union between an array and something else.)
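A minimal sketch of that rule (the struct and names are made up for illustration):

#include <stdint.h>

typedef struct
{
  uint32_t status;   // first member - reachable through the converted pointer
  uint32_t data;     // NOT reachable this way
} periph_regs_t;

void first_member_demo (periph_regs_t* regs)
{
  uint32_t* status_ptr = (uint32_t*)regs;  // struct pointer -> pointer to first member
  *status_ptr = 0;                         // same as regs->status = 0
}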

And finally, there's the matter of qualified pointers. If you have data that is qualified with const or volatile, then the pointer to that data must use the same qualifier(s). We may never "cast away" qualifiers; doing so is undefined behavior and may result in strange program behavior. It is however always fine to go from a non-qualified pointer to a qualified one.

int* i_ptr;
const int* ci_ptr = i_ptr;        // fine, and no need to cast either
int* another_ptr = (int*)ci_ptr;  // BAD, undefined behavior (casts away const)

volatile uint8_t some_register;
volatile uint8_t* reg = &some_register;  // fine
uint8_t* plain_ptr = (uint8_t*)reg;      // BAD, undefined behavior (casts away volatile)

void my_func (const uint8_t* data)
{
  uint8_t* ptr = (uint8_t*)data; // BAD, undefined behavior
}

But here as well, the undefined behavior doesn't occur until you try to de-reference the pointer. The specific rule (C17 6.7.3/6):

If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type, the behavior is undefined. If an attempt is made to refer to an object defined with a volatile-qualified type through use of an lvalue with non-volatile-qualified type, the behavior is undefined.

Lundin
  • 6
    Even if your code works, doing trickery with changing pointer types tends to be very hard to understand if someone else is looking at your code. Just look at the famous fast inverse square root algorithm and tell me if you can understand how on earth it works. – Hearth May 19 '21 at 15:52
  • 2
    "In order to dodge this dangerous part of C, we would rather invent custom union types for "type punning" purposes like the one above" - type-puning via union is also UB as you cannot read from field you haven't just written to (I'd need to check exact paragraph). If you need type-puning you need to use memcpy. – Maciej Piechotka May 20 '21 at 00:47
  • 5
    @MaciejPiechotka No you are wrong, you are mixing up C and C++. Type punning via union is well-defined as per C17 6.5.2.3/3. Where an informative foot note clarifies: "If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type punning’’). This might be a trap representation." – Lundin May 20 '21 at 06:29
  • 4
    @detly DSPs are _not_ mainstream systems. I'm well aware of various dysfunctional, obsolete DSPs mostly from TI that you can barely write C for. _However_ the `uint8_t` is _optional_ (C17 7.20.1.1/3) and only need to be provided if the CPU supports it. Such a legacy DSP will _not_ provide `uint8_t`, so code using it will luckily not even compile if ported to a dysfunctional system. Overall, providing code compatibility to obsolete dysfunctional DSPs is a _huge waste of everyone's time_ and not something anyone without very specialized requirements should bother with. – Lundin May 20 '21 at 06:36
  • 1
    And based on this we can fairly safely assume that if `uint8_t` is available, then it is a character type. I suppose another exotic scenario would be 4 bit MCUs but anyone writing C for those probably get what they deserve... – Lundin May 20 '21 at 06:38
  • 1
    @detly I almost exclusively program embedded systems and DSPs are exotic to me. In the rare event I come across one, the programs are almost always written in assembler. If some incompetent person decides to use some obsolete 16 bit/byte DSP for new development, and at the same time write the programs in C, and at the same time go grab random code off the internet without considering how exotic their obsolete target is... then how is that _my_ problem? The root of all problems in that project is not the imported C code but 1) bad hardware 2) bad developers 3) bad management. – Lundin May 20 '21 at 07:42
  • 1
    @detly Like I already said, the standard states that `uint8_t` will not be available for DSPs with 16 bit/byte. Now do you or anyone else actually use such DSPs? I know that TI got a bunch of them released somewhere back in the 90s. – Lundin May 20 '21 at 08:34
  • 2
    @detly Feel free to add `_Static_assert(_Generic((uint8_t){0},unsigned char:1, default:0), "Magic exotic CPU architecture detected.");`. Now if you can manage to find a conforming C implementation where this code compiles and the assert kicks in, I'd be impressed. The mentioned weird DSPs will not even recognize `uint8_t` so it won't compile on such systems. – Lundin May 20 '21 at 08:46
  • 2
    _"if you have for example a `uint8_t*` pointer pointing at an aligned address, increase that one by 1 byte, then convert to `uint16_t*`, you may get a misaligned access"_ — note that the clause cited before this is about pointer _conversion_, not dereference. So you may get misaligned _pointer_, not even access yet, and already this is UB. If this happens be well-defined for your particular compiler, then you get misaligned access UB at the next stage—dereferencing, and if that also works fine, you still have strict aliasing violation, which is also UB (and might be fixed by a compiler option) – Ruslan May 20 '21 at 22:15
  • @Ruslan Yeah that's true, misaligned pointers may trap on some systems. I'll update the answer. – Lundin May 21 '21 at 06:20
7

In general, yes: the pointer to a variable should have the same type as the variable it points to. Basically, respect the principle of least surprise; don't use a void pointer where an int pointer will do.

int x = 100;
int * xPtr = &x;

However, sometimes when you have data structures, you can use a pointer to "interpret" the structure as an array. This can be kind of dangerous, as the memory alignment of variables is often platform-dependent and compiler-dependent. Edit: it's actually undefined behavior; however, you still encounter this kind of code.

typedef struct
{ int x;
  int y;
  int z;
} coordinates_t;

coordinates_t coordinates = {-1, 0, 1};

int * coordinatesPtr = (int *) &coordinates;
coordinatesPtr[2] = 3; // coordinates.z = 3 now

It is also useful if you want to transfer floating-point data over a serial port in a binary format, for example. This example will not work on all platforms, so be careful. The format of the bytes will change depending on whether your platform is little-endian or big-endian. Plus, I've seen some weird 24-bit floating-point numbers on PIC18 microcontrollers in the past.

float dummy = 32.3456;
uint8_t * bytePtr = (uint8_t *) &dummy;
rs232Tx(&myUart, bytePtr, 4); // send floating-point number byte-by-byte

Finally, when you want to use some kind of DMA (or memcpy), you usually use void pointers, even if it's not always explicit:

void * memcpy ( void * destination, const void * source, size_t num );
Ben
  • 3
    `coordinatesPtr[2] = 3;` This is strictly speaking undefined behavior. You are not allowed to do pointer arithmetic on `coordinatesPtr` beyond the first element. See [Is it undefined behaviour to just make a pointer point outside boundaries of an array without dereferencing it?](https://software.codidact.com/posts/277215) You shouldn't teach people do write such code. – Lundin May 19 '21 at 13:25
  • 2
    @Lundin I mentioned that it is platform and compiler dependent and also kind of dangerous. For some platforms you might have to insert dummy data in the structure to make it work. However, I see this kind of code often. If the OP has to deal with legacy code, he should learn it. – Ben May 19 '21 at 13:33
  • 1
    No it is _not_ platform dependent (aka _unspecified behavior_ / _implementation-defined behavior_) but _undefined_ behavior ([What is undefined behavior and how does it work?](https://software.codidact.com/posts/277486)) meaning it is always a bug and the program might work just fine or break unexpectedly. If you see such code often, you have some major problems in the code bases you are maintaining. – Lundin May 19 '21 at 13:37
  • "This example will not work on all platforms, so be careful." That's also wrong. Converting to `uint8_t*` (a character type) specifically is guaranteed to work on all platforms as a special case - it's 100% portable code. – Lundin May 19 '21 at 13:43
  • I disagree, for example Microcontroller companies use memory-mapped structures similar as the one I posted. For example, https://www.ti.com/lit/an/spraa85e/spraa85e.pdf?ts=1621371875170&ref_url=https%253A%252F%252Fwww.google.com%252F – Ben May 19 '21 at 13:43
  • I meant that the floating-point reinterpretation will not be the same for all platforms, for example if you're little endian or big endian. – Ben May 19 '21 at 13:44
  • Microcontroller vendors are notorious for their utter incompetence when it comes to writing software and applications notes. There's an on-going competition where everyone tries to produce the worst PoS app note or library. TI is one of the market leaders in software incompetence, but the competition is fierce; ST, Microchip, NXP, Silabs and so on, they are all going for the title. – Lundin May 19 '21 at 13:46
  • 1
    (Btw typo here: `(uint8_t) &dummy;` should be `uint8_t*`) – Lundin May 19 '21 at 13:47
  • Well the OP will likely deal with a market leader in software incompetence such as Microchip, TI or STM so he should learn about memory-mapped structure don't you think ? – Ben May 19 '21 at 13:48
  • 1
    No, you shouldn't use such brittle non-standard crap. All of these memory maps are well-known to break when porting between compilers, to the point where you have to pick up another brittle non-standard crap one each time you port. Better then to stick to standard C, to learn standard C and to demand that the software incompetence producers shape up. There's also MISRA-C which is de facto standard nowadays. Vendors that fail to deliver decent memory maps compliant with standard C and MISRA-C will eventually lose market. – Lundin May 19 '21 at 13:52
  • I think one of these vendors managed to make a functioning memory map at one point in history (ST?) though, so it's not all black. I can definitely rule out TI, Atmel/Microchip and NXP. – Lundin May 19 '21 at 13:53
  • Well memory-mapped and peripheral accesses should be wrapped so that porting is easier. – Ben May 19 '21 at 13:56
  • 2
    It's fine btw to have a struct of whatever, in combination with an array in a union. And that's how most such register maps are written. The major problems usually boil down to use of bit-fields and pointer aliasing. – Lundin May 19 '21 at 13:58
  • 3
    Anyway, the point here is that the trend in embedded systems is that low quality crapilers are getting switched out for high quality compilers, most notably gcc. And gcc is very strict with C compliance, if your code contains undefined behavior the compiler _will_ haunt your code with strange bugs, especially when you turn optimizations on. – Lundin May 19 '21 at 13:58
  • @Lundin: You mean reliable compilers are being switched out for unreliable ones like gcc, which at least on low-end ARM targets invest much more effort into finding "clever" optimizations than processing straightforward constructs efficiently? – supercat May 19 '21 at 22:41
  • 1
    @supercat Well not really. While some old embedded system compilers may not be prone to dangerous optimizations, they aren't prone to _any_ optimizations. They usually stopped developing their tool somewhere in the mid-1990s. Add poor language conformance and generous amounts of compiler bugs to that. – Lundin May 20 '21 at 06:27
  • @Lundin: The professional-level compilers I've used generated reliable code, and the ones that targeted machines with decent-sized register sets did a pretty good job of effective but safe optimizations involving automatic objects whose address isn't taken. I'll admit that clang and gcc has largely killed the market for quality C compilers, but that doesn't mean it's a quality compiler. I don't know all the factors that affect x64 performance, so I can't judge gcc's code quality there, but its code generation on Cortex-M0 and Cortex-M3 leaves me unimpressed. – supercat May 20 '21 at 07:01
  • @Lundin: As for "standards conformance", I'd rather have a compiler that does a reliable job processing code written in the language C89 was commissioned to describe, than one which supports useless aspects of C99 such as variable-length arrays while refusing to acknowledge that the Standard was not intended to preclude the use of C as a "high-level assembler", nor imply that implementations should be considered suitable for low-level programming without supporting constructs beyond those mandated. – supercat May 20 '21 at 07:17
4

I would say yes, especially if you are not sure.

If you want something that can switch types, you can use a union for that in C; the pointers will then be of the union's type, and any dereferencing will typically happen inside a switch statement that manages the dynamically typed access (each access then goes through the matching member's type, so everything lines up).
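A minimal sketch of that idea (the tag enum, union layout and names are illustrative assumptions, not a fixed recipe):

#include <stdint.h>

typedef enum { VAL_U8, VAL_U16, VAL_FLOAT } val_kind_t;

typedef struct
{
    val_kind_t kind;       /* tag saying which member is currently valid */
    union
    {
        uint8_t  u8;
        uint16_t u16;
        float    f;
    } as;
} value_t;

/* All callers pass value_t*; the switch picks the matching member. */
uint32_t value_as_u32(const value_t *v)
{
    switch (v->kind)
    {
        case VAL_U8:    return v->as.u8;
        case VAL_U16:   return v->as.u16;
        case VAL_FLOAT: return (uint32_t)v->as.f;
        default:        return 0;
    }
}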

There are some libc functions that require the use of void*, like memcpy() (as mentioned in another answer) or malloc(), which I'd say should be minimized or in some cases completely eliminated in embedded code. Other than that, using a lot of void* is a sign you might be reinventing the wheel.

Pete W
1

Pointers should "always" be stored in a pointer data type. This means your pointer declaration should be one of:

  • thing* name;
    • For when you always want to point to a particular data type
  • void* name;
    • For when you want to point to an unknown/incomplete data type
  • intptr_t/uintptr_t/ptrdiff_t name;
    • For when you want to do integer math on addresses (see the sketch after this list).
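A minimal sketch of that last case (the register address and names are made up for illustration):

#include <stdint.h>

#define PERIPH_BASE 0x40000000u   /* hypothetical peripheral base address */

void intptr_demo(void)
{
    /* Hold an address as an integer, do arithmetic on it, then convert back. */
    uintptr_t addr = (uintptr_t)PERIPH_BASE;
    addr += 0x24u;                            /* register offset */

    /* Integer-to-pointer conversion is implementation-defined, but this is
       the usual idiom for memory-mapped registers. */
    volatile uint32_t *reg = (volatile uint32_t *)addr;
    *reg = 1u;
}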

The compiler may or may not throw errors for other declarations, depending on the type and the compiler flags. C compilers do not always check this for you.

pgvoorhees
1

Pointers are mostly a synonym for "memory address". As such, most can be treated as an integer as wide as the CPU's native word, but that is not recommended even in C, which is less strict than other languages like C++. With that in mind, you should avoid casting in unexpected ways unless you know exactly what you're doing.

Most "compiler nazis" will yell right away that this is undefined behavior and that's a BAD THING, and they will be mostly right. What this means is that what really happens depends a lot on the compiler implementation and will most likely vary between compilers.

Hence, if your compiler allows it, the generated machine code is mostly happenstance (except for specific "defined" cases like malloc). Especially important: different optimization levels will likely generate different machine code and can break what was considered working code at another level. This also means that your code is not portable and may break even on a point update of the compiler. This leads to general discouragement of its usage.

That said, apart from memory allocation, typecasting is mostly used when (de)serializing data, for example when transmitting data over a serial line. Of special note is the transmission of float/double data, which sometimes can only be done with a pointer typecast or a union. The pointer typecast is undefined behavior (a strict aliasing violation, unless done through a character type), while the union approach is well-defined in C; either way, with special care it can be managed if the serialization interface is well-defined. Also, keep in mind that even this can be "reinventing the wheel" and quite a challenge, since it can lead to unexpected troubles, like different endianness between transmitter and receiver, which will require special care.
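For example, a minimal sketch of the union approach for a float (uart_send_byte() is a hypothetical driver call, not from the answer):

#include <stdint.h>

extern void uart_send_byte(uint8_t b);   /* hypothetical UART driver call */

void send_float(float value)
{
    union
    {
        float   f;
        uint8_t bytes[sizeof(float)];
    } pun = { .f = value };

    /* The byte order on the wire follows the CPU's endianness - the
       receiver must agree on it (and on the float format itself). */
    for (unsigned i = 0; i < sizeof pun.bytes; i++)
    {
        uart_send_byte(pun.bytes[i]);
    }
}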

Typecasting is also "useful" in hacking, and doubly so for an attacker. The attacker may use it to write smaller code that fits a limited payload, without having to go all the way to Assembly language. And if the developer of the target used it for dealing with data, (s)he may have introduced subtle vulnerabilities, which is another reason not to use it.

That said, pointer casting is also a way of implementing polymorphism in C, especially when dealing with function pointers. The same caveat applies: only use it if you know what you're doing.
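One common, well-defined form of this is a callback that carries a void* context pointer which the handler converts back to the concrete type it registered (the names here are purely illustrative):

#include <stdint.h>

typedef void (*event_cb_t)(void *ctx, uint8_t event);

typedef struct
{
    uint32_t count;
} my_ctx_t;

static void my_handler(void *ctx, uint8_t event)
{
    my_ctx_t *self = ctx;   /* void* back to the concrete type: implicit conversion,
                               well-defined as long as ctx really points at a my_ctx_t */
    (void)event;
    self->count++;
}

/* Registration would then look something like (hypothetical API):
   register_event_cb(my_handler, &my_ctx); */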

So, at one end of the spectrum, you have malloc and others, which require the typecasting of void* pointers, even though dynamic memory management is less common in microcontrollers. At the other end, there is the example of pointer increment, ptr++, which is pretty common and advances by the size of the data type; this may cause misalignment and ultimately a hardfault on some CPU/MCU architectures. Another example is dealing with a packed struct (which is also a compiler-dependent feature): the compiler may generate a sequence of instructions to make it look like you're dealing with some data type, but the MCU itself may not have facilities to deal with, for example, odd (unaligned) addresses. This actually happens on some 16-bit and 32-bit architectures, especially when dealing with instruction data, and when the hardware finds a "wrong" sequence it may hardfault.

Ronan Paixão
  • "you have malloc and others, which require the typecasting of void* pointers" Huh? Where did you get that idea from? – Lundin May 20 '21 at 06:41
  • @Lundin malloc returns "void*". It doesn't know what type you wanted - it simply allocates the number of bytes you asked for and gives you the address of the start. In order to use that memory for your data, you then need to cast that "void*" to "float*" or whatever type you're going to use that memory for. (OK, technically there are other ways, but in practise that's what you'd normally do.) – Graham May 20 '21 at 12:50
  • @Graham No, that's plain wrong. C allows implicit conversions from void pointers to any other pointer type. The cast just adds pointless clutter and was even dangerous & harmful once upon a time prior C99. You might be thinking of C++ which works differently. – Lundin May 20 '21 at 12:53
  • @Lundin That's incorrect. Implicit typecasting is still typecasting. You don't have to make the cast explicit, sure - but the cast still has to take place. "float* ptr = malloc(sizeof(float));" is just casting implicitly rather than explicitly. You'll also find style guides and linters may consider implicit casting of pointers to be a code smell, because it's a common cause of errors, and they consider explicit casts to *not* be "pointless clutter" nor "dangerous and harmful". YMMV on that of course, as with any style guide. – Graham May 20 '21 at 13:24
  • @Graham There's nothing called "implicit typecasting". In the C language there are implicit or explicit _conversions_, where a cast is always an explicit conversion done by the programmer, using the cast operator. As for casting the result of malloc, that's a [beating the dead horse debate](https://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc). I'm not interested in beating that dead horse; my only argument here is that you certainly don't _have_ to cast the result of malloc in C, unlike C++ which does not allow implicit conversions to/from void pointers. – Lundin May 20 '21 at 13:33
  • 1
    @Graham Specifically, implicit conversions from void pointers to object pointers and back are allowed as per the rules of simple assignment C17 6.5.16.1 "one operand is a pointer to an object type, and the other is a pointer to a qualified or unqualified version of void". The cast operator refers to that too, in C17 6.5.4/3 (constraints): "Conversions that involve pointers, other than where permitted by the constraints of 6.5.16.1, shall be specified by means of an explicit cast." – Lundin May 20 '21 at 13:40
1

In order for an implementation to be suitable for embedded programming on most platforms, it must specify how it will process some constructs where the Standard would impose no requirements. The authors of the C Standard expressly allowed for the possibility that an implementation might process such constructs "in a documented manner characteristic of the environment", though it left the question of when to do so as a quality of implementation issue outside its jurisdiction.

In general, it makes sense to access stored values with pointers of their type, but there are times when it may be useful or necessary to do otherwise. Nearly all C implementations (and probably all C implementations that would be suitable for embedded programming) can be configured to support this.

For some reason, some compiler writers have latched onto the notion that when the Standard describes a construct as "non-portable or erroneous", that doesn't mean "non-portable (but possibly correct), or erroneous (making portability irrelevant)" but rather "non-portable, and therefore erroneous". This expressly contradicts the intention of the C Standards Committee as described in their published Rationale document, which specifies Undefined Behavior as, among other things, identifying avenues for "conforming language extension".

While some people would suggest that programmers jump through hoops to fit the way clang and gcc interpret the Standard when invoked without the -fno-strict-aliasing flag, I'd regard that as a fool's errand. The authors of the Standard expected that any compiler which sought to be maximally useful and could handle all of the corner cases mandated thereby would also handle many other useful corner cases, making it unnecessary to enumerate all of the latter. Given that clang and gcc fail to handle all of the cases mandated by the Standard except when using -fno-strict-aliasing, it's possible that they're right.

Consider the following code:

#include <stdint.h>
#include <limits.h>

// For demonstration purposes, need a type other than long which has
// the same size.
#if LONG_MAX > INT_MAX
typedef long long longish;
#else
typedef int longish;
#endif

// Store the indicated value using either type `long` or `longish`, as
// indicated by the mode argument.

void store_long_or_longish(void *p, long value, int mode)
{
    if (mode)
        *(long*)p = value;
    else
        *(longish*)p = value;
}

// Function to demonstrate gcc's brilliance.  When targeting either x64 or
// 32-bit ARM, gcc will generate code that returns 1 without regard for
// whether *(long*)q might have been written by store_long_or_longish using
// a mode value of 1.

long test(void *p, void *q, int mode)
{
    *(long*)q = 1;
    store_long_or_longish(p, 2, mode);
    return *(long*)q;
}

If test is called with p==q and mode==1, then it should store a 1 into *(long*)q, then store_long_or_longish should store 2 into the same address, using the same type, and then test should load and return that stored value. As processed by gcc, however, the mere presence of the store that would use *(longish*), even though that store is never actually executed, would cause gcc to conclude that the call to store_long_or_longish can't possibly modify *(long*)p.

supercat
  • 1
    Your continuous rants against the strict aliasing rules are getting tiresome. I fully agree that the rules of effective type/strict aliasing in the standard from C99 and beyond are broken and a language defect. But this isn't something that beginners or average C programmers on an EE forum can fix. Telling them to use `-fno-strict-aliasing` whenever gcc is used in embedded systems is good advice. But the people reading this will not likely be able to fix the standard nor gcc. You should direct this to the standard committee and gcc maintainers in the form of Defect Reports and bug reports. – Lundin May 20 '21 at 08:30
  • @Lundin: There will be no possible way of fixing the main Standard that would not be opposed by enough people to block a consensus. The biggest weakness in C89, retrospectively, is that in order to avoid implying that some implementations were "inferior", it avoided describing or even overly strongly alluding to things that quality implementations should be expected to do when practical, but need not do if obviously impractical. Some compiler writers have taken this as an invitation to behave in ways that would have been universally recognized as inferior when the Standard was written. – supercat May 20 '21 at 15:08
  • @Lundin: I think the intention of the Standard has always been that non-diagnostic implementations behave as described in N1570 5.1.2.3, with "a one-to-one correspondence between abstract and actual semantics", in all cases that matter, but avoid requiring such correspondence in cases that don't matter. It also recognized that compiler writers would be better placed than the Committee to judge when that correspondence would matter to their various customers, and was never intended to make programmers jump through hoops to prevent erroneous assumptions about what does or does not matter. – supercat May 20 '21 at 15:46
  • @Lundin: I am curious about your thoughts on something: in situations where the Standard unambiguously defines the behavior of a program, but both clang and gcc process it in the same nonsensical fashion, does that conflict imply an actual defect in the Standard, a bug in both clang and gcc, or an intention by clang and gcc to disregard parts of the C Standard which they view as defective even if they are unambiguous? What is the best way to ensure that programmers understand the language clang and gcc actually seek to process when various settings are used? – supercat May 20 '21 at 16:28
1

If a pointer points to some object in memory, that means we are talking about run-time. The C compiler no longer exists at run-time. A pointer of any type can point to anything of any type, or be completely invalid.

The C language has a compile-time type system with rules, which requires diagnostics for some situations. It is easy to subvert the type system.

If a variable x is declared as int, and its address is taken using &x, that expression has type int * (pointer to int). That pointer is incompatible with most other pointer types. For instance, if you pass it to a function that is prototyped as taking a char * (pointer to char) argument, there will be a diagnostic like "warning: converting int * to char * without a cast".

You don't have a say in that the type of &x is int *, because that is a hard rule of the type system; but you can forcefully convert that value to another type with a cast: (char *) &x. That is permitted, and it could make a lot of sense in the right situation. This value can then be assigned to a char * typed variable, or passed to a function that takes a char * argument.

In that situation, you then have a char * pointer aimed at an int object at run-time. The situation is not necessarily wrong, and there is no compiler diagnostic.
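A minimal sketch of that situation (dump_bytes is a made-up function for illustration):

#include <stdio.h>

/* Takes a char* - for example to print or transmit raw bytes. */
static void dump_bytes(const char *p, size_t len)
{
    for (size_t i = 0; i < len; i++)
        printf("%02x ", (unsigned)(unsigned char)p[i]);
}

int main(void)
{
    int x = 1234;
    /* &x has type int*; the cast forcefully converts it to char*, which is
       permitted, and reading an int through a character type is well-defined. */
    dump_bytes((char *)&x, sizeof x);
    return 0;
}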

Kaz