44

I want to learn about null values or null references.

For example I have a class called Apple and I created an instance of it.

Apple myApple = new Apple("yummy"); // The data is stored in memory

Then I ate that apple and now it needs to be null, so I set it to null.

myApple = null;

After this assignment, I forgot that I ate it and now want to check.

bool isEaten = (myApple == null);

After this check, what is myApple referencing? Is null a special pointer value? If so, and I have 1000 null objects, do they occupy the memory of 1000 objects or the memory of 1000 ints, if we think of a pointer type as an int?

Mat
Mert Akcakaya
  • 1
    You have to distinguish between the variable and the object. A variable is a memory slot containing a value. If the value is a reference to an object, the object is stored separately, and the variable only contains a pointer to it. If the pointer is NULL, then the variable slot still takes up the same amount of space, but there is no object allocated. So null is a special pointer value that does not point to anything. – JacquesB May 03 '22 at 06:34

6 Answers

49

In your example myApple has the special value null (typically all zero bits), and so is referencing nothing. The object that it originally referred to is now lost on the heap. There is no way to retrieve its location. This is known as a memory leak on systems without garbage collection.

If you originally set 1000 references to null, then you have space for just 1000 references, typically 1000 * 4 bytes (on a 32-bit system, twice that on 64). If those 1000 references originally pointed to real objects, then you allocated 1000 times the size of each object, plus space for the 1000 references.

In some languages (like C and C++), pointers always point to something, even when "uninitialized". The issue is whether the address they hold is legal for your program to access. The special address zero (aka null) is deliberately not mapped into your address space, so a segmentation fault is generated by the memory management unit (MMU) when it is accessed and your program crashes. But since address zero is deliberately not mapped in, it becomes an ideal value to use to indicate that a pointer is not pointing to anything, hence its role as null. To complete the story, as you allocate memory with new or malloc(), the operating system configures the MMU to map pages of RAM into your address space and they become usable. There are still typically vast ranges of address space that are not mapped in, and so lead to segmentation faults, too.

Randall Cook
  • Very good explanation. – NoChance May 08 '12 at 09:23
  • 9
    It's slightly wrong on the "memory leak" part. It's a memory leak in systems without automatic memory management. However, GC isn't the only possible way to implement automatic memory management. C++'s `std::shared_ptr` is an example that's neither GC nor leaks the `Apple` when zeroed. – MSalters May 08 '12 at 11:21
  • 1
    @MSalters - Isn't `shared_ptr` just a basic form for garbage collection? GC doesn't require that there be a separate "garbage collector", only that garbage collection occurs. – Brendan Long May 08 '12 at 16:16
  • 7
    @Brendan: The term "garbage collection" is almost universally understood to refer to non-deterministic collection that takes place independent of the normal code path. Deterministic destruction based on reference counting is something completely different. – Mason Wheeler May 08 '12 at 17:12
  • 2
    Good explanation. One slightly misleading point is the assumption that memory allocation maps to RAM. RAM is one mechanism for short-term memory storage, but the actual storage mechanism is abstracted by the OS. In Windows (for non-ring-zero apps) the memory pages are virtualized and may map to RAM, disk swap file, or perhaps another storage device. – Simon Gillbee Jul 22 '14 at 14:09
  • 1
    Well, if you consider a null-pointer points to something, that's equally valid in any language... – Deduplicator Jan 20 '16 at 05:07
  • This still doesn't answer the question of WHERE in memory null is. Stack, heap, metaspace? I need a picture here. Is `myApple` in the Stack set to zero? – FLonLon Sep 30 '20 at 08:24
  • Naturally, on a GC system, nulling out pointers is (potentially) freeing the pointed to object, avoiding a memory leak. As an aside, avoid conflating null and zero. While the former is often mapped to the latter (and to further confuse matters in C and C++ the latter is often represented by the former in source-code), they are inherently distinct concepts. – Deduplicator Apr 29 '22 at 18:33
  • @FLonLon When you declare myAppleCount to be an integer and set it to '3', where is the '3'? – DaveG Apr 29 '22 at 20:46
12

The answer depends on the language you're using.

C/C++

In C and C++, the keyword was NULL, and what NULL really was was 0. It was decided that "0x0000" was never going to be a valid pointer to an object, and so that is the value that gets assigned to indicate that a pointer is not valid. However, the choice is completely arbitrary. If you attempt to access it like a pointer, it behaves like a pointer to an object that no longer exists in memory: the behavior is undefined, and on most systems the program crashes. The pointer itself occupies memory, but no more than an integer object would. Hence, 1000 null pointers take the equivalent of 1000 integers. If some of those pointers point to valid objects, then the memory usage is the equivalent of 1000 integers plus the memory of the objects they point to. Remember that in C or C++, if a pointer no longer points to its object, that does not imply the memory has been released, so you must explicitly release the object using free (C) or delete (C++).

Java

Unlike in C and C++, in Java null is merely a keyword. Rather than being managed like a pointer to an object, it is managed internally and treated like a literal. This eliminates the need to tie pointers to integer types and allows Java to abstract pointers away entirely. However, even if Java hides it better, references are still pointers, meaning 1000 null references still consume the equivalent of 1000 integers. When they point to objects, much like in C and C++, memory is consumed by those objects until no more references point to them. Unlike in C and C++, though, the garbage collector then picks up on it on a later pass and frees the memory, so in most cases you don't have to keep track of which objects have been freed and which have not (unless, for example, you have reasons to reference objects weakly).

Neil
  • 12
    Your distinction isn’t correct: in fact, in C and C++, the null pointer doesn’t need to point to the memory address 0 at all (although this is the natural implementation, same as in Java and C#). It can point literally anywhere. This is slightly confounded by the fact that literal-0 can be implicitly converted to a null pointer. But the bit pattern stored for a null pointer still need not be all zeros. – Konrad Rudolph May 08 '12 at 11:52
  • @KonradRudolph, true, NULL was supposed to be an abstraction of sorts, however if NULL ever happened to correspond with an actual valid pointer, you'd end up testing if a pointer was NULL and having it return true! NULL would have to be another invalid pointer value, if it weren't NULL. And when you consider that a non-zero value would allow you to enter clauses like `if(NULL)`, anything other than zero seems rather non-intuitive. – Neil May 08 '12 at 12:28
  • 4
    No, you are wrong. The semantics are completely transparent … in the program, null pointers and the macro `NULL` (*not* a keyword, by the way) are treated as if they were zero-bits. But they don’t need to be implemented as such, and in fact some obscure implementations *do* use non-zero null pointers. If I write `if (myptr == 0)` then the compiler will do the correct thing, even if the null pointer is internally represented by `0xabcdef`. – Konrad Rudolph May 08 '12 at 12:53
  • @KonradRudolph, NULL is simply a macro for "0". Look it up. If you defined NULL to be 0xabcdef and then called `if (myptr == 0)`, the compiler will *not* do the right thing as you claim, seeing how 0 is definitely not the same thing as `0xabcdef`. However, please enlighten me if that's not the case. – Neil May 08 '12 at 13:04
  • 5
    @Neil: a _null pointer constant_ (prvalue of integer type that evaluates to zero) is convertible to a _null pointer value_. (§4.10 C++11.) A null pointer value is not guaranteed to have all bits zero. `0` is a null pointer constant, but this doesn't mean that `myptr == 0` checks if all the bits of `myptr` are zero. – Mat May 08 '12 at 13:43
  • @Mat I welcome an example from either of you if you can prove otherwise. – Neil May 08 '12 at 14:08
  • 8
    @Neil: You might want to check [this entry](http://c-faq.com/null/machexamp.html) in the C faq or [this](http://stackoverflow.com/questions/2759845/why-is-address-zero-used-for-null-pointer) SO question – hugomg May 08 '12 at 14:21
  • 1
    @Neil That’s why I took pains not to mention the `NULL` macro at all, rather talking about the “null pointer”, and explicitly mentioning that “literal-0 can be implicitly converted to a null pointer”. – Konrad Rudolph May 08 '12 at 14:32
  • Can I not say that NULL is almost always 0? I explicitly stated that it was entirely arbitrary, implying it could be another number as well. Can I not say it is almost always 0 without writing a half a page article on why it may not be 0 or for the sake of simplifying things say that it's 0 for all intents and purposes? The whole point is that it is not a valid pointer which I addressed in my answer. – Neil May 08 '12 at 14:55
  • Say hello to `nullptr`. In addition, in C++, it's *never* in good code that you would explicitly deallocate anything. – DeadMG May 08 '12 at 16:10
  • ((void*)0) is indeed almost always stored as 0; I don't personally know of a platform where it is not stored as 0. I imagine it would make sense on systems with very tiny memory sizes, where you wouldn't want to waste a byte or two by not storing anything at location 0... in that case ((void*)0) might use an all-1s bit pattern instead. But then, I wonder what happens if you convert a pointer to an integer. Would casting support round-tripping? How would that work? – Qwertie May 08 '12 at 16:14
  • @Qwertie: If you convert a pointer to an integer, you're a moron, so it's not really necessary to worry about what happens. – DeadMG May 08 '12 at 16:32
  • @Qwertie: That's very C-like. In C++ it's simply `0`. Casting arbitrary numbers to pointers is not a good idea. – Martin York May 08 '12 at 18:13
  • 2
    @Neil The relevant point is simply that there is no different treatment of this in Java on the one hand, and C and C++ on the other. None at all. There are other differences, but not in their implementation of null pointers. (And just to clarify, I’m talking about the *implementation* here, not the syntax, and also not about C’s lax type system …). The point is not that there was an inaccuracy in your post that would be clarified if you went into a lot of detail. The post is simply wrong. – Konrad Rudolph May 08 '12 at 20:28
  • @KonradRudolph, if we're talking about what the compiler is doing, I imagine that they're the same between C/C++ and Java, but do I really care how the compiler is doing it? I was simply emphasizing that in C/C++, it might as well be an integer for how it's treated. Heck, most of the MFC framework parameter passing system requires that you cast to LPARAM / WPARAM which are integers, not pointers. Unlike in Java, where I think it would be very wrong to know the value of a pointer. Java treats it like the object itself rather than a pointer to an object. – Neil May 09 '12 at 07:18
  • 2
    @Neil But this isn’t what your answer is saying at all. – Konrad Rudolph May 09 '12 at 10:43
7

Quick example (note that variable names are not stored):

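#include <stddef.h>  /* defines NULL */

int main(void)
{
  int X = 3;
  int *Y = &X;    /* Y holds the address of X */
  int *Z = NULL;  /* Z holds the null pointer value */
  return 0;
} /* main */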


...........................
....+-----+--------+.......
....| &X  |   X    |.......
....+-----+--------+.......
....| 100 |   3    |<---+..
....+-----+--------+....|..
........................|..
....+-----+--------+....|..
....| &Y  |   Y    |....|..
....+-----+--------+....|..
....| 102 |  100   +----+..
....+-----+--------+.......
...........................
....+-----+--------+.......
....| &Z  |   Z    |.......
....+-----+--------+.......
....| 104 |   0    |.......
....+-----+--------+.......
...........................

Cheers.

umlcat
6

A pointer is simply a variable that usually holds an integer-like value: the memory address at which the actual object is stored.

Most languages let you access an object's members through this pointer variable:

int localInt = myApple.appleInt;

The compiler knows how to access the members of an Apple. It "follows" the pointer to myApple's address and retrieves the value of appleInt.

If you assign the null pointer to a pointer variable, you make the pointer point to no memory address (which makes member access impossible).

Every pointer needs memory to hold the address value (typically 4 bytes on 32-bit systems, 8 bytes on 64-bit systems). This is also true for null pointers.

Stephan
  • I think the reference variables/objects aren't exactly pointers. If you print them they contain ClassName@Hashcode. The JVM internally uses a Hashtable to store the Hashcode with the actual address and uses a hash algorithm to retrieve the actual address when necessary. – minusSeven May 08 '12 at 08:18
  • @minusSeven That's correct for what concerns literal objects like integers. Otherwise the hashtable holds pointers to other objects contained within the Apple class itself. – Neil May 08 '12 at 08:27
  • @minusSeven: I agree. The details of pointer implementation depend heavily on the language/runtime. But I think those details are not that relevant for the specific question. – Stephan May 08 '12 at 08:29
0

In C, C++, and other languages, you have pointers to objects. Your example is wrong on this point: it shouldn’t be Apple myApple; but Apple* myApple;. The new operator creates an Apple object somewhere and stores a pointer to that object in myApple.

You can copy that pointer into another pointer variable. If you do that, both point to the same Apple object. You can set the myApple pointer to a null pointer value as you did; in that case it doesn’t point anywhere anymore. This is problematic if you overwrote the only pointer to the Apple object that you created, because you can no longer delete the object and you have a memory leak. You don’t have null objects. You only have null pointers, which don’t point anywhere.

If you had written Apple myApple = Apple(…); then you would have an instance of class Apple in your myApple variable. Its destructor gets called when the variable goes out of scope, and then you have no null value; you have nothing. The variable is gone.

And in C++, you cannot have null references. References must always refer to something.

gnasher729
0

Swift is reasonably close to having null values.

For every type T, the type Optional<T> is an enum with two cases: “none” with no value, and “some” with a value of type T. There is a lot of syntactic sugar, mostly implemented in the standard library. For example, you can write x == nil, and you can assign x = nil or x = y, where y is of type T or Optional<T>; each assignment changes the enum appropriately.

A nil value is an optional with case = none. What is stored, whether nil or not, is the enum. The space for an enum is fixed. In the nil case, the data in the space used for the “some” case is just ignored (similar to a C or C++ union).

gnasher729