Weak references matter in the context of a reference-counting memory-management scheme. For example, we might have a tree data structure where each node knows its parent node. The result is a cyclic data structure. When we give up all known references to the root of the tree, the tree isn't freed, because the references from the child nodes back to the root node prevent its refcount from dropping to zero.
This can be avoided with weak references. When each reference from a child to its parent is a weak reference (a reference that does not affect the reference count), the references we hold to the root from outside the tree are the only counting references. Once the last of these is removed, the refcount hits zero and the memory can be reclaimed.
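A minimal sketch of such a tree, using Python's standard `weakref` module. The `Node` class and its method names are made up for illustration, and the immediate reclamation shown here relies on CPython's reference counting.

```python
import weakref


class Node:
    """Tree node: counting references to children, weak reference to the parent."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self._parent = None  # will hold a weakref.ref, never the node itself

    @property
    def parent(self):
        # Dereference the weak reference; yields None if there is no parent
        # or the parent has already been freed.
        return self._parent() if self._parent is not None else None

    def add_child(self, child):
        child._parent = weakref.ref(self)  # does not raise self's refcount
        self.children.append(child)        # counting reference
        return child


root = Node("root")
leaf = root.add_child(Node("leaf"))
probe = weakref.ref(root)  # lets us observe whether the root is still alive

parent_name_before = leaf.parent.name

del root  # drop the only counting reference to the root
root_alive_after = probe() is not None
parent_after = leaf.parent
```

In this sketch only `leaf` is kept, so the root is reclaimed as soon as `root` is deleted. Python's weak references are zeroing, so `leaf.parent` then yields `None` instead of a dangling pointer.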
The implication here is that as long as we are interested in nodes of the tree, we will also keep the root of the tree itself around. Therefore, the weak references to a parent node will always stay valid.
An interesting example of a tree with such properties is the Document Object Model for XML or HTML documents. A node in the DOM cannot exist separately from the document (or document fragment) which it belongs to. The DOM contains accessors that allow an implementation to maintain referential integrity even when implemented with weak references or pointers.
There are a few important observations.
Weak references are largely unnecessary with more sophisticated memory management schemes like garbage collection. For example, we could halt the execution of a program and search the graph of all references for subgraphs that are disconnected from the main one, i.e. no longer reachable from the running program. All of these disconnected subgraphs can be collected. (Implementing the semantics of destruction can be difficult, though, and this makes the RAII idiom impossible or difficult.)
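As an illustration, CPython combines reference counting with a cycle collector exposed through the standard `gc` module. The sketch below builds a parent/child cycle out of ordinary (counting) references and shows that it survives until a collection pass runs. The `StrongNode` class is hypothetical, and the behaviour assumes CPython.

```python
import gc
import weakref


class StrongNode:
    """Node whose parent link is an ordinary, counting reference."""

    def __init__(self):
        self.parent = None
        self.children = []


gc.disable()  # keep the automatic cycle collector from running mid-demo

root = StrongNode()
child = StrongNode()
child.parent = root          # counting back-reference: creates a cycle
root.children.append(child)

probe = weakref.ref(root)    # observer only; does not keep root alive
del root, child              # drop every reference held from outside the cycle

alive_after_del = probe() is not None  # the cycle keeps both refcounts above zero
gc.collect()                           # the collector finds the unreachable cycle
alive_after_gc = probe() is not None

gc.enable()
```

Reference counting alone leaves the pair alive after `del`; only the explicit `gc.collect()` call reclaims it.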
As pointed out in the comments, some notifier implementations can benefit greatly from (zeroing) weak references even in the context of GC'd languages.
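One way such a notifier could look, sketched with `weakref.WeakSet` from the Python standard library. The `Notifier` and `Logger` classes are invented for this example, and the prompt removal of the dead listener assumes CPython's reference counting.

```python
import weakref


class Notifier:
    """Holds its listeners weakly, so subscribing does not keep them alive."""

    def __init__(self):
        self._listeners = weakref.WeakSet()

    def subscribe(self, listener):
        self._listeners.add(listener)

    def notify(self, event):
        # Iterate over a snapshot so listeners that die mid-loop cause no trouble.
        for listener in list(self._listeners):
            listener.on_event(event)

    def listener_count(self):
        return len(self._listeners)


class Logger:
    def __init__(self):
        self.seen = []

    def on_event(self, event):
        self.seen.append(event)


notifier = Notifier()
logger = Logger()
notifier.subscribe(logger)

notifier.notify("first")
seen = list(logger.seen)

del logger                              # no explicit unsubscribe needed
count_after = notifier.listener_count()
notifier.notify("second")               # nobody is left to receive this
```

The point of the design is that a listener going out of scope is enough to unsubscribe it; with counting references the notifier would keep every listener alive until it was removed by hand.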
Plain (non-zeroing) weak references are basically equivalent to ordinary pointers. You have to structure your program so that each one is guaranteed to point to something valid whenever it is used. While sometimes difficult, experience has shown that this isn't exactly impossible (e.g. by keeping the context around).
An implementation that sets weak references to null when the referenced object is freed can be highly inefficient, as we need a reference from the referenced object back to the reference itself, so that it can be reset on freeing.
The other option would be to access each object only through a proxy, where weak references only increase the proxy refcount, and normal references increase both the refcount of the proxy and the actual object. When the inner object's refcount is zero, the proxy has to be notified about the destruction to set its internal pointer to null. This implies that a weak reference of this design needs deep integration with the runtime's memory management and can't really be added later as a library. This is more memory-efficient, at the expense of adding one additional pointer level to each access.
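The proxy design can be modelled as a toy in Python with explicit counters. This is only a sketch of the bookkeeping, not how any particular runtime implements it: the names `Proxy`, `Strong` and `Weak` are invented, and "destroying" the object is simulated by clearing the proxy's pointer.

```python
class Proxy:
    """Sits between all references and the real object."""

    def __init__(self, obj):
        self.obj = obj
        self.strong = 0  # counting references to the inner object
        self.total = 0   # all references (strong and weak) to the proxy


class Strong:
    """Normal reference: raises both the object's and the proxy's refcount."""

    def __init__(self, proxy):
        self.proxy = proxy
        proxy.strong += 1
        proxy.total += 1

    def get(self):
        return self.proxy.obj

    def release(self):
        self.proxy.strong -= 1
        self.proxy.total -= 1
        if self.proxy.strong == 0:
            # Simulated destruction: one write nulls every weak reference.
            self.proxy.obj = None


class Weak:
    """Weak reference: raises only the proxy's refcount."""

    def __init__(self, proxy):
        self.proxy = proxy
        proxy.total += 1

    def get(self):
        return self.proxy.obj  # the extra pointer level paid on each access

    def release(self):
        self.proxy.total -= 1  # a real runtime would free the proxy at zero


proxy = Proxy({"payload": 42})
s1, s2 = Strong(proxy), Strong(proxy)
w = Weak(proxy)

s1.release()
value_while_alive = w.get()   # one strong reference remains
s2.release()
value_after = w.get()         # the object is gone, the proxy is not
counts = (proxy.strong, proxy.total)
```

Note that the proxy outlives the object for as long as any weak reference exists, and that no per-reference back-pointers are needed: clearing the single pointer in the proxy invalidates all weak references at once.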