52

Java has an automatic GC that stops the world once in a while but takes care of garbage on the heap. C/C++ applications don't have these STW freezes, yet their memory usage doesn't grow without bound either. How is this behavior achieved? How are dead objects taken care of?

Robert Harvey
Ju Shua
  • 38
    Note: stop-the-world is an implementation choice of some garbage collectors, but certainly not all. There are concurrent GCs, for example, which run concurrently with the mutator (that's what GC developers call the actual program). I believe you can buy a commercial version of IBM's open source JVM J9 that has a concurrent pauseless collector. Azul Zing has a "pauseless" collector that isn't *actually* pauseless but extremely fast so that there are no *noticeable* pauses (its GC pauses are on the same order as an operating system thread context switch, which is usually not seen as a pause). – Jörg W Mittag Jun 16 '16 at 16:21
  • 14
    Most of the (long-running) C++ programs I use *do* have memory usage that grows unboundedly over time. Is it possible you're not in the habit of leaving programs open for more than a few days at a time? – Jonathan Cast Jun 16 '16 at 16:54
  • 12
    Take into consideration that with modern C++ and its constructs you no longer need to delete memory manually either (unless you are after some special optimization), because you can manage dynamic memory through smart pointers. Obviously, it adds some overhead to C++ development and you need to be a little bit more careful, but it's not an entirely different thing, you just need to remember to use the smart pointer construct instead of just calling manual `new`. – Andy Jun 16 '16 at 17:08
  • 2
    Also, be aware of [memory fragmentation for dynamically allocated memory](http://stackoverflow.com/a/3770593/128967). – Naftuli Kay Jun 17 '16 at 02:34
  • 9
    Note that it is still possible to have memory leaks in a garbage-collected language. I'm unfamiliar with Java, but memory leaks are unfortunately quite common in the managed, GC world of .NET. Objects that are indirectly referenced by a static field are not automatically collected, event handlers are a very common source of leaks, and the non-deterministic nature of garbage collection makes it unable to completely obviate the need to manually free resources (leading to the IDisposable pattern). All said, the C++ memory management model used properly is *far* superior to garbage collection. – Cody Gray - on strike Jun 17 '16 at 06:41
  • 2
    @CodyGray It's the same in Java and any other managed language as well - it's not something that can be avoided. You said "hang on to this object for me, thanks". What is the runtime supposed to do, say "Well, you haven't accessed this in quite some time, let me release this for you"? :D – Luaan Jun 17 '16 at 07:54
  • 27
    `What happens to garbage in C++?` Isn't it usually compiled into an executable? – BJ Myers Jun 18 '16 at 19:18
  • 1
    @BJMyers I made a similar joke earlier but some C++ user deleted my comment :P – cat Jun 19 '16 at 15:24
  • @BJMyers: it's not even literally wrong – C++ can often “recycle” garbage with move semantics or by simply overwriting variables on the stack. So in a sense, a significant portion of an executable generated with a compiler written in C++ will indeed consist of “garbage memory”. Of course, garbage collectors may also reuse memory – that's basically the whole point of collecting garbage – but it works much more indirectly. – leftaroundabout Jun 19 '16 at 21:53
  • 1
    @CodyGray: [`IDisposable`](https://msdn.microsoft.com/en-us/library/system.idisposable(v=vs.110).aspx) exists primarily to release *unmanaged* resources. – Robert Harvey Jun 28 '16 at 21:07
  • 1
    Indeed, @Robert. That is the point. Because garbage collection is non-deterministic, it cannot be used to release unmanaged resources, so you need a cumbersome workaround. `IDisposable` is that workaround. And if you do not properly follow the pattern, vigilantly maintaining a distinction between managed/unmanaged resources and verifying that you have called `Dispose` on every object that implements `IDisposable`, you are virtually guaranteed to have memory leaks. (Sometimes even for managed objects/memory!) RAII/RRID in C++ is a complete and more effective solution to that problem. – Cody Gray - on strike Jun 29 '16 at 14:42

8 Answers

103

The programmer is responsible for ensuring that objects they created via new are deleted via delete. If an object is created, but not destroyed before the last pointer or reference to it goes out of scope, it falls through the cracks and becomes a Memory Leak.

Unfortunately for C, C++ and other languages which do not include a GC, this simply piles up over time. It can cause an application or the system to run out of memory and be unable to allocate new blocks of memory. At this point, the user must resort to ending the application so that the Operating System can reclaim that used memory.

As far as mitigating this problem, there are several things that make a programmer's life much easier. These are primarily supported by the nature of scope.

int main()
{
    int* variableThatIsAPointer = new int;
    int variableInt = 0;

    delete variableThatIsAPointer;
}

Here, we created two variables. They exist in block scope, as defined by the {} curly braces. When execution leaves this scope, both variables are destroyed automatically. In this case, variableThatIsAPointer, as its name implies, is a pointer to an object in memory. When it goes out of scope, the pointer itself is destroyed, but the object it points to remains. Here, we delete this object before the pointer goes out of scope to ensure that there is no memory leak. However, we could also have passed this pointer elsewhere and expected it to be deleted later on.

This nature of scope extends to classes:

class Foo
{
public:
    int bar; // Will be deleted when Foo is deleted
    int* otherBar; // Still need to call delete
};

Here, the same principle applies. We don't have to worry about bar when Foo is deleted. However, for otherBar, only the pointer is deleted. If otherBar is the only valid pointer to whatever object it points to, we should probably delete it in Foo's destructor. This is the driving concept behind RAII:

resource allocation (acquisition) is done during object creation (specifically initialization), by the constructor, while resource deallocation (release) is done during object destruction (specifically finalization), by the destructor. Thus the resource is guaranteed to be held between when initialization finishes and finalization starts (holding the resources is a class invariant), and to be held only when the object is alive. Thus if there are no object leaks, there are no resource leaks.
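
As a rough illustration of that idea, here is a minimal sketch of Foo releasing otherBar in its own destructor. The `Tracked` type, its `alive` counter and the `useFoo` function are hypothetical names added only to make the cleanup observable, and copying is disabled purely to keep the sketch short.

```cpp
#include <cassert>

// Hypothetical helper type: counts how many instances are currently alive.
struct Tracked {
    static int alive;
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

class Foo
{
public:
    Foo() : bar(0), otherBar(new Tracked) {}   // acquire in the constructor
    ~Foo() { delete otherBar; }                // release in the destructor

    // Copying would make two Foos delete the same object, so forbid it here.
    Foo(const Foo&) = delete;
    Foo& operator=(const Foo&) = delete;

    int bar;            // destroyed together with Foo
    Tracked* otherBar;  // owned by Foo; deleted in ~Foo()
};

// Returns the number of live Tracked objects after a Foo has gone out of scope.
int useFoo()
{
    {
        Foo foo;
        assert(Tracked::alive == 1);  // the heap object exists while foo does
    }   // foo leaves scope: ~Foo() runs and deletes otherBar
    return Tracked::alive;
}
```

Calling useFoo() returns 0: nothing is left on the heap once foo is gone, and no delete appears in the code that uses Foo.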

RAII is also the typical driving force behind Smart Pointers. In the C++ Standard Library, these are std::shared_ptr, std::unique_ptr, and std::weak_ptr; although I have seen and used other shared_ptr/weak_ptr implementations that follow the same concepts. For shared_ptr, a reference counter tracks how many owning pointers there are to a given object, and the object is deleted automatically once there are no more references to it; weak_ptr observes such an object without adding to the count. unique_ptr needs no counter: it is the sole owner and deletes its object when it is destroyed.
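
A small sketch of how the standard smart pointers behave, assuming C++14 or later for std::make_unique. The function names `sharedPtrDemo` and `uniquePtrDemo` are made up for this illustration. Note that only shared_ptr keeps a reference count; unique_ptr simply has one owner at a time.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Returns true if the shared object was destroyed once the last owner let go.
bool sharedPtrDemo()
{
    std::weak_ptr<int> observer;
    {
        std::shared_ptr<int> first = std::make_shared<int>(42);
        std::shared_ptr<int> second = first;   // reference count goes to 2
        observer = first;                      // weak_ptr does not add to the count
        assert(first.use_count() == 2);

        second.reset();                        // one owner left
        assert(first.use_count() == 1);
        assert(!observer.expired());
    }   // last owner destroyed: the int is deleted here
    return observer.expired();
}

// unique_ptr has exactly one owner at a time; ownership can only be moved.
int uniquePtrDemo()
{
    std::unique_ptr<int> owner = std::make_unique<int>(7);
    std::unique_ptr<int> newOwner = std::move(owner);
    assert(owner == nullptr);                  // the old pointer gave up ownership
    return *newOwner;
}   // newOwner deletes the int when it goes out of scope
```

sharedPtrDemo() returns true and uniquePtrDemo() returns 7; in neither function is delete written by hand.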

Beyond that, it all comes down to proper practices and discipline for a programmer to ensure that their code handles objects properly.

jotik
Thebluefish
  • 4
    deleted via `delete` - that's what I was looking for. Awesome. – Ju Shua Jun 16 '16 at 14:46
  • 3
    You might want to add about the scoping mechanisms provided in c++ that allow much of the new and delete to be made mostly-automatic. – whatsisname Jun 16 '16 at 14:55
  • 9
    @whatsisname it is not that new and delete are made automatic, it is that they don't occur at all in many cases – Caleth Jun 16 '16 at 14:58
  • 10
    The `delete` is automatically called for you by [smart pointers](http://stackoverflow.com/questions/106508/what-is-a-smart-pointer-and-when-should-i-use-one) if you use them so you should consider using them every time when an automatic storage can't be used. – Marian Spanik Jun 16 '16 at 15:36
  • @whatsisname Good idea. I have added additional information to my answer, anything else you feel should be included? – Thebluefish Jun 16 '16 at 15:45
  • 2
    I think it's a little bit misleading because also in C when you return from a function you have the stack automatically free. RAII is specific for OOP it automatically frees resources if an object is deleted so you don't have any memory leak if you don't have objects leak. Am I correct? – JoulinRouge Jun 16 '16 at 16:15
  • @JoulinRouge TBH I rewrote about half my answer twice during the editing process, and the mention of RAII was left over from when I was going to delve a bit further into that subject. I will correct that. – Thebluefish Jun 16 '16 at 16:23
  • 2
    Teaching C++ but writing `void main`... what? Your first snippet also has variable name mismatches and won't compile. You should proof-read your answer before submitting. – Lightness Races in Orbit Jun 16 '16 at 16:30
  • 3
    @LightnessRacesinOrbit I have, unfortunately, been writing in Unreal Engine for the past 6 months. Some concepts get mixed. The variable name mismatches is, as mentioned in a prior comment, a result of me revising a good portion of my answer multiple times. Oops. – Thebluefish Jun 16 '16 at 16:37
  • 11
    @JuShua Note that when writing modern C++, you shouldn't ever need to actually have `delete` in your application code (and from C++14 onwards, same with `new`), but instead use smart pointers and RAII to have heap objects deleted. `std::unique_ptr` type and `std::make_unique` function are the direct, simplest replacement of `new` and `delete` at application code level. – hyde Jun 16 '16 at 17:54
  • 1
    You might mention `free`, since the question asks about C as well as C++. – Jacob Krall Jun 16 '16 at 20:20
  • @JacobKrall Feel free to add any edits. I am not familiar enough with C (10+ years ago), so I cannot provide accurate or comprehensive information. – Thebluefish Jun 16 '16 at 20:23
  • @Thebluefish: Me either! – Jacob Krall Jun 16 '16 at 20:25
  • "At this point, the user must resort to ending the application so that the Operating System can reclaim that used memory." The operating system can start ending applications as well. Linux has the [OOM](https://www.kernel.org/doc/gorman/html/understand/understand016.html), for instance. – jpmc26 Jun 16 '16 at 23:09
  • @jpmc26, sure it can. However that simply frees up more memory for our problem app to consume, not release the memory that we let go as a memory leak. – Thebluefish Jun 16 '16 at 23:12
  • @Thebluefish If the OS is smart, it will kill the app that's most offending, presumably the one causing the biggest leaks and is therefore the biggest problem. ;) – jpmc26 Jun 16 '16 at 23:16
  • @jpmc26 That isn't really all that smart - the OS has no way of knowing whether the process leaks or simply needs a lot of memory. And if it simply needs a lot of memory, it's the *worst* candidate for termination - it's the thing you most want to succeed. – Luaan Jun 17 '16 at 07:57
  • @jpmc26 killing processes is not the only strategy for a session manager. It can also just stop running any threads but one which would prompt for user or system interactions such as providing more RAM or storage for paging. Autofailing memory allocation calls is also reasonable and will allow for execution to continue, although some user functions might not be available. – dbanet Jun 17 '16 at 11:34
  • 1
    @Luaan, it's the memory variant of the Halting Problem. I dub thee: "The Hungry Problem" – Nick Jun 17 '16 at 14:40
  • Would `variableThatIsAPointer` be better named `variableThatNamesAPointer`? Or do I have this concept mixed up. –  Jun 17 '16 at 21:18
  • @JesseSielaff The variable itself is a pointer to an `int`. – Thebluefish Jun 17 '16 at 22:02
  • The entire process doesn't have to die. For example, the apache web server by default runs with a fork model. Each forked copy of the master runs for a few hundred requests and then exits, to be replaced by a fresh one. This clears out any lurking memory leaks. – Zan Lynx Jun 18 '16 at 01:09
  • @ZanLynx that's exactly the "processes die" model -> there is a controller process that presumably only has stack allocations of process handles and a bunch of children that the OS reclaims memory from by terminating (at the request of the controller) – Caleth Jun 24 '16 at 13:47
  • @Thebluefish It would suffice to mention malloc/free as being analogous to new/delete (I think we can dispense with mentions of calloc and realloc) – David Conrad Jun 29 '16 at 18:19
  • @DavidConrad Community Wiki is free for anyone to edit :) – Thebluefish Jun 29 '16 at 18:49
82

C++ does not have garbage collection.

C++ applications are required to dispose of their own garbage.

C++ applications programmers are required to understand this.

When they forget, the result is called a "memory leak".

John R. Strohm
  • 23
    You certainly made sure your answer doesn't contain any garbage either, nor boilerplate... – leftaroundabout Jun 17 '16 at 17:43
  • 15
    @leftaroundabout: Thank you. I consider that a compliment. – John R. Strohm Jun 17 '16 at 18:02
  • 1
    OK this garbage-free answer does have a keyword to search for: memory leak. It'd also be nice to somehow mention `new` and `delete`. – Ruslan Jun 18 '16 at 17:49
  • 4
    @Ruslan The same also applies to `malloc` and `free`, or `new[]` and `delete[]`, or any other allocators (like Windows's `GlobalAlloc`, `LocalAlloc`, `SHAlloc`, `CoTaskMemAlloc`, `VirtualAlloc`, `HeapAlloc`, ...), and memory allocated for you (e.g. via `fopen`). – user253751 Jun 18 '16 at 22:14
44

In C, C++ and other systems without a Garbage Collector, the developer is offered facilities by the language and its libraries to indicate when memory can be reclaimed.

The most basic facility is automatic (and static) storage. Many times, the language itself ensures that items are disposed of:

int global = 0; // static storage: lives for the whole program

int foo(int a, int b) {
    static int local = 1; // static storage: initialized once, kept across calls

    int c = a + b; // automatic storage: released when foo returns

    return c;
}

In these cases, the compiler is in charge of knowing when those values are unused and reclaiming the storage associated with them.

When using dynamic storage, in C, memory is traditionally allocated with malloc and reclaimed with free. In C++, memory is traditionally allocated with new and reclaimed with delete.

C has not changed much over the years, however modern C++ eschews new and delete completely and relies instead on library facilities (which themselves use new and delete appropriately):

  • smart pointers are the most famous: std::unique_ptr and std::shared_ptr
  • but containers are much more widespread actually: std::string, std::vector, std::map, ... all internally manage dynamically allocated memory transparently
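
To make the two bullet points concrete, here is a minimal sketch with no new or delete in sight, assuming C++14 or later. The functions `sumFirstN` and `greet` are hypothetical examples, not part of any library.

```cpp
#include <cassert>
#include <memory>
#include <numeric>
#include <string>
#include <vector>

// The vector owns its heap storage and frees it when it goes out of scope.
int sumFirstN(int n)
{
    assert(n >= 0);                               // precondition of this sketch
    std::vector<int> values(n);                   // n zero-initialized ints on the heap
    std::iota(values.begin(), values.end(), 1);   // fill with 1, 2, ..., n
    return std::accumulate(values.begin(), values.end(), 0);
}   // ~vector releases the storage here

// The unique_ptr owns a heap-allocated string and deletes it automatically.
std::string greet(const std::string& name)
{
    auto prefix = std::make_unique<std::string>("Hello, ");
    return *prefix + name;
}   // ~unique_ptr deletes the prefix string here
```

For instance, sumFirstN(4) yields 10 and greet("C++") yields "Hello, C++"; all dynamic memory involved is reclaimed by the time each function returns.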

Speaking of shared_ptr, there is a risk: if a cycle of references is formed and not broken, then there can be a memory leak. It is up to the developer to avoid this situation, the simplest way being to avoid shared_ptr altogether and the second simplest being to avoid cycles at the type level.

As a result memory leaks are not an issue in C++, even for new users, as long as they refrain from using new, delete or std::shared_ptr. This is unlike C where a staunch discipline is necessary, and generally insufficient.
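
The shared_ptr cycle mentioned above can be sketched as follows. The `Node` type and both functions are hypothetical; the first function deliberately builds a cycle, measures the damage, and then breaks the cycle by hand so that the demonstration itself does not leak.

```cpp
#include <cassert>
#include <memory>

// Hypothetical node type; `alive` counts instances so the leak can be observed.
struct Node {
    static int alive;
    std::shared_ptr<Node> next;   // owning link
    std::weak_ptr<Node> back;     // non-owning link
    Node()  { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

// Two nodes own each other, so their reference counts never reach zero.
int leakWithCycle()
{
    int before = Node::alive;
    std::weak_ptr<Node> handle;   // lets us reach the nodes again afterwards
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;              // cycle of owning references
        assert(a.use_count() == 2);
        handle = a;
    }
    int leaked = Node::alive - before;   // nodes that outlived their scope
    if (auto a = handle.lock())
        a->next.reset();          // break the cycle so this demo cleans up
    return leaked;
}

// Same shape, but the back link is a weak_ptr, so no cycle of owners exists.
int noLeakWithWeakPtr()
{
    int before = Node::alive;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->back = a;              // weak link does not keep `a` alive
    }
    return Node::alive - before;
}
```

leakWithCycle() reports 2 nodes still alive after their scope ended, while noLeakWithWeakPtr() reports 0. Making one direction of the relationship a weak_ptr is one way of avoiding cycles at the type level.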


However, this answer would not be complete without mentioning the twin-sister of memory leaks: dangling pointers.

A dangling pointer (or dangling reference) is a hazard created by keeping a pointer or reference to an object that is dead. For example:

#include <iostream>
#include <vector>

int main() {
    std::vector<int> vec;
    vec.push_back(1);     // vec: [1]

    int& a = vec.back();

    vec.pop_back();       // vec: [], "a" is now dangling

    std::cout << a << "\n";
}

Using a dangling pointer, or reference, is Undefined Behavior. In general, luckily, this is an immediate crash; quite often, unfortunately, this causes memory corruption first... and from time to time weird behavior crops up because the compiler emits really weird code.

Undefined Behavior is the biggest issue with C and C++ to this day, in terms of security/correctness of programs. You might want to check out Rust for a language with no Garbage Collector and, in its safe subset, no Undefined Behavior.

Matthieu M.
  • 18
    Re: "Using a dangling pointer, or reference, is *Undefined Behavior*. In general, luckily, this is an immediate crash": Really? That does not match my experience at all; on the contrary, my experience is that uses of a dangling pointer almost *never* cause an immediate crash . . . – ruakh Jun 16 '16 at 18:57
  • 9
    Yeah, since to be "dangling" a pointer must have targeted previously-allocated memory at one point, and that memory is usually unlikely to have been completely unmapped from the process such that it's no longer accessible at all, because it'll be a good candidate for immediate reuse... in practice, dangling pointers don't cause crashes, they cause chaos. – Alex Celeste Jun 16 '16 at 19:23
  • 2
    "As a result memory leaks are not an issue in C++," Sure they are, there's always C bindings to libraries to screw up, as well as recursive shared_ptrs or even recursive unique_ptrs, and other situations. – Mooing Duck Jun 16 '16 at 20:36
  • @ruakh: immediate is maybe a bit strong, but I've generally been lucky enough to get a nearly immediate crash which made analysis quite easy. I've seen some horrendous crashes in multi-threaded applications (with corrupted malloc internals), but those were few and far between. – Matthieu M. Jun 17 '16 at 06:34
  • 2
    @MooingDuck: I don't count bindings to C here, because ALL languages face the issue. Also, I duly noted that this only applied if one did not use `shared_ptr`. Recursive `unique_ptr` are not an issue, by nature they can only form a DAG. – Matthieu M. Jun 17 '16 at 06:35
  • @ruakh Debug runtime on Windows tends to make errors like this rather obvious - in many cases, you will indeed get a crash. Of course, this comes at a cost to performance, but that's often a good trade-off when debugging. – Luaan Jun 17 '16 at 08:00
  • @Matthieu you may have been running those cases under a debugger, which may detect usage of deallocated memory. – dbanet Jun 17 '16 at 11:39
  • @dbanet: no, I was inspecting the core dumps of production crashes. – Matthieu M. Jun 17 '16 at 11:40
  • 3
    “not an issue in C++, even for new users” – I would qualify that to “new users _who don't come from a Java-like language or C_”. – leftaroundabout Jun 17 '16 at 17:45
  • 3
    @leftaroundabout: it's qualified "as long as they refrain from using `new`, `delete` and `shared_ptr`"; without `new` and `shared_ptr` you have direct ownership so no leaks. Of course, you're likely to have dangling pointers, etc... but I am afraid you need to leave C++ to get rid of those. – Matthieu M. Jun 17 '16 at 18:14
  • 1
    @MatthieuM. There's at least one possibility for even new users to get memory leaks: `union`s with some container, with an incorrectly implemented destructor. – Daniel Jour Jun 18 '16 at 10:29
  • @DanielJour: Ah true, with the extended union of C++11 this is a risk. Hopefully they'll use `boost::variant` instead. – Matthieu M. Jun 18 '16 at 12:09
  • `In general, luckily, this is an immediate crash` This is wrong. Your UB code most likely won't crash, it'll even likely to print `1`. With optimizations it might print some garbage, but still won't likely crash. And, in fact, there's no _direct_ place for memory corruption here, since you're only reading via the dangling reference, not writing (although it still might corrupt memory due to some tricky optimizations). – Ruslan Jun 18 '16 at 17:53
  • The variable `local` has static storage duration, not auto storage duration. – James Youngman Jun 19 '16 at 14:30
27

C++ has this thing called RAII. Basically it means garbage gets cleaned up as you go rather than leave it in a pile and let the cleaner tidy up after you. (imagine me in my room watching the football - as I drink cans of beer and need new ones, the C++ way is to take the empty can to the bin on the way to the fridge, the C# way is to chuck it on the floor and wait for the maid to pick them up when she comes to do the cleaning).

Now it is possible to leak memory in C++, but to do so requires you leave the usual constructs and revert to the C way of doing things - allocating a block of memory and keeping track of where that block is without any language assistance. Some people forget this pointer and so cannot remove the block.

gbjbaanb
  • 9
    Shared pointers (which use RAII) provide a modern way to create leaks. Suppose objects A and B reference one another via shared pointers, and nothing else references object A or object B. The result is a leak. This mutual referencing is a non-issue in languages with garbage collection. – David Hammen Jun 16 '16 at 19:28
  • @DavidHammen sure, but at a cost of traversing almost every object to make sure. Your example of the smart pointers ignores the fact that the smart pointer itself will go out of scope and then the objects will be freed. You assume a smart pointer is like a pointer; it's not, it's an object that is passed around on the stack like most parameters. This is not much different to memory leaks caused in GC languages, e.g. the famous one where removing an event handler from a UI class leaves it silently referenced and therefore leaking. – gbjbaanb Jun 16 '16 at 21:23
  • 1
    @gbjbaanb in the example with the smart pointers, neither smart pointer *ever* goes out of scope, that's why there's a leak. Since both of the smart pointer objects are allocated in a *dynamic* scope, not a lexical one, they each try to wait on the other one before destructing. The fact that smart pointers are real objects in C++ and not just pointers is exactly what causes the leak here - the *additional* smart pointer objects in stack scopes that also pointed to the container objects can't deallocate them when they destruct themselves because the refcount is non-zero. – Alex Celeste Jun 16 '16 at 23:19
  • 2
    The .NET way is not to *chuck* it on the floor. It just keeps it where it was until the maid comes around. And due to the way .NET allocates memory in practice (not contractual), the heap is more like a random-access stack. It's kind of like having a stack of contracts and papers, and going through it once in a while to discard those that aren't valid anymore. And to make this easier, the ones that survive each discard are promoted to a different stack, so that you can avoid traversing all the stacks most of the time - unless the first stack gets big enough, the maid doesn't touch the others. – Luaan Jun 17 '16 at 08:04
  • @Luaan it was an analogy... I guess you'd be happier if I said it leaves cans lying on the table until the maid comes to clean up. – gbjbaanb Jun 17 '16 at 18:11
  • I'm sure you'll be happy to hear that I can find a way to quibble with the C++ side of your analogy, too! :-) In C++, you never actually take the empty beer cans to the trash. Rather, the beer cans automatically self-destruct when you finish drinking out of them. 'Tis really very cool. – Cody Gray - on strike Jun 17 '16 at 18:36
  • Yes, exactly - it suits the metaphor a lot better. Or even "keeps adding cans to the table, until the table is almost full and then the maid comes and cleans up". There's no active action you take in C# to encourage the garbage collection. In the end, it's kind of like being in a restaurant - somebody else takes care of your trash. Sometimes, you're fine with cooking at home and cleaning up after yourself, sometimes you go to a restaurant. – Luaan Jun 17 '16 at 19:32
26

It should be noted that it is, in the case of C++, a common misconception that "you need to do manual memory management". In fact, you don't usually do any memory management in your code.

Fixed-size objects (with scope lifetime)

In the vast majority of cases when you need an object, the object will have a defined lifetime in your program and is created on the stack. This works for all built-in primitive data types, but also for instances of classes and structs:

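class MyObject {
    public: int x;
    // virtual destructor, so that derived objects can be safely destroyed
    // through a pointer to MyObject (as in the smart pointer example further down)
    virtual ~MyObject() = default;
};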

int objTest()
{
    MyObject obj;
    obj.x = 5;
    return obj.x;
}

Stack objects are automatically removed when the function ends. In Java, objects are always created on the heap, and therefore have to be removed by some mechanism like garbage collection. This is a non-issue for stack objects.

Objects that manage dynamic data (with scope lifetime)

Using space on the stack works for objects of a fixed size. When you need a variable amount of space, such as an array, another approach is used: The array is encapsulated in a fixed-size object which manages the dynamic memory for you. This works because objects can have a special cleanup function, the destructor. It is guaranteed to be called when the object goes out of scope and does the opposite of the constructor:

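class MyList {
public:
    // a fixed-size pointer to the actual memory.
    int* listOfInts;
    // constructor: get memory
    MyList(size_t numElements) { listOfInts = new int[numElements]; }
    // destructor: free memory
    ~MyList() { delete[] listOfInts; }
    // copying is disabled so that two objects never free the same memory
    MyList(const MyList&) = delete;
    MyList& operator=(const MyList&) = delete;
};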

int listTest()
{
    MyList list(1024);
    list.listOfInts[200] = 5;
    return list.listOfInts[200];
    // When MyList goes off stack here, its destructor is called and frees the memory.
}

There is no memory management at all in the code where the memory is used. The only thing we need to make sure is that the object we wrote has a suitable destructor. No matter how we leave the scope of listTest, be it via an exception or simply by returning from it, the destructor ~MyList() will be called and we don't need to manage any memory.
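
The claim about exceptions can be checked with a small sketch. The `Buffer` class is a hypothetical stand-in in the spirit of MyList, and the `liveBuffers` counter exists only so that the cleanup can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical counter of live buffers, only here to make cleanup observable.
static int liveBuffers = 0;

class Buffer {
public:
    explicit Buffer(std::size_t n) : data(new int[n]) { ++liveBuffers; }
    ~Buffer() { delete[] data; --liveBuffers; }
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;
    int* data;
};

// Leaves its scope either normally or via an exception.
void mayThrow(bool fail)
{
    Buffer buf(1024);
    buf.data[0] = 1;
    assert(liveBuffers == 1);
    if (fail)
        throw std::runtime_error("something went wrong");
    // normal return: ~Buffer() runs when buf goes out of scope
}

// Returns the number of buffers still alive after both exit paths were taken.
int exitPathsDemo()
{
    mayThrow(false);
    try {
        mayThrow(true);
    } catch (const std::runtime_error&) {
        // stack unwinding has already run ~Buffer() by the time we get here
    }
    return liveBuffers;
}
```

exitPathsDemo() returns 0: the destructor ran on the normal return and on the exceptional one, without any cleanup code in mayThrow itself.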

(I think it is a funny design decision to use the bitwise NOT operator, ~, to indicate the destructor. When used on numbers, it inverts the bits; in analogy, here it indicates that what the constructor did is inverted.)

Basically all C++ objects which need dynamic memory use this encapsulation. It has been called RAII ("resource acquisition is initialization"), which is quite a weird way to express the simple idea that objects care about their own contents; what they acquire is theirs to clean up.

Polymorphic objects and lifetime beyond scope

Now, both of these cases were for memory which has a clearly defined lifetime: The lifetime is the same as the scope. If we do not want an object to expire when we leave the scope, there is a third mechanism which can manage memory for us: a smart pointer. Smart pointers are also used when you have instances of objects whose type varies at runtime, but which have a common interface or base class:

class MyDerivedObject : public MyObject {
    public: int y;
};
std::unique_ptr<MyObject> createObject()
{
    // actually creates an object of a derived class,
    // but the user doesn't need to know this.
    return std::make_unique<MyDerivedObject>();
}

int dynamicObjTest()
{
    std::unique_ptr<MyObject> obj = createObject();
    obj->x = 5;
    return obj->x;
    // At scope end, the unique_ptr automatically removes the object it contains,
    // calling its destructor if it has one.
}

There is another kind of smart pointer, std::shared_ptr, for sharing objects among several clients. They only delete their contained object when the last client goes out of scope, so they can be used in situations where it is completely unknown how many clients there will be and how long they will use the object.

In summary, we see that you don't really do any manual memory management. Everything is encapsulated and is then taken care of by means of completely automatic, scope-based memory management. In the cases where this is not enough, smart pointers are used which encapsulate raw memory.

It is considered extremely bad practice to use raw pointers as resource owners anywhere in C++ code, to make raw allocations outside of constructors, or to make raw delete calls outside of destructors, as they are almost impossible to manage when exceptions occur and are generally hard to use safely.

The best: this works for all types of resources

One of the biggest benefits of RAII is that it's not limited to memory. It actually provides a very natural way to manage resources such as files and sockets (opening/closing) and synchronization mechanisms such as mutexes (locking/unlocking). Basically, every resource that can be acquired and must be released is managed in exactly the same way in C++, and none of this management is left to the user. It is all encapsulated in classes which acquire in the constructor and release in the destructor.
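
The general shape of such a class can be sketched like this. Everything here is hypothetical: `acquireHandle` and `releaseHandle` are made-up stand-ins for a real open/close pair (files, sockets and so on), and `openHandles` merely simulates the table of open resources.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical stand-in for an OS-level resource table (files, sockets, ...).
static std::set<int> openHandles;
static int nextHandle = 1;

int acquireHandle()
{
    int h = nextHandle++;
    openHandles.insert(h);
    return h;
}

void releaseHandle(int h) { openHandles.erase(h); }

// RAII wrapper: acquires in the constructor, releases in the destructor.
class ScopedHandle {
public:
    ScopedHandle() : handle(acquireHandle()) {}
    ~ScopedHandle() { releaseHandle(handle); }
    ScopedHandle(const ScopedHandle&) = delete;
    ScopedHandle& operator=(const ScopedHandle&) = delete;
    int get() const { return handle; }
private:
    int handle;
};

// Returns how many handles are still open after the scope has ended.
std::size_t handlesLeftOpen()
{
    std::size_t before = openHandles.size();
    {
        ScopedHandle first;
        ScopedHandle second;
        assert(openHandles.size() == before + 2);   // both are open here
    }   // destroyed in reverse order: second, then first
    return openHandles.size();
}
```

handlesLeftOpen() returns 0 when nothing else is holding a handle. The code that uses the resource never calls releaseHandle itself, which is the whole point.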

For example, a function locking a mutex is usually written like this in C++:

void criticalSection() {
    std::scoped_lock lock(myMutex); // scoped_lock locks the mutex
    doSynchronizedStuff();
} // myMutex is released here automatically

Other languages make this much more complicated, by either requiring you to do this manually (e.g. in a finally clause) or they spawn specialized mechanisms which solve this problem, but not in a particularly elegant way (usually later in their life, when enough people have suffered from the shortcoming). Such mechanisms are try-with-resources in Java and the using statement in C#, both of which are approximations of C++'s RAII.

So, to sum it up, all of this was a very superficial account of RAII in C++, but I hope that it helps readers to understand that memory and even resource management in C++ is not usually "manual", but actually mostly automatic.

Felix Dombek
  • 8
    This is the only answer that doesn't misinform people nor paint C++ more difficult or dangerous than it really is. – Alexander Revo Jun 17 '16 at 05:44
  • 6
    BTW, it is only considered bad practice to use raw pointer as resource owners. There's nothing wrong about using them if they point to something that is guaranteed to outlive the pointer itself. – Alexander Revo Jun 17 '16 at 05:47
  • 8
    I second Alexander. I'm baffled to see the "C++ has no automated memory management, forget a `delete` and you're dead" answers rocketing above 30 points and getting accepted, while this one has five. Does anyone actually use C++ here ? – Quentin Jun 17 '16 at 08:34
8

With respect to C specifically, the language gives you no tools to manage dynamically-allocated memory. You are absolutely responsible for making sure every *alloc has a corresponding free somewhere.

Where things get really nasty is when a resource allocation fails midway through; do you try again, do you roll back and start over from the beginning, do you roll back and exit with an error, do you just bail outright and let the OS deal with it?

For example, here's a function to allocate a non-contiguous 2D array. The behavior here is that if an allocation failure occurs midway through the process, we roll everything back and return an error indication using a NULL pointer:

#include <stdlib.h> /* malloc, free */

/**
 * Allocate space for an array of arrays; returns NULL
 * on error.
 */
int **newArr( size_t rows, size_t cols )
{
  int **arr = malloc( sizeof *arr * rows );
  size_t i;

  if ( arr ) // malloc returns NULL on failure
  {
    for ( i = 0; i < rows; i++ )
    {
      arr[i] = malloc( sizeof *arr[i] * cols );
      if ( !arr[i] )
      {
        /**
         * Whoopsie; we can't allocate any more memory for some reason.
         * We can't just return NULL at this point since we'll lose access
         * to the previously allocated memory, so we branch to some cleanup
         * code to undo the allocations made so far.  
         */
        goto cleanup;
      }
    }
  }
  goto done;

/**
 * We encountered a failure midway through memory allocation,
 * so we roll back all previous allocations and return NULL.
 */
cleanup:
  while ( i )         // this is why we didn't limit the scope of i to the for loop
    free( arr[--i] ); // delete previously allocated rows
  free( arr );        // delete arr object
  arr = NULL;

done:
  return arr;
}

This code is butt-ugly with those gotos, but, in the absence of any sort of structured exception handling mechanism, this is pretty much the only way to deal with the problem without just bailing out completely, especially if your resource allocation code is nested more than one loop deep. This is one of the very few times where goto is actually an attractive option; otherwise you're using a bunch of flags and extra if statements.

You can make life easier on yourself by writing dedicated allocator/deallocator functions for each resource, something like

Foo *newFoo( void )
{
  Foo *foo = malloc( sizeof *foo );
  if ( foo )
  {
    foo->bar = newBar();
    if ( !foo->bar ) goto cleanupBar;
    foo->bletch = newBletch(); 
    if ( !foo->bletch ) goto cleanupBletch;
    ...
  }
  goto done;

cleanupBletch:
  // newBletch() failed, but bar was already created, so release it
  deleteBar( foo->bar );
  // fall through to clean up the rest

cleanupBar:
  free( foo );
  foo = NULL;

done:
  return foo;
}

void deleteFoo( Foo *f )
{
  if ( !f ) return; // like free(), do nothing for a NULL pointer
  deleteBar( f->bar );
  deleteBletch( f->bletch );
  free( f );
}
John Bode
  • 1
    This is a good answer, even with the `goto` statements. This is recommended practice in some areas. It's a commonly used scheme to protect against the equivalent of exceptions in C. Take a look at the Linux kernel code, which is chock-full of `goto` statements -- and which doesn't leak. – David Hammen Jun 16 '16 at 19:56
  • "without just bailing out completely" -> in fairness, if you want to talk about C, this is probably good practice. C is a language best used for either handling blocks of memory that came from somewhere else, *or* parcelling out small chunks of memory to other procedures, but preferably not doing both at the same time in an interleaved way. If you're using classical "objects" in C, you're likely not using the language to its strengths. – Alex Celeste Jun 16 '16 at 23:25
  • The second `goto` is extraneous. It'd be more readable if you changed `goto done;` to `return arr;` and `arr=NULL;done:return arr;` to `return NULL;`. Although in more complicated cases there might indeed be multiple `goto`s, starting to unroll at differing levels of readiness (what would be done by exception stack unwinding in C++). – Ruslan Jun 18 '16 at 18:01
2

I've learned to classify memory issues into a number of different categories.

  • One-time drips. Suppose a program leaks 100 bytes at startup time, never to leak again. Chasing down and eliminating those one-time leaks is nice (I do like having a clean report from a leak detection tool) but is not essential. Sometimes there are bigger problems that need to be attacked.

  • Repeated leaks. A function that is called repeatedly during the course of a program's lifespan and that regularly leaks memory is a big problem. These drips will torture the program, and possibly the OS, to death.

  • Mutual references. If objects A and B reference one another via shared pointers, you have to do something special, either in the design of those classes or in the code that implements/uses those classes to break the circularity. (This is not a problem for garbage collected languages.)

  • Remembering too much. This is the evil cousin of garbage / memory leaks. RAII will not help here, nor will garbage collection. This is a problem in any language. If some active variable has a pathway that connects it to some random chunk of memory, that random chunk of memory is not garbage. Making a program become forgetful so it can run for several days is tricky. Making a program that can run for several months (e.g., until the disk fails) is very, very tricky.
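
The mutual-reference item above can be sketched as follows. This is only a minimal illustration, assuming two hypothetical types, `Parent` and `Child`, that are not part of any real library: the parent owns the child through a `std::shared_ptr`, while the child refers back through a `std::weak_ptr`. If both links were `std::shared_ptr`, neither reference count would reach zero and both objects would leak. The `alive` counters and `liveObjectsAfterScope` exist only to make the effect observable.

```cpp
#include <cassert>
#include <memory>

// Hypothetical types, for illustration only.
struct Child;

struct Parent {
    std::shared_ptr<Child> child;   // owning link: the parent keeps the child alive
    static int alive;               // number of Parent objects currently in existence
    Parent()  { ++alive; }
    ~Parent() { --alive; }
};

struct Child {
    std::weak_ptr<Parent> parent;   // non-owning back link: does not keep the parent alive
    static int alive;               // number of Child objects currently in existence
    Child()  { ++alive; }
    ~Child() { --alive; }
};

int Parent::alive = 0;
int Child::alive = 0;

// Builds a parent/child pair that reference each other, lets the pair go
// out of scope, and returns how many objects are still alive afterwards.
int liveObjectsAfterScope() {
    {
        auto p = std::make_shared<Parent>();
        p->child = std::make_shared<Child>();
        p->child->parent = p;       // back link; the parent's use count stays at 1

        assert(Parent::alive == 1 && Child::alive == 1);
        // While the parent exists, the back link can be promoted and used.
        assert(p->child->parent.lock() == p);
    }
    // p is gone: the Parent was destroyed, which released the Child too.
    return Parent::alive + Child::alive;
}
```

Which side holds the weak link is a design decision; here the assumption is that the parent is the owner and the child only needs to find its way back.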

I have not had a serious problem with leaks for a long, long time. Using RAII in C++ very much helps address those drips and leaks. (One does, however, have to be careful with shared pointers.) Much more important, I've had problems with applications whose memory use keeps on growing and growing and growing because of unsevered connections to memory that is no longer of any use.
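
One common way to make a program "forgetful" is to put an explicit bound on whatever it remembers. The sketch below is only an illustration under that assumption; `BoundedHistory` and `retainedAfter` are hypothetical names, and a real application would have to decide what is safe to forget. It keeps the newest entries and drops the oldest once a fixed capacity is reached, so memory use stays flat no matter how long the program runs.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical class, for illustration only: remembers at most
// `capacity` entries and forgets the oldest one when full.
class BoundedHistory {
public:
    explicit BoundedHistory(std::size_t capacity) : capacity_(capacity) {}

    void record(std::string entry) {
        if (capacity_ == 0) return;            // configured to remember nothing
        if (entries_.size() == capacity_)
            entries_.pop_front();              // sever the link to the oldest entry
        entries_.push_back(std::move(entry));
    }

    std::size_t size() const { return entries_.size(); }

    // Precondition: size() > 0.
    const std::string& oldest() const { return entries_.front(); }

private:
    std::size_t capacity_;
    std::deque<std::string> entries_;
};

// Records `count` entries and returns how many are still held afterwards.
std::size_t retainedAfter(std::size_t capacity, std::size_t count) {
    BoundedHistory history(capacity);
    for (std::size_t i = 0; i < count; ++i)
        history.record("event " + std::to_string(i));
    assert(history.size() <= capacity);        // never grows past the bound
    return history.size();
}
```

An unbounded `std::vector` in the same role would never leak in the strict sense, since every entry stays reachable, yet its memory use would grow for as long as the program runs.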

David Hammen
-6

It is up to the C++ programmer to implement his/her own form of garbage collection where necessary. Failure to do so will result in what is called a 'memory leak'. It is pretty common for 'high level' languages (such as Java) to have built-in garbage collection, but 'low level' languages such as C and C++ do not.

xDr_Johnx