Or is it more like "disposing objects in C++ is really tricky - I spend 20% of my time on it and yet memory leaks are commonplace"?
In my personal experience in C++ and even C, memory leaks have never been a huge struggle to avoid. With a sane testing procedure and Valgrind, for example, any physical leak caused by a call to `operator new`/`malloc` without a corresponding `delete`/`free` is often quickly detected and fixed. To be fair, some large C or old-school C++ codebases might very feasibly have obscure edge cases that physically leak a few bytes of memory here and there, as a result of not deleting/freeing in an edge case that flew under the radar of testing.
Yet in terms of practical observation, the leakiest applications I encounter (ones that consume more and more memory the longer they run, even though the amount of data being worked with is not growing) are typically not written in C or C++. I don't find things like the Linux kernel, Unreal Engine, or even the native code used to implement Java among the leaky software I encounter.
The most prominent kind of leaky software I tend to encounter is things like Flash applets (Flash games in particular), even though they use garbage collection. That's not a fair comparison if one were to deduce anything from it, since many Flash applications are written by budding developers who likely lack sound engineering principles and testing procedures (and likewise I'm certain there are skilled professionals working with GC who do not struggle with leaky software). But I would have a lot to say to anyone who thinks GC prevents leaky software from being written.
Dangling Pointers
Now, coming from my particular domain and experience, mostly using C and C++ (and I expect the benefits of GC vary with one's experience and needs), the most immediate thing GC solves for me is not practical memory leak issues but dangling pointer access, and that could literally be a lifesaver in mission-critical scenarios.
Unfortunately in many of the cases where GC solves what would otherwise be a dangling pointer access, it replaces the same sort of programmer mistake with a logical memory leak.
If you imagine that Flash game written by a budding coder, he might store references to game elements in multiple data structures, making them share ownership of these game resources. Let's say he makes a mistake and forgets to remove the game elements from one of those data structures upon advancing to the next stage, preventing them from being freed until the whole game shuts down. The game still appears to work fine, because the elements are no longer drawn and no longer affect user interaction. Nevertheless, the game starts using more and more memory while the frame rate slows to a slide show, as hidden processing keeps looping through this hidden collection of elements (which has now become explosive in size). This is the sort of issue I encounter frequently in such Flash games.
- I have encountered people saying this does not count as a "memory leak" because the memory is still freed upon closing the application, and that it might instead be called a 'space leak' or something to that effect. While such a distinction might be useful for identifying and talking about problems, I do not find it so useful in this context if it suggests the problem is any less serious than a "memory leak" when the practical goal is ensuring our software does not hog up ridiculous amounts of memory the longer we run it (unless we're talking about obscure operating systems that don't free a process's memory when it terminates). It would be of no comfort to users upset that the software grinds toward unusability to correct their use of terminology here.
Now let's say the same budding developer wrote the game in C++. In that case there would typically be only one central data structure that "owns" the memory while the others merely point to it. If he makes the same sort of mistake, chances are that upon advancing to the next stage the game will crash as a result of accessing dangling pointers (or worse, do something other than crash).
This is the most immediate kind of trade-off I tend to encounter in my domain most often between GC and no GC. And I actually don't care for GC very much in my domain, which isn't very mission-critical, because the biggest struggles I ever had with leaky software involved haphazard use of GC in a former team causing the sort of leaks described above.
In my particular domain I actually prefer the software crashing or glitching out in many cases, because that's at least much easier to detect than trying to trace down why the software is mysteriously consuming explosive amounts of memory after running for half an hour, while all of our unit and integration tests pass with no complaint (not even from Valgrind, since the memory is freed by GC upon shutdown). That's not a slam on GC on my part, or an attempt to say it's useless or anything like that, but it hasn't been any sort of silver bullet, not even close, against leaky software in the teams I worked with (to the contrary, I had the opposite experience, with that one codebase utilizing GC being the leakiest I ever encountered). To be fair, many members of that team didn't even know what weak references were, so they were sharing ownership of everything left and right and frequently making the sort of mistake I described above with the budding game developer.
Shared Ownership and Psychology
The problem I find with garbage collection that can make it so prone to "memory leaks" (and I'll insist on calling it that, since a 'space leak' behaves exactly the same way from the user's perspective) in the hands of those who do not use it with care relates, in my experience, to human tendencies to some degree. The problem with that team, and the leakiest codebase I ever encountered, was that they seemed to be under the impression that GC would allow them to stop thinking about who owns resources.
In our case we had so many objects referencing each other. Models referenced materials along with the material library and shader system. Materials referenced textures along with the texture library and certain shaders. Cameras stored references to all sorts of scene entities that should be excluded from rendering. The list seemed to go on indefinitely. That left just about any hefty resource in the system owned, and extended in lifetime, in 10+ places in the application state at once, which was very, very prone to the kind of human error that translates to a leak (and not a minor one; I'm talking gigabytes in minutes, with serious usability issues). Conceptually none of these resources needed shared ownership; they all had one conceptual owner. But the use of GC here tempted developers to share ownership all over the place instead of properly thinking about the distinction between, say, strong, weak, and phantom references.
If we stop thinking about who owns what memory and happily store lifetime-extending references to objects all over the place, then the software will not crash as a result of dangling pointers, but under such a careless mindset it will almost certainly start leaking memory like crazy, in ways that are very difficult to trace down and that elude tests.
If there's one practical benefit to the dangling pointer in my domain, it is that it causes very nasty glitches and crashes. And that tends to at least give the developers the incentive, if they want to ship something reliable, to start thinking about resource management and doing the proper things needed to remove all additional references/pointers to an object which is no longer conceptually needed.
Application Resource Management
Proper resource management is the name of the game if we're talking about avoiding leaks in long-lived applications storing persistent state, where leakiness would pose serious frame rate and usability issues. And correctly managing resources here is no less difficult with or without GC. The work to remove the appropriate references to objects no longer needed is no less manual either way, whether those references are pointers or lifetime-extending references.
That's the challenge in my domain, not forgetting to `delete` what we `new` (unless we're talking amateur hour with shoddy testing, practices, and tools). And it requires thought and care whether we're using GC or not.
Multithreading
The one other area where I find GC potentially very useful, if it could be used very cautiously in my domain, is simplifying resource management in multithreaded contexts. If we are careful not to store lifetime-extending references to resources in more than one place in persistent application state, then the lifetime-extending nature of GC references can be extremely useful as a way for a thread to extend the lifetime of a resource it is accessing, for just the short duration it needs to finish processing it.
I do think very careful use of GC this way could yield very correct, leak-free software while simultaneously simplifying multithreading.
There are ways around this absent GC, though. In my case we unify the software's scene entity representation, with threads that temporarily extend the lifetime of scene resources for brief durations, in a rather generalized fashion, prior to a cleanup phase. This might smell a bit like GC, but the difference is that there is no "shared ownership" involved, only a uniform scene-processing design in which threads defer destruction of said resources. Still, it would be much simpler to just rely on GC here, if it could be used very carefully by conscientious developers careful to use weak references in the relevant persistent areas, for such multithreading cases.
C++
Finally:
> In C++ I have to call `delete` to dispose a created object at the end of its life cycle.
In modern C++, this is generally not something you should be doing manually. It's not even so much about forgetting to do it: once exception handling enters the picture, even if you wrote a corresponding `delete` below some call to `new`, something could throw in the middle and never reach the `delete` call if you don't rely on the automated destructor calls the compiler inserts to do this for you.
With C++ you practically need to avoid such manual resource cleanup (and that includes avoiding manual calls to unlock a mutex outside of a destructor, e.g., not just memory deallocation), unless you're working in something like an embedded context with exceptions off and special libraries deliberately programmed not to throw. Exception handling pretty much demands it, so resource cleanup should, for the most part, be automated through destructors.