13

I am wondering what possible merits copy-on-write has. Naturally, I don't expect personal opinions, but real-world practical scenarios where it can be technically and practically beneficial in a tangible way. And by tangible I mean something more than saving you the typing of an & character.

To clarify, this question is in the context of data types where assignment or copy construction creates an implicit shallow copy, but modifying the copy triggers an implicit deep copy, and the changes are applied to that new copy rather than to the original object.

The reason I am asking is that I can't seem to find any merit in having COW as a default, implicit behavior. I use Qt, which has COW implemented for a lot of its data types, practically all of which have some underlying dynamically allocated storage. But how does it really benefit the user?

An example:

QString s("some text");
QString s1 = s; // now both s and s1 internally use the same resource

qDebug() << s1; // const operation, nothing changes
s1[0] = 'z'; // s1 "detaches" from s, allocates new storage and modifies the first character
             // s is still "some text"

What do we win by using COW in this example?

If all we intend to do is use const operations, s1 is redundant; we might as well use s.

If we intend to change the value, then COW only delays the resource copy until the first non-const operation, at the (albeit minimal) cost of incrementing the ref count for the implicit sharing and detaching from the shared storage. It does look like all the overhead involved in COW is pointless.

It is not much different in the context of parameter passing: if you don't intend to modify the value, pass by const reference; if you do want to modify it, either make an (implicitly deep) copy if you don't want to touch the original object, or pass by reference if you do. Again, COW seems like needless overhead that doesn't achieve anything, and it only adds the limitation that you cannot modify the original value even if you want to, since any change detaches from the original object.

So depending on whether you know about COW or are oblivious to it, it may result either in code with obscure intent and needless overhead, or in completely confusing behavior that doesn't match expectations and leaves you scratching your head.

To me it seems that there are more efficient and more readable solutions whether you want to avoid an unnecessary deep copy or you intend to make one. So where is the practical benefit of COW? I assume there must be some benefit, since it is used in such a popular and powerful framework.

Furthermore, from what I've read, COW is now effectively forbidden for std::string in the C++ standard library. I don't know whether the cons I see in it have something to do with that, but either way, there must be a reason for it.

dtech

4 Answers

18

Copy on write is used in situations where you very often will create a copy of the object and not modify it. In those situations, it pays for itself.

As you mentioned, you can pass a const object, and in many cases that is sufficient. However, const only guarantees that the recipient of the object can't mutate it (unless they const_cast, of course). It does not handle multithreading cases, and it does not handle cases where there are callbacks (which might mutate the original object). Passing a COW object by value puts the challenge of managing these details on the API developer rather than the API user.
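
To make that concrete, here is a hedged sketch using QString from the question (processText and the callback parameter are purely illustrative, not any real API): a by-value COW parameter is a cheap but stable snapshot, while a const reference still observes whatever a callback or another thread does to the original object.

#include <QDebug>
#include <QString>
#include <functional>

// By value: copying the QString is just a ref-count bump, yet it behaves like a
// private snapshot. If the callback mutates the caller's string, that string
// detaches and this copy keeps the data it was handed.
void processText(QString text, const std::function<void()>& callback)
{
    callback();          // may mutate the caller's original string
    qDebug() << text;    // still prints the value we were given
}

// By const reference: the function cannot mutate 'text', but the callback (or
// another thread) can still mutate the object it refers to, so the value seen
// here may change under our feet.
void processTextRef(const QString& text, const std::function<void()>& callback)
{
    callback();
    qDebug() << text;
}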

The C++11 rules forbid COW for std::string in particular. Iterators into a string must be treated as invalidated if the backing buffer is detached; if the iterator is implemented as a plain char* (as opposed to a string* plus an index), such iterators are no longer valid after a detach. The committee had to decide which operations are allowed to invalidate iterators, and the decision was that operator[] should not be one of them. operator[] on a non-const std::string returns a char&, which may be written through, so under COW it would have to detach the string and thereby invalidate iterators. This was deemed a poor trade, and unlike end() and cend(), there is no way to ask for the const version of operator[] short of const-casting the string.
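
A sketch of the conflict (this is what a hypothetical pre-C++11 COW std::string would have to do; a conforming C++11 implementation must not behave this way):

#include <string>

int main()
{
    std::string s = "copy on write";
    char* p = &s[0];    // with char*-style iterators, this points into s's buffer

    std::string t = s;  // under COW, s and t would now share one buffer

    char& c = s[0];     // non-const operator[] returns a char& someone might write
                        // through, so a COW string must detach s here: s gets a
                        // fresh buffer, and p now points into the buffer t still
                        // owns -- invalidated with respect to s, even though
                        // nothing was actually modified.
    (void)c;
    (void)p;
}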

COW is still alive and well outside of the STL. In particular, I have found it very useful in cases where it is unreasonable for a user of my APIs to expect that there's some heavyweight object behind what appears to be a very lightweight object. I may wish to use COW in the background to ensure they never have to be concerned with such implementation details.
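
For illustration, a minimal single-threaded sketch of such a "COW in the background" handle (the cow class and its read()/write() split are hypothetical, not any particular library's API; note that use_count() is not a reliable uniqueness test under concurrent mutation):

#include <memory>
#include <utility>

// A cheap-to-copy handle hiding a potentially heavyweight T. Copies share the
// payload; the deep copy happens only when a shared handle is written to.
template <class T>
class cow
{
public:
    explicit cow(T value) : data_(std::make_shared<T>(std::move(value))) {}

    const T& read() const { return *data_; }    // never copies

    T& write()                                  // copies only if still shared
    {
        if (data_.use_count() > 1)
            data_ = std::make_shared<T>(*data_);
        return *data_;
    }

private:
    std::shared_ptr<T> data_;
};

Copying such a handle is one ref-count bump; the deep copy of the heavyweight payload is deferred until someone actually calls write() on a shared copy.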

Cort Ammon
  • Mutating the same string in multiple threads seems like a very bad design, regardless of whether you use iterators or the `[]` operator. So COW enables bad design - that doesn't sound like much of a benefit :) The point in the last paragraph seems valid, but I myself am not a great fan of implicit behavior - people tend to take it for granted, then have a hard time figuring out why code doesn't work as expected, and keep wondering until they think to check what is hidden behind the implicit behavior. – dtech Nov 27 '15 at 14:12
  • As for the point about `const_cast` - it seems like it can break COW just as easily as it can break passing by const reference. For example, `QString::constData()` returns a `const QChar *` - `const_cast` that and COW collapses: you will mutate the original object's data. – dtech Nov 27 '15 at 14:14
  • If you can return data from a COW, you must either detach before doing so, or return the data in a form which is still COW aware (a `char*` obviously is not aware). As for the implicit behavior, I think you're right, there are issues with it. API design is a constant balance between the two extremes. Too implicit, and people start relying on special behavior as though it was de facto part of the spec. Too explicit, and the API becomes too unwieldy as you expose too many underlying details that weren't really important, and are suddenly written into your API spec. – Cort Ammon Nov 27 '15 at 17:39
  • I believe the `string` classes got COW behavior because the compiler designers noticed that a large body of code was copying strings rather than using const references. If they added COW, they could optimize this case and make more people happy (and it was legal until C++11). I appreciate their position: while I always pass my strings by const reference, I hate seeing all that syntactic junk that just detracts from readability. I hate writing `const std::shared_ptr&` just to capture the correct semantics! – Cort Ammon Nov 27 '15 at 17:41
  • @CortAmmon It could also simply be the absence of move-semantics and (most?) mandatory copy-elision before C++11 which favored a design avoiding new allocations. – Deduplicator Feb 01 '23 at 06:28
5

For strings and such, it seems like it would pessimize more common use cases than not, as the common case for strings is often small strings, and there the overhead of COW would tend to far outweigh the cost of simply copying the small string. A small buffer optimization makes much more sense to me there, since it avoids the heap allocation in such cases rather than merely avoiding the string copies.

If you have a heftier object, however, like an android, and you wanted to copy it and just replace its cybernetic arm, COW seems quite reasonable as a way to keep a mutable syntax while avoiding the need to deep copy the entire android just to give the copy a unique arm. Making it just immutable as a persistent data structure at that point might be superior, but a "partial COW" applied on individual android parts seems reasonable for these cases.

In such a case the two copies of the android would share/instance the same torso, legs, feet, head, neck, shoulders, pelvis, etc. The only data which would be different between them and not shared is the arm which was made unique for the second android on overwriting its arm.
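
A rough sketch of that idea, with purely illustrative names: each heavy part lives behind its own shared pointer, so copying the android shares every part, and only a part that is actually written to gets cloned.

#include <memory>
#include <vector>

struct Arm   { std::vector<float> mesh; };   // heavy per-part data
struct Torso { std::vector<float> mesh; };

struct Android
{
    std::shared_ptr<Arm>   arm   = std::make_shared<Arm>();
    std::shared_ptr<Torso> torso = std::make_shared<Torso>();

    // "Partial COW": clone the arm only if it is still shared with another copy
    // (single-threaded sketch; use_count() is not a robust test under concurrency).
    Arm& mutableArm()
    {
        if (arm.use_count() > 1)
            arm = std::make_shared<Arm>(*arm);
        return *arm;
    }
};

// Android a;           // original
// Android b = a;       // shallow: b shares a's arm and torso
// b.mutableArm();      // b now owns a unique arm; the torso is still shared with a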

  • This is all good, but it doesn't demand COW, and it is still subject to a lot of harmful implicitness. Also, there is a downside to it - you may often want to do object instancing, and I don't mean type instancing, but copying an object as an instance, so that when you modify the source object, the copies are also updated. COW simply excludes that possibility, as any change to a "shared" object detaches it. – dtech Dec 11 '15 at 10:35
  • Correctness IMO should not be made "easy" through implicit behavior. A good example is const correctness, which is explicit and leaves no room for ambiguities or invisible side effects. Having something like this "easy" and automatic never builds up that extra level of understanding of how things work - understanding which is not only important to overall productivity, but which also pretty much eliminates the possibility of undesired behavior whose cause might be hard to pinpoint. Everything made possible implicitly with COW is easy to achieve explicitly as well, and it is clearer. – dtech Dec 11 '15 at 10:39
  • My question was motivated by a dilemma about whether or not to provide COW by default in the language I am working on. After weighing the pros and cons, I decided not to have it by default, but as a modifier that can be applied to both new and already existing types. Seems like the best of both worlds: you can still have the implicitness of COW when you are explicit about wanting it. – dtech Dec 11 '15 at 10:54
  • @ddriver What we have is something akin to a programming language with the nodal paradigm, except for simplicity the nodes kind of use value semantics and no reference-type semantics (maybe somewhat akin to `std::vector` before we had `emplace_back` and move semantics in C++11). But we're also basically using instancing. The node system may or may not modify the data. We have things like pass-through nodes which do nothing with the input but just output a copy (they're there for user organization of his program). In those cases, all the data is shallow copied for complex types... –  Dec 11 '15 at 12:39
  • @ddriver Our copy-on-write is effectively a *"make instance unique implicitly on change"* kind of copying process. It makes it impossible to modify the original. If object `A` is copied to object `B` and nothing is done to `B`, it's a cheap shallow copy for complex data types like meshes. Now if we modify `B`, the data we modify in `B` becomes unique through COW, but `A` is untouched (except for some atomic reference counts). –  Dec 11 '15 at 12:45
  • @ddriver Now for cases where you have a master object `A` and the node graph is outputting `Bs`, `Cs`, `Ds`, etc. -- a bunch of unique objects, we get that kind of ability to modify `A` and have all these update, but that cascade occurs through the nodal graph evaluation. We get a unique output if any of the inputs within the graph are modified. For example, if you have a node that loads a mesh which is fed into a deformer node, if you change the file name input associated with the mesh loader, a new mesh is loaded and fed to the deformer -- so changes propagate through the graph that way. –  Dec 11 '15 at 12:49
  • @ddriver Perhaps a simple analogy is like Photoshop adjustment layers. If you make changes to a base layer below the adjustment layers, the changes propagate and all the adjustments are applied on top of the adjusted image. But each adjustment layer must output a whole new image without affecting the original, to allow this kind of non-destructive editing. For that, to alleviate the burden of making the authors of adjustment layers have to worry about when to make images unique, we just copy them into the adjustment layer (cheap/shallow), and the adjustment layer can just change whatever (COW). –  Dec 11 '15 at 12:57
  • @ddriver To make that efficient without transferring the complexity to the adjustment layer writers, we effectively make it so only portions of the image which are modified by the adjustment layer are copied (a huge image is split into small tiles, and each tile is only deep copied if the portion of the image it represents is modified by the adjustment layer through a COW process, e.g.). That's how we do it for huge images which are modified non-destructively (not sure how PS does it). –  Dec 11 '15 at 13:01
  • @ddriver The first thing we wanted there, using this adjustment layer analogy, is to have a firm guarantee that adjustment layers will not touch their inputs (the layers below). That's required to safely implement non-destructive editing, it should never touch the original input, only output a new modified image. So we copy the images into the adjustment layer heavy-handedly to force this guarantee. The next step was making the image copying a cheap shallow copy unless/until the image is actually modified by something, at which point COW came to the rescue. Hope that makes sense! –  Dec 11 '15 at 13:14
  • @ddriver One last thing (hope I'm not spamming too much!) is that, as you rightly pointed out, we cannot do *destructive* instancing with this system. If we have a bunch of copies of `A` in memory and want to modify `A` and have the changes cascade to all the copies, we can't -- `A` would become unique. But our software is designed to be purely non-destructive, so we don't rely on the copying process to do instancing. Instead we do non-destructive instancing through this graph -- if we want changes to A to propagate, we just modify the node that outputs A and the whole graph is affected. –  Dec 11 '15 at 13:21
  • @ddriver The `B's` and `C's` and `D's` and so forth which input `A` are then regenerated when `A` changes, and how much work they each have to do to generate all these outputs is proportional to how much they change in their copied inputs due to COW. COW is there so that we never make instanced data more unique than it needs to be without transferring an optimization burden to the authors of nodes. –  Dec 11 '15 at 13:24
  • All this you talk about doesn't really require COW - have that object referenced as a const reference. Then depending on whether that proxy object (adjustment layer) does anything or not, you can either directly relay the data from the const reference, or allocate a result buffer and compute the result. This is a better approach to doing something like that than using COW. If an adjustment layer used COW, it would break its connection to the original resource, so you could only modify the copied resource, and you could not effectively alter the adjustment layer once it had modified the data. – dtech Dec 11 '15 at 13:27
  • @ddriver *"it would break its connection to the original resource"* that's the desired effect in all cases for us. An adjustment layer which modifies its underlying image would be a bug in our case, because it would no longer be a non-destructive adjustment layer -- it would be causing side effects in the underlying layer. At least with the way we approached it, we wanted to make this kind of bug impossible -- the adjustment layer has to break its connection to the input image always, and output a whole new one. The next step was making that cheap. –  Dec 11 '15 at 13:29
  • @ddriver But it does make it so we do have an API where we have to have like a `commit` mentality to it. `begin modifying this portion of the rectangle of the image (get back pixel proxy), change image, commit.` It has to all be done in a transactional mindset. But the transactions also help us with the undo system outside the node graph where we do destructive editing, so it becomes kind of a burden that was necessary anyway. We also lean on that transactional mindset for exception safety/error recovery to do clean rollbacks. –  Dec 11 '15 at 13:31
  • "that's the desired effect in all cases for us" it may be for you, but it does not apply in your adjustment layer analogy. It would be a bad design if an adjustment layer ever detaches from its source layer, it is even wrong to semantically use COW to implement that, since the adjustment layer is not a shallow copy but a modifier to the source layer. All in all, the gains from using COW are negligible, and not worth the loss of control and flexibility. It won't take much to achieve the same thing explicitly and retain the control and flexibility. – dtech Dec 11 '15 at 13:33
  • @ddriver There is never a case where an adjustment layer should ever modify the original image. Imagine stacking a brightness/contrast layer on top of a digital painting you made. It should never tamper with your digital painting, only output a whole new image. There is no tie to the original that's interesting from the adjustment layer perspective except immutable source -- const reference as you say. But then a const reference would imply that we *always* make a copy anyway to output something new. –  Dec 11 '15 at 13:36
  • @ddriver To reduce the burden of having to manually copy the const reference and then only make as much unique data in the process of copying as the amount of data modified, we use COW. Especially since we allow things like making only *portions* of an image unique if only a portion of the image is modified, that would typically transfer a heavy optimization burden to the author of the adjustment layer. COW makes it so we absorb that burden in one central place, in the image class itself. –  Dec 11 '15 at 13:37
  • @ddriver Basically our COW process is a lot more complex than just copying the entire image, e.g. It's a transactional process which only modifies image tiles for portions of the image that changed. So it's not like it's saving 1 line of code in hundreds of adjustment layers, so to speak (otherwise we wouldn't bother). It's saving hundreds of lines of code in each adjustment layer to make the image class responsible for only making data unique that is actually modified by the adjustment layer. COW is used to absorb that complexity in one central place instead of for each adjustment layer. –  Dec 11 '15 at 13:45
  • "There is never a case where an adjustment layer should ever modify the original image" - whenever did I state otherwise? I said the relation between an adjustment layer and its source should be explicitly non-mutable. COW doesn't really save you anything, as in BOTH cases you'd pass a reference, in the explicit scenario it will serve as a const reference to the source, in the COW scenario it will serve as a source for the initial shallow copy. Sorry but your claims of "hundreds of lines of code" are unrealistic, code can be reused whether you use COW or an explicit approach. – dtech Dec 11 '15 at 13:46
  • I understand your motivation, I just don't think such use case justifies or even merits COW - the benefits are minimal and you lose fine grained control, which is a significant cost. You still can use COW to achieve it, it is just not the optimal solution. – dtech Dec 11 '15 at 13:50
  • @ddriver Ah yes, sorry, I just wanted to make sure that part was clear. So this is going to be hard to explain without getting into a lot of subtle details, but in our case the COW process is done for a complex "cyborg", if you will. If the non-destructive node (adjustment layer or whatever) only touches the cyborg's arm, only his arm is made unique. If it only touches his legs and head, only those are made unique. The difference between COW and no COW from a deep copying standpoint is that deep copying has to be a bit more explicit otherwise. –  Dec 11 '15 at 13:50
  • @ddriver In our case, just given the way we've set it all up, it would be hard to write a non-destructive node that has to explicitly say, "Make this arm and head unique because I'm going to modify it." We ended up with easier code to write by saying, "We're going to modify the arm and head -- please system, do your thing to make those unique, I don't want to think about it." –  Dec 11 '15 at 13:51
  • That's what I've been saying - with COW you lose the option to chose, as it makes the choices for you, and every non-const operation would detach from the resource and duplicate it. What if you want to modify the original resource? What if you want only particular mutable operations to modify the original source? Sure, your usage scenario might exclude those possibilities to begin with, but those are nice things to have, and can be very useful. With COW you have to give up on that - you have no choice, and in my book "no choice" goes in the "bad things" category. – dtech Dec 11 '15 at 13:55
  • @ddriver I see, yes -- in our case we had a particular scope where that part of the requirement (that this is what we always need to do) for the way we designed the app was firm/stable -- something we could wholeheartedly commit to. Our difficulty/struggle was making every node easy enough to write and efficient enough to evaluate and reevaluate repeatedly in real-time given those firm requirements. –  Dec 11 '15 at 13:56
  • @ddriver It might have been a failing on our part, but given how this process of making only the parts of the data that were touched in a very complex input (a mesh, a tiled image, etc) unique was pretty elaborate, we couldn't find a way to make that easy without making the decision of when to make data unique an implicit part of the class's central implementation. It was too hard given the alternatives we could see otherwise (and explored) to make that a responsibility in each and every node. –  Dec 11 '15 at 13:58
  • @ddriver Ease and safety of writing each node was our number one factor, since that's what the team was doing daily, making and maintaining a lot of nodes. In that case, when a node is handed a complex aggregate input like a cyborg/android, we wanted to make it so we can just let them tamper with the head and arms, e.g., but not have to worry about making those unique, and not have to worry about shallow copying the torso, legs, feet, hands, etc. What is shallow copied and what is made unique is done automatically based on what they change. –  Dec 11 '15 at 14:01
  • It may seem like a great thing being able to do things without giving it too much thought to what you are actually doing, but in lots of cases this is a recipe for disaster. I'd rather make those decisions manually, this is the only safe way to make sure you always get the correct behavior, and a little extra thought has never hurt anyone, actually it is a very useful mind exercise. – dtech Dec 11 '15 at 14:02
  • @ddriver The way we're tackling it is, put simply, `android modify_arms_and_head(const android& a);` Now implement the function where you modify only the arms and head of `a`, shallow copy the hands, feet, torso, legs, feet, eyeballs, brain, heart, etc. but make it as easy to write as possible. And return it. All untouched data should be shallow copied, only touched data should be made unique. –  Dec 11 '15 at 14:05
  • @ddriver Now write similar code 1000 times over, but some which only modify eyeballs and toes, some which modify only legs and heart, etc. If you have any solution there which can achieve anything close to the minimum amount of logic required without COW where deep/shallow copying is explicit instead of implicit, that would be awesome and we might explore that in the future. –  Dec 11 '15 at 14:06
  • So if I want an android with 1000 modifications, it ends up creating 1002 androids, just to have 1000 of them thrown away? That seems like a lot of overhead. I would not do that. I don't see why a modifier should return an object, a modifier is only applied to an object, whether it is the original or a copy - it is best to be explicit about it. The modifier can implement a variety of ways to achieve non-destructive editing, many of which more efficient than a full deep copy. As I said before - COW is a rather lazy, inflexible and inefficient way to go. It works, but it is not optimal. – dtech Dec 11 '15 at 14:14
  • @ddriver I put the best analogy I could into the updated answer! –  Dec 11 '15 at 14:24
  • I see, so you have a lot of nested COWs. However I don't think your attempt to illustrate on code saving is a realistic one. You could have just `Android new_android(a);` and only replace the needed components explicitly, you can still have shallow copy and resource sharing implicitly. Suddenly the difference in code is minimal, and it serves well to indicate the actual user intent - which is important. Because let's be honest, it is not the most intuitive thing when you have such behavior implicit - you drink from a glass and expect it to be empty... but somehow it is still full... why? – dtech Dec 11 '15 at 14:35
  • @ddriver Yeah, we have a very specific use case for this analogical Android. We personally found it so much easier to just treat this node graph's data as immutable, and transfer the burdens of deciding what to shallow copy or not into the class. A single input to a node can be a complex non-homogeneous aggregate like this `Android` here. That said, I'd love to find another way to do this that could alleviate the burden. You're right in that it does limit some of the flexibility of this `Android` class to assume, in its internal implementation, what parts of it to shallow copy and... –  Dec 11 '15 at 14:40
  • @ddriver ... what parts to make unique on change requests. It's just the most productive way we've found to do this so far! –  Dec 11 '15 at 14:41
  • Let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/32855/discussion-between-ike-and-ddriver). –  Dec 11 '15 at 14:42
3

The point is that COW has zero cost when neither copy is changed, and little additional cost when one copy is changed.

Swift uses COW for strings, arrays and dictionaries. Each of these is implemented as a struct holding a pointer to a data object, and passing one as a parameter or assigning it copies the tiny struct and increments the reference count of the data object. Then, if either the original or the copy is modified, its data object is copied first: the modified one ends up with its own data object with a reference count of 1, and the reference count of the previously shared data object is decreased.

In addition, there is a type for substrings which shares its data with the original string and stores the bounds of the substring, so "substring starting at index 3" doesn't allocate new data. The [] operator either returns a character by value or replaces one, so merely reading a character doesn't trigger copying.

And strings are often large. It’s not unusual to read a multi-megabyte file into a string.

gnasher729
  • Atomic ref counting is not zero cost. – dtech Jan 27 '23 at 16:14
  • Dtech, it is cheap when you know what you are doing. And when the interesting and slow case only happens _once_. – gnasher729 Jan 28 '23 at 09:16
  • The cost is the same whether or not you know what you are doing. I corrected you that it is not zero cost; there's no need to reinforce your false statement with vague insinuations. – dtech Jan 28 '23 at 13:58
  • “You” is whoever implements copy-on-write, in this case the authors of the Swift standard library. They know what they are doing. Atomic operations are very fast if the counter is never seen by another thread which is most typical. And since almost all iOS software uses COW extensively, you can argue with Apple if you like. – gnasher729 Jan 28 '23 at 16:39
  • I think the correct thing to say is "COW is zero cost on a system that's already using reference-counting anyway". E.g. adding CoW containers to CPython would be zero-cost. They already have ref counting anyway. – Alexander Feb 01 '23 at 00:07
1

Optimizing Away the Need To Perform Expensive Deep Copies

It can be very useful for multithreading, but not in the type of example you provided. For small strings it would be much more efficient to avoid any ref counting/GC and just deep copy, ideally with a small buffer optimization, as I'm sure you realize.

But consider a game example where you have systems that want to operate as a parallel pipeline - a physics system, AI system, rendering system, etc., each taking a game scene as input and producing a modified output - so that the physics system can be working on frame 2 while the rendering system is still rendering frame 1. Very few game engines avoid a serial pattern here across systems (they might multithread the work done within a system, as with parallel loops, but they don't achieve a parallel pipeline across systems)[*].

  • [*] Most gamedevs I've talked to on the game development section of Stack Exchange, including very high-profile ones, consider it more trouble than it's worth to even double-buffer game state to allow just two systems to run in parallel, not to mention triple-buffering to allow three, quadruple-buffering for four, and so forth. But they might not have considered copy-on-write data structures, which can trivialize the effort.

Massive, Shared, Mutable Data

Allowing these systems to operate in parallel means that they cannot share mutable data without locks. They can have their own thread-local copies of unshared mutable data, or shared immutable data, but they cannot share mutable data without locking and bottlenecking other threads on access. Avoiding either the sharing or the mutability (we only need to avoid one of the two to keep threads from bottlenecking each other) may or may not be tricky, depending on the game engine.

With basic game engines that deal mostly with unchanging scene data, we can probably easily and cleanly separate the immutable scene data that can be freely shared without locking from the small subset that is mutable which can be locally deep copied for each thread to avoid the sharing (and as John Carmack pointed out, this may only have to be some megabytes for many games, not hundreds of megabytes or gigabytes). The design constraints allow that for most orthodox game engines which don't even have the possibility of mutating much scene data per frame.

For example, most game engines don't even offer the ability to freely mutate hefty character models for bone deformations or facial animations in any arbitrary frame while the game is running (only their animation parameters, like bone matrices). Instead they generate the deformed version on the fly in a vertex shader and so the bulk of the hefty game data is immutable and easy to separate from the mutable given the heavy engine-imposed restrictions on what's allowed to change per frame.

Designs That Cannot Anticipate What Will Mutate

Yet consider a very innovative game, doing things quite differently from orthodox AAA engines, that uses a freely destructible voxel environment with voxels much smaller than Minecraft's, maybe even close to pixel resolution or smaller at normal viewing distances. The sheer amount of environment destructibility means that almost all the hefty data of the scene is mutable and can be changed by user input at any given time. Here, even generating the results in a shader would still require treating enormous amounts of data as mutable per frame, as the input parameters are no longer simple matrices or vectors or scalars affecting things at the whole-model level, but parameters affecting things at the individual-voxel level, with billions of voxels.

That's going to involve a scene that might easily require hundreds of megabytes to gigabytes of data that could be mutated at any given time (we cannot possibly anticipate what might change in a frame given such user freedom) per deep copy of the scene, even with a very efficient sparse voxel octree that can compress voxels down to less than a byte in size. What would normally have to be treated as inevitably shared and mutable data is enormous in this case, and eliminating the sharing of the mutable data via thread-local deep copies might require deep copying that enormous amount of data in close to its entirety per thread per frame (which might easily take more time just copying than the thread needs to do its actual work, not to mention the explosive memory use).

Automating Away the Sharing of Mutable Data With COW

Copy-on-write comes to the rescue here, where we cannot possibly anticipate in advance which parts of this enormous scene will be mutated: the parts that have not been modified remain shared, immutable shallow copies. If one thread - say the physics thread working on frame 4 while the AI system works on frame 3 and the rendering system works on frame 2 - wants to modify a small section of the scene, only that small section is deep-copied on write. The other threads can keep churning away and doing their thing, while regular copying stays relatively dirt cheap and shallow (at least for data that spans hundreds of megabytes or more).

Writing/mutation becomes a bit more expensive as a result of the atomic ref counting or GC, but very often, at least in the types of scenarios I deal with, a thread might only need to modify a megabyte's worth of data while the scene spans an entire gigabyte. That's more than a worthwhile exchange - a fantastic bargain, in fact: the relatively trivial expense of some atomic operations and small, partial deep copies of the smallest subset of the scene replaces what would otherwise be hundreds of megabytes to gigabytes of data deep copied per frame per thread.
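
As a hedged sketch of that idea (all names are illustrative, and this ignores how per-frame snapshots are actually published between pipeline stages): the scene is split into chunks behind shared pointers, so handing a Scene to each system each frame copies only pointers, and a system that edits a chunk deep-copies just that chunk.

#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <vector>

struct Chunk { std::vector<std::uint8_t> voxels; };   // one small region of the world

struct Scene
{
    std::vector<std::shared_ptr<const Chunk>> chunks;

    // Copy-on-write edit: clone one chunk, mutate the clone, swap it in.
    // Every other chunk stays shared with the other systems' snapshots.
    void editChunk(std::size_t i, const std::function<void(Chunk&)>& edit)
    {
        auto fresh = std::make_shared<Chunk>(*chunks[i]);  // deep copy one chunk only
        edit(*fresh);
        chunks[i] = std::move(fresh);
    }
};

// Per frame, something like (applyExplosion being a stand-in for real work):
//   Scene physicsView = masterScene;        // cheap: copies pointers only
//   physicsView.editChunk(42, applyExplosion);
//   // snapshots held by the render/AI stages keep seeing their own, older data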

Conclusion

So apologies for the long-windedness, but this is at least one use case where I think copy-on-write is both the most efficient and the most elegant solution: cases where the shared mutable data is too massive to be deep copied left and right for each thread every single frame, but only a subset of it is actually modified per thread per frame (in ways that are impossible for designers to anticipate in advance).

A Note

It's also worth noting that all persistent data structures in functional languages use copy-on-write behind the scenes. That's how a PDS is implemented. They may not expose the mutable interface to the users in ways we might want to do in an imperative language like C++, but under the hood it's always COW. So fans of languages like Haskell or Clojure are at least using COW all over the place under the hood at the implementation level, even if they're not exposed to it and only dealing with conceptually read-only interfaces to immutable data structures.

Anti Gamer