Copy-On-Write 101 – Part 2: Putting It to Use

In my last post I gave a quick introduction to copy-on-write data structures; make sure to read that first. It all started with a vague assertion on how copy-on-write is supposed to be a great advance for concurrent programming. I was met with a lot of blank stares, and possibly embarrassed silence. So, instead of merely waving you off, dear reader, as ignorant, I’ll try to educate you on the marvels of copy-on-write (COW) data structures.

In this part I’ll show some areas where COW data structures have proven to be useful, and talk about some practical considerations when implementing COW data structures in real life.

Some applications

An inductor (can be used together with a capacitor to match impedance).

Undo

Often your programs allow the users to manipulate some data: text documents, configuration settings, diagrams, vegetarian pizza menus… any number of things. If the manipulated data is stored in a COW data structure, implementing unlimited undo functionality becomes almost free.

All you need to do is put the different versions of the data on a list. Structure sharing makes sure you don’t waste too much memory for the old versions. Accessing the old versions takes no extra effort, they’re just right there.

Optimization

This appears to be what most people think copy-on-write is for. It is correct that COW can be a useful optimization strategy, but it is by no means the only useful application.

Anyway, as an example, say you have a long string and you want to append something to the string. In many programming languages strings cannot be modified like that efficiently; the space for the string has been preallocated and extending the string may require you to allocate a new larger memory area and move the data there. A COW string typically allows for cutting and pasting pieces to anywhere in the string without having to copy everything.

Software transactional memory

This is a big one. Software transactional memory is like your own private in-memory database. If you haven’t heard of it before, I strongly suggest you to find out soon. Software transactional memory provides everything an ACID database would without the durability part, since you’re dealing with data structures in RAM after all. A big advantage is zero impedance mismatch, among other things. COW data structures are the natural way to implement software transactional memory.

Practical considerations

Automatic memory management (e.g. garbage collection) is a must-have to conveniently deal with COW data structures. In fact, I would make the same argument for almost any kind of programming, but it’s doubly true for COW.

Most of the data structures you’ll be dealing with are going to be some type of directed acyclic graphs built of nodes linked to other nodes, plus possibly some of the “real” data you are storing along with the graph structure nodes. You need to allocate the nodes separately and, because this is COW, you’ll be allocating all the time. Allocating cells from a garbage collected heap is typically extremely fast, so it’s the perfect way to manage memory in the COW world.

phenom

Constant factors

Often, you want to use a COW data structure because you need to keep some old versions around for a while. But you don’t necessarily want to hang on to every version, so you may wonder if the performance hit of a COW data structure is getting too big. In most cases, I wouldn’t worry about it very much.

To update one node in a mutable tree, you start from the root node, then make your way to the node you’re looking for. Once you’ve found that node, you just change it, in-place. In a COW tree, you’ll need to trace back to to the root node and make copies of each node on the way back.

So the tree is traversed twice, once both ways, and that will take about twice the time, right? It turns out that this is not the case on modern computers. When it’s time to copy the path, the data you need is almost certainly already in the L1 cache so revisiting the nodes again takes almost no time at all.

Also, since we’re allocating memory for the new nodes from a garbage collected heap, allocating memory is really fast so that isn’t really going to make a big hit either. For these reasons, the constant factor slowdown for a COW tree vs. a mutable tree can be quite small indeed.

Asymptotic complexity

It is a fact that there are problems which take linear time in an imperative programming language but suffer an unavoidable O(log n) slowdown in a purely functional (and strict) language. These cases seem to be rare in practice, though. There are algorithms even for efficient copy-on-write arrays.

Next: Copy-on-write 101 – Part 3: The sum of its parts — how COW makes concurrent programming possible without your brain turning into pineapple custard.

If you liked this, click here to receive new posts in a reader.
You should also follow me on Twitter here.

Comments on this entry are closed.

{ 3 comments }

Blaisorblade May 3, 2010 at 20:22: OK, I’ve been thinking about using copy-on-write for a shared dictionary (which has probably infrequent updates). My first guess was to use a multi-level hash table, then someone mentioned me Clojure and its log_32(n) hash table trees. Can you suggest me what’s the best algorithm to use?
My requirements are:
- O(1) reads (in practice)
- O(1) changes to existing mappings.
- fast structural updates, especially before the moment the structure is shared
- low constant factors (this is maybe even more important than scalability; I may be wrong, but I don’t expect more than 1000 mappings to be present).
- multithreaded and lockless (well, memory barriers are needed, even if you don’t mention them)
Persistence is not a functional requirement, just an implementation detail of this idea.
Then, I was looking also at Java’s ConcurrentHashMap (which is just partially COW under the hood, but not fully) and Cliff’s Click NonblockingConcurrentHashMap (http://sourceforge.net/projects/high-scale-lib, http://www.azulsystems.com/events/javaone_2008/2008_CodingNonBlock.pdf).
What’s your suggestion?
Ville Laurikari May 5, 2010 at 22:55: Phew, that’s a lot of requirements.
O(1) reads and O(1) updates already rule out COW structures, in theory. COW structures are necessarily trees, and the best you can do is O(log N). In practice, though, especially as you expect to only keep about a 1000 mappings, you can use a tree with a very high branching factor (branching by 10 you get to leaf nodes in 3 hops, branching by 32 with 2 hops).
For only a 1000 mappings, constant factors will definitely be your most important concern.
Multithreaded and lockless spells STM to me. That means no blocking, but updates may fail and you need to retry.
But really, “best” can mean many different things. If absolute speed is what you want, your best bet may be to implement your own hash table tree. You can hand tune it for your particular application and key space, and possibly take some shortcuts that a generic data structure cannot.
If you want to use existing libraries, why not just try a handful and benchmark them all? I’d be interested to see the results :)
vinay October 28, 2011 at 09:21: Post 1 and post 2 related to copy on write game me clear cut view of copy on write data structure !! Thanks !!