The Tesseract Project A blog of temporary obsessions

Lamport - A Fast Mutual Exclusion Algorithm

I recently came across (again) a copy of Lelie Lamport's 1987 paper, "F Fast Mutual Exclusion Algorithm"1. It's a gem of a paper.

What do you do if more than one program on a computer wants to use the same resource? That's the problem of mutual exclusion. With a single processor, it's easy: when a program needs the exclusive access to the resource, interrupts can be turned off so that it can finish the critical section of its work. Then another program can use it.

If you're building a system with multiple processors, it's much harder. The problem was first formulated, along with a solution, by Edsger Dijkstra in 19652. In practice, multiprocessor machinese were typically built with an atomic test-and-set instruction, which also makes it pretty simple. In the 70s, as experimentation with multiprocessor systems evolved, there was a fair bit of research on things like mutual exclusion with things like message passing, since shared memory wasn't necessarily available.

Lamport joined Digital Equipment Corporation's Systems Research Center (SRC) in Palo Alto in 1985. Not too long after he got there, some of the researchers at DEC's Western Research Lab (WRL), also in Palo Alto3, were designing a new multiprocessor system (presumably Multi-Titan), and they were wondering if there was a way to avoid adding synchronization instructions to the CPU in single processor system (Titan) they had already designed. Earlier work on mutual exclusion had paid a lot of attention to avoiding starvation—the idea that the critical resource might be so busy that a process would never get access to it unless the algorithm guaranteed it. Based on experience with multiprocessor systems, the WRL team expected that contention for critical sections was actually rare in practice, so maybe there was a way to do it fast enough using shared memory without special CPU instructions.

Software developers often think of theoretical computer science theory as being fairly removed from what they do every day. In fact, there's an enormous amount, mostly invisible, of the software world built on decades of work in theory. And sometimes, you encounter a "real world" problem that actually raises an interesting question that can be answered with a theoretical approach.

This was one of those times.

So Lamport proceeds to figure it out, finding the lower bound on the number of memory operation required and an optimal algorithm to do it. In the paper, he walks through a process of deriving the algorithm from the constraints of the problem, which gives a clear understanding of each step in the algorithm, why it's there, and why you can't get rid of it. The paper goes on to give formal proofs of the algorithm's correctness and that it won't deadlock.

Lamport's papers are usually very clear and, for the material, easy to follow. This includes the proofs, although not every reader is interesting in going through the level of detail that proofs require. The description of deriving the algorithm is a polished diamond of presentation.

In the end, the algorithm wouldn't provide the performance the WRL team wanted, and they ended up adding the instruction. But they were able to do that with the knowledge that they didn't really have the choice—there wouldn't be any regrets or wondering what they might have done if they had just been smarter—and they could clearly justify the extra cost in the project.

Leslie Lamport won the Turing Award in 2013, and is also well-known for creating LaTeX. He also keeps a nice list of his publications with some commentary about them.4

As a side note, this is an interesting example of the challenge of assigning dates to papers. It was originally dated November 14, 1985, revised October 31, 1986, published in TOCS in February, 1987, and, judging by the copyright date, issued as a technical report by Digital Equipment Corporation in 1988.


1

Leslie Lamport, A Fast Mutual Exclusion Algorithm, ACM Transactions on Computer Systems 5(1), February 1987, pp. 1-11. Technical report version (revised 1986) at https://lamport.azurewebsites.net/pubs/fast-mutex.pdf

2

E. W. Dijkstra. 1965. Solution of a problem in concurrent programming control. Communications of the ACM 8, 9 (September 1965), DOI=http://dx.doi.org/10.1145/365559.365617 10.1145/365559.365617

3

About two blocks away. Why were there two research labs there at the same time? That's another story.