Understanding Mutexes
2019-10-22
Those of us familiar with concurrency and parallelism know just how useful a mutex can be when protecting access to resources. But how do they work?
Here is a basic example, in Go, of a struct with a count
field and a method that increments this field when called:
package main
import "sync"
type protected struct {
count int
mtx sync.Mutex
}
func (p *protected) inc() {
p.mtx.Lock()
p.count++
p.mtx.Unlock()
}
As you can see, we lock the mutex, increment the counter, then unlock the mutex before returning. But at no point do we associate the mutex with a specific field. So how exactly does it protect access to p.count
?
While many other languages have similar implementations, for the purposes of this post I'm going to refer to a mutex in Go.
Lock/Unlock
Under the hood, when Lock()
is called, the mutex will attempt to perform an atomic compare-and-swap operation on an unexported int32
field. As the default "unlocked" value of this field is 0, if successful, this field will now equal 1 and we can consider the mutex to be locked. Similarly, Unlock()
attempts to set the field back to 0.
So what happens if we attempt to lock a mutex that is already locked? This depends on the current mode of the mutex, which is one of either normal or starvation.
Normal
In normal mode, calls to Lock/Unlock are queued in FIFO order. Each call will continually attempt to operate the mutex until eventually failing if it has been trying for more than 1ms - at which point it enters starvation mode.
Each outstanding call will compete with other calls for ownership of the mutex. It is quite common for newer calls to succeed first as they are already running on the CPU.
Starvation
In starvation mode, ownership of the mutex is handed off to the caller waiting at the front of the queue. Newer callers don't try to acquire the mutex even if it appears to be unlocked and instead of continually attempting to operate the mutex, they will queue themselves at the end of the queue.
If a caller sees that it is last in the queue, or it has waited for less time than the 1ms timeout, the mutex is set back to normal mode.
Considerations
- Attempting to unlock an unlocked mutex will result in a panic.
- It is possible to bypass a mutex if you have direct access to the resource memory, e.g., a pointer.
- Don't hold a mutex while performing long running tasks, e.g., IO-based operations.
- In some cases is can be more efficient to use a channel to protect access to shared resources.