two phase locking is an old method of ensuring serializable transactions using locks.
a common issue with non-serializable isolation (which is typically what most rdmses support…) in the fact of concurrent writes is the lost update problem.
here’s an example of a lost update, lets assume:
- a a record in a database with column
x
that has a value of 5 - there are two independent processes that will perform a read-modify-write on that record
- the two processes read the current value at the same time. assuming readers don’t block readers, they’ll read the same value
- process A will attempt to read the current value of 5 and increment it by 2
- process B will attempt to read the current value of 5 and increment it by 5
- process A writes first and the new value of x is 7
- process B writes last and the new value of x is 10. The update from process A is considered lost / as if it never happened!
some databases will detect a write conflict like this and raise an error and others may just proceed without a hiccup. In either case, there’s a potential conflict that may need to be resolved depending on the expectations of a client. If this is a bank account, a lost update is a huge problem. If this is a view counter, maybe it’s not as big of a problem.
(conservative) two phase locking
there’s two types of locks in two phase locking (2PL): shared locks and exclusive locks. exclusive locks block both reads and writes. only shared locks can be shared. In snapshot isolation levels, exclusive locks from writers only block other writers – readers can still read records that are exclusively locked. with serializable, writers can block readers.
the reason 2PL is able to guarantee serializability is by splitting it locking behavior into two phases – one before actual query execution and the other after.
prior to actually executing the query, the query planner figures out what locks it needs to obtain. If a transaction only does a read, only a shared lock is acquired for that record. If, however, a transaction does a read of a record followed by a write to the same record, the shared lock is “upgraded” to an exclusive lock.
once all the locks are obtained (lock point), the locks are held until execution is complete. If you have two concurrent transactions, the first one with an exclusive lock to reach the lock point will block the other. This is what ensures serial execution.
and finally, the reason this is sometimes referred to as conservative 2PL is that it obtains all the locks upfront regardless of whether or not it needs to.
Leave a Reply