DDIA · Ch.8 · 26 min read

Transactions: The Safety Net That Makes Concurrent Databases Sane

Ch.8 of DDIA 2nd ed. is the definitive treatment of transaction isolation: the full taxonomy of anomalies (dirty reads/writes, lost updates, read skew, write skew, phantoms), the four isolation levels, MVCC internals, and the three mechanisms for achieving serializability (serial execution, 2PL, SSI). This distillation covers all the key concepts and tradeoffs without the textbook scaffolding.

What ACID Actually Means

ACID is the marketing term for what transactions guarantee, but the definitions are slipperier than they sound. The letter that matters most, and that varies the most between databases, is I (Isolation).

Atomicity

A transaction either fully commits (all writes visible) or fully aborts (no writes visible). If the database crashes mid-transaction, it rolls back any partial writes on recovery. The name is unfortunate: "abortability" would be more accurate.

Consistency

The database starts in a valid state and ends in a valid state. This is actually a property of the application: the application must define what "valid" means using constraints and invariants. Consistency is not really a database guarantee.

Isolation

Concurrently executing transactions are isolated from each other, as if they executed serially. The actual isolation level varies significantly (see table below). This is the dimension that really matters for concurrent systems.

Durability

Once a transaction commits, its data is safe even if the database crashes immediately after. Achieved via write-ahead logging, fsync to disk, or replication to multiple nodes. Durability is a spectrum: more replication means more durability.

The "C" Problem: When a distributed database vendor says "we provide ACID transactions," ask specifically which isolation level they provide. Read committed is the default in PostgreSQL, Oracle, and SQL Server; it is nowhere near serializable. The ACID label without a specified isolation level is nearly meaningless.

The Four Isolation Levels

The SQL standard defines four isolation levels. Most databases implement read committed and snapshot isolation; true serializable isolation is rare and usually requires opting in.

| Isolation Level | Guarantee | Prevents | Allows | Used In |
| --- | --- | --- | --- | --- |
| Read Uncommitted | No protection: transactions can read uncommitted writes from other transactions | Nothing | Dirty reads, dirty writes, lost updates, read skew, write skew, phantoms | Rarely used; some legacy systems for read performance |
| Read Committed | Only read committed data; only overwrite committed data | Dirty reads, dirty writes | Lost updates, read skew (non-repeatable reads), write skew, phantoms | Default in PostgreSQL, Oracle, SQL Server, DB2 |
| Snapshot Isolation (Repeatable Read) | Each transaction sees a consistent snapshot of the database as of its start time (MVCC) | Dirty reads, dirty writes, read skew | Lost updates (some DBs detect), write skew, phantoms in read-write transactions | PostgreSQL (as "repeatable read"), Oracle (as "serializable"), SQL Server, MySQL InnoDB |
| Serializable | All transactions appear to have executed one at a time in some serial order | All anomalies: dirty reads, dirty writes, lost updates, read skew, write skew, phantoms | Nothing (strongest isolation level) | PostgreSQL (SSI), MySQL InnoDB (2PL), VoltDB/H-Store (serial execution), IBM Db2 |

Naming Chaos: PostgreSQL calls snapshot isolation "repeatable read." Oracle calls snapshot isolation "serializable." MySQL's "repeatable read" is a weaker MVCC variant that doesn't detect lost updates. IBM Db2 uses "repeatable read" to mean serializability. The SQL standard's definition of isolation levels is widely considered flawed and ambiguous; as a result, nobody really knows what "repeatable read" means.

Read Committed: The Pragmatic Default

Read committed makes two guarantees: you will only read data that has been committed (no dirty reads), and you will only overwrite data that has been committed (no dirty writes). It is the default isolation level in PostgreSQL, Oracle, SQL Server, and DB2.

Dirty writes are prevented by holding a row-level lock when a transaction modifies a row: other transactions trying to modify the same row must wait until the first transaction commits or aborts.

Dirty reads are prevented without a lock: databases simply maintain two versions of each written row, the old committed value and the new in-progress value. Read requests see the old value until the writing transaction commits, then switch to the new value. This avoids blocking readers entirely.
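A minimal Python sketch of this two-version trick. All names here are illustrative, not any real engine's internals:

```python
# Toy model of read committed: readers see the last committed value; an
# in-progress writer's value stays private until commit; a second writer
# is refused instead of queued (a real engine would make it wait).

class Row:
    def __init__(self, value):
        self.committed = value    # value visible to other transactions
        self.uncommitted = None   # in-progress write, if any
        self.writer = None        # txn id holding the row lock

    def read(self, txn):
        # A transaction sees its own uncommitted write; everyone else
        # sees the old committed value (no dirty reads).
        return self.uncommitted if self.writer == txn else self.committed

    def write(self, txn, value):
        # Row-level lock prevents dirty writes.
        if self.writer is not None and self.writer != txn:
            raise RuntimeError("row locked by another transaction")
        self.writer, self.uncommitted = txn, value

    def commit(self, txn):
        if self.writer == txn:
            self.committed = self.uncommitted
            self.uncommitted, self.writer = None, None

row = Row("available")
row.write("T1", "sold")
print(row.read("T2"))  # "available": T2 sees no dirty read
row.commit("T1")
print(row.read("T2"))  # "sold"
```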

Snapshot Isolation and MVCC

Read committed still allows read skew: if you read two rows in the same transaction, a concurrent transaction may commit between those reads, making the two rows reflect different points in time. This is harmless for most queries but disastrous for long-running operations like backups (which need a consistent snapshot) or analytics queries (which compute aggregates across many rows).

Snapshot isolation solves this: each transaction reads from a consistent snapshot taken at the start of the transaction. It will never see changes made by concurrently running transactions. The implementation is Multi-Version Concurrency Control (MVCC).

How MVCC Works

Every row in the table has two hidden fields: inserted_by (the transaction ID that created this version) and deleted_by (the transaction ID that deleted or overwrote it; initially null).

An update is internally implemented as a delete-and-insert: the old row version is marked deleted_by = [updating transaction], and a new row version is inserted with the updated value. Multiple versions of the same row coexist in the table as a linked list.

A background garbage collection process periodically removes row versions that are no longer visible to any active transaction.

MVCC Visibility Rules

At the start of each transaction, the database snapshots the set of in-progress transactions. A row version is visible to this transaction if and only if:

1. Writes by transactions that were in-progress (not yet committed) when this transaction started are invisible.

2. Writes by transactions with a higher transaction ID than this transaction's start snapshot are invisible.

3. Writes by aborted transactions are always invisible (the garbage collector removes them later).

4. All other writes, by committed transactions with IDs lower than this snapshot, are visible.
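The rules above can be sketched as a visibility function in Python. This is a simplified illustration assuming monotonically increasing transaction IDs; the field and function names are invented for the example:

```python
# Toy MVCC visibility check following the four rules above.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Version:
    inserted_by: int            # txn that created this version
    deleted_by: Optional[int]   # txn that deleted/overwrote it, or None
    value: object

def visible(v, my_txid, in_progress, aborted):
    """Is version v visible to a transaction that started at my_txid, given
    the snapshot of in-progress and aborted txn IDs taken at its start?"""
    def committed_before_me(txid):
        return (txid <= my_txid              # rule 2: not from the future
                and txid not in in_progress  # rule 1: not running at our start
                and txid not in aborted)     # rule 3: not rolled back
    if not committed_before_me(v.inserted_by):
        return False
    # Visible unless its deletion is also visible to us (rule 4).
    if v.deleted_by is not None and committed_before_me(v.deleted_by):
        return False
    return True

# Txn 13 starts while txn 12 (which updated the row) is still running:
versions = [Version(inserted_by=5, deleted_by=12, value="old"),
            Version(inserted_by=12, deleted_by=None, value="new")]
seen = [v.value for v in versions
        if visible(v, my_txid=13, in_progress={12}, aborted=set())]
print(seen)  # ["old"]: txn 12's delete and insert are both invisible
```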

Indexes and MVCC

Index entries point to one of the versions of a row. Queries must iterate over versions to find one that is visible. When GC removes old versions, the corresponding index entries are removed. An alternative (used in CouchDB, Datomic, LMDB) is immutable B-trees via copy-on-write: each write transaction creates a new B-tree root, which is a consistent snapshot at that point in time. No visibility filtering is needed; subsequent writes can't modify existing B-tree pages.

The Full Anomaly Taxonomy

There are six distinct concurrency anomalies. Understanding them precisely is essential because each requires a different isolation level or mitigation strategy.

Dirty Write

Fixed by: Read Committed

Transaction B overwrites a value that transaction A has written but not yet committed. A then aborts, leaving B's state inconsistent.

EXAMPLE: Two users simultaneously try to buy the same item. Both read "available," both update "sold to X"; one overwrites the other.

Solution: Hold a row-level lock on writes until commit/abort

Dirty Read

Fixed by: Read Committed

Transaction B reads data written by transaction A before A commits. If A aborts, B has seen invalid data that was rolled back.

EXAMPLE: Reading a bank balance mid-transfer: seeing $0 while money is in flight between two accounts.

Solution: Database keeps the old committed value; reads see the old value until commit

Lost Update

Fixed by: Snapshot Isolation (with detection) or Serializable

Two concurrent read-modify-write cycles: the second overwrites the first without seeing it. One update is silently discarded.

EXAMPLE: Two processes both read a counter at value 5, both add 1, both write 6; one increment is lost.

Solution: Atomic operations, explicit locking (SELECT FOR UPDATE), or automatic detection + abort

Read Skew (Non-repeatable Read)

Fixed by: Snapshot Isolation

Transaction reads the same row twice and sees different values because a concurrent transaction committed between the reads.

EXAMPLE: Backup reading rows one by one: some rows are already updated by a concurrent transfer while others are not yet.

Solution: MVCC, where the transaction reads from a frozen snapshot taken at its start

Write Skew

Fixed by: Serializable only

Two transactions each read the same objects, make a decision based on the read, then write to different objects, violating a multi-object constraint.

EXAMPLE: Two on-call doctors both check "is another doctor on call?" (both see yes), both go off call; now nobody is on call.

Solution: Serializable isolation (SSI, 2PL) or explicit row locking (SELECT FOR UPDATE)

Phantom

Fixed by: Serializable only (snapshot isolation avoids phantoms in read-only queries but not in read-write transactions)

A transaction executes a search query, then a concurrent transaction inserts/deletes rows matching that search condition, changing the result of the query.

EXAMPLE: Checking "no conflicting meeting room booking exists" while another transaction concurrently inserts a conflicting booking.

Solution: Predicate locks or index-range locks (2PL), SSI detects conflicts at commit time

Write Skew in Practice

Write skew follows a pattern: (1) a SELECT query checks a precondition, (2) the application decides to proceed, (3) an UPDATE/INSERT changes data that affects the precondition. Real examples include: two doctors simultaneously going off call, double-booking a meeting room, two users claiming the same username, preventing double-spending of credits. The pattern is called a phantom when the precondition involves the absence of rows matching a condition: you can't lock rows that don't exist yet. Materializing conflicts (pre-creating lock rows) is a hack of last resort; proper serializable isolation is the right fix.
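The doctors example can be simulated in a few lines of Python. This is a toy illustration of snapshot semantics, not a real database: both transactions check the invariant against the same stale snapshot, then write disjoint rows:

```python
# Write skew under snapshot isolation, simulated: each transaction reads
# from the snapshot taken at its start, checks the precondition, and then
# writes a *different* row than the ones it checked.

snapshot = {"alice": "on_call", "bob": "on_call"}  # what both txns see
db = dict(snapshot)                                 # committed state

def go_off_call(doctor, snap, database):
    # Premise check against the (possibly stale) snapshot:
    others = sum(1 for d, s in snap.items() if d != doctor and s == "on_call")
    if others >= 1:
        database[doctor] = "off_call"  # write to a row nobody else checked... except they did

go_off_call("alice", snapshot, db)  # alice sees bob on call -> goes off
go_off_call("bob", snapshot, db)    # bob sees alice on call -> goes off
print(db)  # both off call: the invariant "at least one on call" is violated
```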

Preventing Lost Updates

The lost update problem is the most common write-write conflict: two transactions both execute a read-modify-write cycle on the same object, and the second write overwrites the first without incorporating it. Several approaches are available at different levels of the stack.


Atomic Database Operations

Most databases provide atomic increment/update operations (e.g., UPDATE counter SET value = value + 1). These are concurrency-safe because the database serializes them. Best when the operation can be expressed as a single statement. ORMs often accidentally bypass these.
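A quick demonstration using Python's stdlib sqlite3 (a single connection stands in for the concurrent clients; the point is that the read-modify-write happens inside the UPDATE statement, not in application code):

```python
# Atomic single-statement increment: the database serializes these, so no
# increment can be lost the way a fetch-then-write cycle can.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counter (id INTEGER PRIMARY KEY, value INTEGER)")
conn.execute("INSERT INTO counter VALUES (1, 5)")

# Two increments expressed atomically, never reading the value into the app:
for _ in range(2):
    conn.execute("UPDATE counter SET value = value + 1 WHERE id = 1")

value = conn.execute("SELECT value FROM counter WHERE id = 1").fetchone()[0]
print(value)  # 7
```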


Explicit Locking (FOR UPDATE)

Use SELECT ... FOR UPDATE to lock the rows you intend to modify. Other transactions must wait until you commit. Required when the update logic cannot be expressed as a single atomic operation (e.g., game move validation). Risk: easy to forget a lock, introducing subtle race conditions.
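SQLite (used here because it ships with Python) has no SELECT ... FOR UPDATE; its closest analog is BEGIN IMMEDIATE, which takes the write lock up front so a competing writer blocks (or times out) instead of racing the read-modify-write cycle. A sketch under that assumption:

```python
# Explicit locking, SQLite-style: BEGIN IMMEDIATE acquires the write lock
# before the read, so transaction B cannot sneak in between A's read and write.
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "game.db")
a = sqlite3.connect(path, timeout=0.1)
b = sqlite3.connect(path, timeout=0.1)
a.execute("CREATE TABLE board (id INTEGER PRIMARY KEY, piece TEXT)")
a.execute("INSERT INTO board VALUES (1, 'pawn')")
a.commit()

a.execute("BEGIN IMMEDIATE")  # transaction A locks the database for writing
piece = a.execute("SELECT piece FROM board WHERE id = 1").fetchone()[0]

try:
    b.execute("BEGIN IMMEDIATE")  # transaction B must wait; times out here
    b_blocked = False
except sqlite3.OperationalError:  # "database is locked"
    b_blocked = True
print("B blocked:", b_blocked)

# A's read-modify-write completes without interference:
a.execute("UPDATE board SET piece = ? WHERE id = 1", (piece + "!",))
a.commit()  # lock released; B could now retry
```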


Automatic Detection

PostgreSQL, Oracle, and SQL Server's snapshot isolation automatically detect lost updates and abort the offending transaction. The application must retry aborted transactions. MySQL/InnoDB's repeatable read does not detect lost updates; it silently allows them.


Compare-and-Set (CAS)

Only update if the value matches what you last read: UPDATE ... WHERE value = [old_value]. If another transaction changed the value, the update has no effect and must be retried. Note: under MVCC, the WHERE clause may be evaluated against a snapshot that doesn't see the concurrent change; this is implementation-dependent.
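A CAS sketch with sqlite3, checking the cursor's rowcount to detect a lost race (the table and values are invented for the example):

```python
# Compare-and-set: only update if the value still matches what we read;
# rowcount == 0 means someone changed it under us and we must re-read and retry.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page (id INTEGER PRIMARY KEY, content TEXT)")
conn.execute("INSERT INTO page VALUES (1, 'v1')")

def cas_update(old, new):
    cur = conn.execute(
        "UPDATE page SET content = ? WHERE id = 1 AND content = ?", (new, old))
    return cur.rowcount == 1

ok1 = cas_update("v1", "v2")  # matched: updated
ok2 = cas_update("v1", "v3")  # stale read: no effect, caller must retry
print(ok1, ok2)  # True False
```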

Serializability: The Gold Standard

Serializable isolation is the strongest isolation level: it guarantees that the database behaves as if all transactions executed one at a time in some serial order, even if they actually ran concurrently. It prevents all the anomalies in the taxonomy above, including write skew and phantoms.

Three implementation strategies exist, with very different performance and operational characteristics.

Actual Serial Execution

HOW IT WORKS

Execute one transaction at a time on a single thread. Transactions are submitted as stored procedures. No concurrency at all.

THROUGHPUT

✓ High throughput if transactions fit in RAM and are short. Bottleneck: a single CPU core.

TRADEOFFS

Requires all data in memory. No interactive multi-statement transactions. Cross-shard transactions are very slow.

Examples: VoltDB/H-Store, Redis (single-threaded), Datomic

Two-Phase Locking (2PL)

HOW IT WORKS

Phase 1 (growing): transactions acquire shared/exclusive locks on objects. Phase 2 (shrinking): release locks only after commit/abort. Writers block both readers and writers.

THROUGHPUT

โŒ Lower โ€” lock contention, deadlocks (detected and aborted), latency spikes

TRADEOFFS

Proven, well-understood. Deadlock risk. Shared locks allow concurrent reads. Predicate locks / index-range locks needed for phantoms.

Examples: MySQL InnoDB serializable, IBM Db2 repeatable read, old SQL Server serializable
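The shared/exclusive lock compatibility rules at the heart of 2PL can be sketched in a few lines. This toy grants or refuses locks; a real lock manager queues waiters, detects deadlocks, and adds predicate or index-range locks:

```python
# Minimal 2PL lock table: shared ("S") locks coexist; an exclusive ("X")
# lock excludes everyone else; locks are released all at once at commit/abort.

class LockTable:
    def __init__(self):
        self.locks = {}  # object -> (mode, set of holder txn ids)

    def acquire(self, txn, obj, mode):
        held = self.locks.get(obj)
        if held is None:
            self.locks[obj] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == "S" and held_mode == "S":
            holders.add(txn)            # readers share
            return True
        if holders == {txn}:            # sole holder may upgrade S -> X
            self.locks[obj] = (mode, holders)
            return True
        return False                    # conflict: a real 2PL txn would block here

    def release_all(self, txn):
        # Shrinking phase: everything released together at commit/abort.
        for obj in list(self.locks):
            _, holders = self.locks[obj]
            holders.discard(txn)
            if not holders:
                del self.locks[obj]

lt = LockTable()
r1 = lt.acquire("T1", "row:1", "S")  # True
r2 = lt.acquire("T2", "row:1", "S")  # True: shared readers coexist
r3 = lt.acquire("T2", "row:1", "X")  # False: T1 also holds a shared lock
lt.release_all("T1")
r4 = lt.acquire("T2", "row:1", "X")  # True: upgrade succeeds once alone
print(r1, r2, r3, r4)
```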

Serializable Snapshot Isolation (SSI)

HOW IT WORKS

Optimistic: transactions proceed without blocking, using MVCC snapshots. At commit time, the database checks for serialization conflicts (premise violations, i.e. stale read assumptions) and aborts if a conflict is detected.

THROUGHPUT

✓ Higher: no blocking. Reads never block writes and vice versa.

TRADEOFFS

Abort rate increases under high contention. More recent algorithm (Cahill 2008). Requires tracking read/write dependencies.

Examples: PostgreSQL (since 9.1), CockroachDB, YugabyteDB

SSI: How Premise Violation Detection Works

SSI builds on snapshot isolation. When a transaction reads data, the database records what the transaction read (the "premise" of subsequent writes). If, at commit time, another transaction has modified data that was read (invalidating the premise), the committing transaction is aborted. Two patterns trigger aborts: (1) a transaction reads a stale snapshot and writes based on that stale read; (2) a transaction reads data that was written after the snapshot by a concurrent transaction (stale detection on writes). Unlike 2PL, SSI never blocks; it only aborts at commit time if needed.
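A greatly simplified sketch of premise tracking, using version counters and a first-committer-wins check over each transaction's read set (the real algorithm, Cahill 2008, tracks finer-grained read/write dependencies):

```python
# Toy SSI-style commit check: remember what each transaction read and at
# what version; at commit, abort if any of those versions has moved on.

class SSIStore:
    def __init__(self):
        self.data = {}  # key -> (value, version); missing key reads as version 0

    def begin(self):
        return {"reads": {}, "writes": {}}

    def read(self, txn, key):
        value, version = self.data.get(key, (None, 0))
        txn["reads"][key] = version   # record the premise
        return value

    def write(self, txn, key, value):
        txn["writes"][key] = value    # buffered until commit

    def commit(self, txn):
        # Premise check: did anyone commit a newer version of what we read?
        for key, seen in txn["reads"].items():
            if self.data.get(key, (None, 0))[1] != seen:
                return False          # abort; caller should retry
        for key, value in txn["writes"].items():
            version = self.data.get(key, (None, 0))[1] + 1
            self.data[key] = (value, version)
        return True

db = SSIStore()
t1, t2 = db.begin(), db.begin()
db.read(t1, "on_call_count")   # both premises taken from the same snapshot
db.read(t2, "on_call_count")
db.write(t1, "on_call_count", 1)
db.write(t2, "on_call_count", 1)
c1 = db.commit(t1)  # True: first committer wins
c2 = db.commit(t2)  # False: t2's premise was invalidated, so it aborts
print(c1, c2)
```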

Mental Models

Transactions are Safety Nets, Not Performance Killers

The instinct to disable transactions for speed is usually wrong. The real performance enemies are long-running transactions and cross-shard coordination, not transactions themselves. Modern SSI (PostgreSQL) has very low overhead at typical contention levels.

ACID Is Not a Spectrum; Isolation Is

"We use ACID" often means nothing more than read-committed isolation. The four ACID properties (Atomicity, Consistency, Isolation, Durability) are real, but isolation is the dimension that actually varies. Always ask: which isolation level?

MVCC: Readers and Writers Never Block Each Other

Snapshot isolation's key insight: maintain multiple versions of each row, let each transaction read from its own consistent snapshot. Writes create new versions. Garbage collection removes versions no longer visible to any active transaction.

Write Skew is the Hardest Anomaly to Reason About

Unlike lost updates (two writes to the same object), write skew involves two transactions reading the same set of objects and writing to different objects. The constraint violation is invisible to both transactions individually; it emerges from their interaction.

SSI vs 2PL: Optimistic vs Pessimistic Concurrency

2PL assumes conflicts will happen and prevents them upfront (pessimistic). SSI assumes conflicts are rare and detects them at commit (optimistic). Under low contention, SSI wins. Under high contention with many aborts, 2PL may be preferable.

Serial Execution is Underrated

VoltDB proved that a single-threaded serializable database can outperform a multi-threaded one under OLTP workloads when data fits in RAM and transactions are short. The coordination overhead of concurrency can exceed its benefits.

DESIGNING DATA-INTENSIVE APPLICATIONS · 2ND EDITION

This article is part of an ongoing series distilling DDIA 2nd ed. into structured, audience-aware reading notes. The series covers storage engines, replication, sharding, transactions, distributed systems, and stream processing.

โ† Ch.7: ShardingCh.9: The Trouble with Distributed Systems โ†’