DDIA gives the entire context behind current systems and patterns, design considerations and therefore the whys to every critical pattern, structure, offering, capability, as well as pros vs cons
Important concepts like serializability vs materializing conflicts
Race conditions in distributed databases
Religiously reading through this bible helps you to stop abstracting away the important concepts, concepts and details in distributed software and finally start understanding how to address common points of failure.
Contention in a database workload refers to a situation where multiple transactions or queries compete for the same resources, leading to performance bottlenecks. This can happen due to:
- Lock Contention – Multiple transactions try to access the same row, table, or index, causing delays due to waiting for locks to be released.
- CPU Contention – Too many queries compete for CPU resources, slowing down execution.
- I/O Contention – Queries overload disk or network I/O, leading to slow reads/writes.
- Memory Contention – Insufficient memory for caching or query execution, causing excessive disk swapping.
Reducing contention often involves indexing, query optimization, workload balancing, or using concurrency control mechanisms like MVCC (Multiversion Concurrency Control).
Detailed Explanation of Locks and Related Concepts
-
Shared Locks (S Locks)
- A shared lock allows multiple transactions to read a resource (e.g., row, page, or table) but not modify it.
- Other transactions can also acquire a shared lock simultaneously, ensuring consistent reads.
- However, if a transaction wants to modify the resource, it must wait until all shared locks are released.
- Used in
SELECT ... FOR SHAREqueries in some databases.
-
Exclusive Locks (X Locks)
- An exclusive lock prevents all other transactions from accessing the locked resource in any way (neither reading nor writing).
- This ensures no other transaction can interfere while the locking transaction makes changes.
- Typically applied during
INSERT,UPDATE, orDELETEoperations. - Used in
SELECT ... FOR UPDATEqueries in some databases.
-
Predicate Locks
- A special type of lock that applies to a condition or range of values rather than specific rows.
- Used in Serializable Isolation to prevent phantom reads (i.e., new rows appearing unexpectedly in a transaction).
- For example, if a transaction runs
SELECT * FROM orders WHERE amount > 1000;, a predicate lock would prevent new rows from being inserted that match this condition (amount > 1000). - Implemented in Serializable Snapshot Isolation (SSI) in some databases.
-
Phantom Reads & Phantom Locks
-
A phantom read occurs when a transaction reads a set of rows, but a concurrent transaction inserts or deletes rows that match the same query criteria, leading to inconsistent results.
-
Example:
- Transaction A:
SELECT * FROM employees WHERE salary > 50000; - Transaction B:
INSERT INTO employees (id, name, salary) VALUES (10, 'Alice', 60000); - Transaction A runs the same query again and sees a new row that wasn’t there before.
- Transaction A:
-
Phantom locks (or predicate locks) prevent phantom reads by locking ranges of data rather than individual rows.
-
Summary Table
| Lock Type | What It Does | Allows Concurrent Reads? | Allows Concurrent Writes? | Use Case Example |
|---|---|---|---|---|
| Shared Lock (S) | Allows multiple transactions to read a resource | ✅ Yes | ❌ No | SELECT ... FOR SHARE |
| Exclusive Lock (X) | Prevents all other transactions from accessing the resource | ❌ No | ❌ No | UPDATE or DELETE |
| Predicate Lock | Locks a condition (e.g., amount > 1000) instead of specific rows | ✅ Yes (if condition does not change) | ❌ No (prevents inserts that match the condition) | Ensuring serializable isolation |
| Phantom Reads | A read anomaly where new rows appear during a transaction | Can occur | Can occur | Running the same query twice and getting different results due to new rows |
| Use Case / Priority | Use Functional Programming (FP) | Use Object-Oriented Programming (OOP) |
|---|---|---|
| Data transformation / pipelines | ✔️ map/filter/reduce style fits well | ❌ Not ideal |
| Concurrency / parallelism | ✔️ Stateless, easier to parallelize | ⚠️ Requires careful state handling |
| Immutability / predictability | ✔️ Core feature | ❌ Must be enforced manually |
| Modeling real-world entities | ❌ Less intuitive | ✔️ Natural fit |
| Large, stateful applications | ⚠️ Can be complex | ✔️ Designed for this |
| Code reuse | ✔️ Via composition | ✔️ Via inheritance/polymorphism |
| Testability | ✔️ Pure functions = easier tests | ⚠️ More mocks/stubs needed |
🔹 Use FP when you're doing data-heavy, stateless, or concurrent logic. 🔹 Use OOP when you're modeling real-world systems with entities and state. 🔹 Use both when your language allows it — they complement each other well.
More to come...