MVCC
Multi-Version Concurrency Control
The DBMS maintains multiple physical versions of a single logical object in the database. When a transaction writes to an object, the DBMS creates a new version of that object. When a transaction reads an object, it reads the newest version that existed when the transaction started.
The key properties: Writers don’t block the readers. Readers don’t block the writers. Read-only transactions can read a consistent snapshot without acquiring locks. Timestamps are used to determine visibility. It supports time-travel queries.
Version storage will help DBMS decide how to store different physical versions of a logical object. The DBMS uses the tuple’s pointer field to create a version chain per logical tuple. Indexes always point to the head of the chain. A thread traverses chain until you find the version thats visible to you.
- Append-Only Storage – New versions are appended to the same table space.
- Oldest-To-Newest (O2N): Append new version to end of chain, look-ups require entire chain traversal.
- Newest-To-Oldest (N2O): Head of chain is newest, look-ups are quick, but indexes need to be up- dated every version.
- Time-Travel Storage – Old versions are copied to separate table space.
- Delta Storage – The original values of the modified attributes are copied into a separate delta record space.
Garbage Collection: The DBMS needs to remove reclaimable physical versions from the database over time.
- Tuple Level Garbage Collection – Find old versions by examining tuples directly
- Background Vacuuming: Separate threads periodically scan the table and look for reclaimable ver- sions, works with any version storage scheme.
- Cooperative Cleaning: Worker threads identify reclaimable versions as they traverse version chain. Only works with O2N.
- Transaction Level – Each transaction keeps track of its own read/write set. When a transaction completes, the garbage collector can use that to identify what tuples to reclaim.
Index Management: All primary key (pkey) indexes always point to version chain head.
- Logical Pointers – Use a fixed identifier per tuple that does not change. Requires an extra indirection layer that maps the logical id to the physical location of the tuple
- Physical Pointers – Use the physical address to the version chain head