MVCC

Multi-Version Concurrency Control

The DBMS maintains multiple physical versions of a single logical object in the database. When a transaction writes to an object, the DBMS creates a new version of that object. When a transaction reads an object, it reads the newest version that existed when the transaction started.

The key properties: Writers don’t block the readers. Readers don’t block the writers. Read-only transactions can read a consistent snapshot without acquiring locks. Timestamps are used to determine visibility. It supports time-travel queries.

Version storage will help DBMS decide how to store different physical versions of a logical object. The DBMS uses the tuple’s pointer field to create a version chain per logical tuple. Indexes always point to the head of the chain. A thread traverses chain until you find the version thats visible to you.

  • Append-Only Storage – New versions are appended to the same table space.
    • Oldest-To-Newest (O2N): Append new version to end of chain, look-ups require entire chain traversal.
    • Newest-To-Oldest (N2O): Head of chain is newest, look-ups are quick, but indexes need to be up- dated every version.
  • Time-Travel Storage – Old versions are copied to separate table space.
  • Delta Storage – The original values of the modified attributes are copied into a separate delta record space.

Garbage Collection: The DBMS needs to remove reclaimable physical versions from the database over time.

  • Tuple Level Garbage Collection – Find old versions by examining tuples directly
    • Background Vacuuming: Separate threads periodically scan the table and look for reclaimable ver- sions, works with any version storage scheme.
    • Cooperative Cleaning: Worker threads identify reclaimable versions as they traverse version chain. Only works with O2N.
  • Transaction Level – Each transaction keeps track of its own read/write set. When a transaction completes, the garbage collector can use that to identify what tuples to reclaim.

Index Management: All primary key (pkey) indexes always point to version chain head.

  • Logical Pointers – Use a fixed identifier per tuple that does not change. Requires an extra indirection layer that maps the logical id to the physical location of the tuple
  • Physical Pointers – Use the physical address to the version chain head