Understand PG's MVCC Visibility Check Rules

1. Introduction

PostgreSQL’s MultiVersion Concurrency Control (MVCC) is an “advanced technique for improving database performance in a multi-user environment” according to Vadim Mikheev. This technique requires multiple “versions” of the same data tuple exist in the system governed by snapshots taken during different time periods. In other words, under such technique, it is PG’s responsibility to figure out how many tuples are visible to the user and how many are not, according to multiple parameters such as snapshots taken, current transaction ID…etc. This is also known as visibility check rules in PG and today in this blog, I will talk about the basic principles of visibility check so you get an idea how PG performs this task internally and I hope it would be helpful in your development work.

2. Parameters Involved?

  • The tuple itself, which contains:
    • xmin (the transaction ID that inserts this tuple)
    • xmax (the transaction ID that deletes this tuple if > 0, otherwise it’s not deleted)
    • cid (command ID)
    • hintbit
  • Global top transaction ID if available
  • current snapshot, which also contains xmin, xmax and cid
  • Commit Log Data (CLOG)

3. The Process

3.1 Check the Hintbit

The visibility process starts by checking the hintbit. If hintbit has a state of COMMITTED, then we can skip most of the other visibility checking rules for better efficiency. With the tuple committed, PG then take this tuple’s xmin value to check with the current snapshot to ensure this tuple is not currently in progress. (See 3.3 below for formula to determine in progress). This check is required because there is a chance that this tuple, though committed, is still being updated by other backend processes at this very moment. If the tuple is not currently in progress by other backends, it will finally check its xmax value to ensure that it is invalid. Invalid means it has not been deleted. When all of the above are true, this tuple is considered visible.

If hintbit indicates ABORTED, then this tuple is considered invisible to user. If hintbit has no value, then PG should continue with the rest of visibility check rules.

3.2 Check If Tuple’s xmin Equal to Global Top Transaction ID

The next check involves taking tuple’s xmin and compare with global top transaction ID to see if they are equal. This global top transaction ID will only be set when user starts the transaction manually with begin clause. If user does not starts the transaction this way (no begin clause is issued), then this global top transaction ID will not be set and therefore this checking will be skipped.

If tuple xmin equals global top transaction ID, this means the tuple is currently in progress by the current backend, not others. This is where the command ID (cid) will be used to determine the visibility. Within a transaction block, each command issued has an associated command ID to indicate what command comes first and what comes later. For example, if a SELECT comes after an UPDATE, within a transaction, then the SELECT must see the new tuple updated by the previous UPDATE clause and cid plays a part in this determination. This behavior is also governed by the isolation level which is unfortunately out of scope of this blog. The default isolation level is READ COMMITTED, which makes the SELECT sees the data changed by previous UPDATE; but if the isolation is set to REPEATABLE READ then the SELECT will not see the data changed by previous UPDATE. Please keep this in mind.

If tuple’s xmin does not equal to global top transaction ID, then PG should continue with the rest of visibility

3.3 Check the Current Snapshot

The next check involves the PG to check if this tuple is currently in progress by other backend processes. This is done by comparing the tuple’s xmin value against current snapshot’s xmin and xmax values according to this formula.

  • tuple xmin < snapshot xmin = not in progress
  • tuple xmin >= snapshot xmax = in progress

If this tuple is considered in progress by the snapshot, then this tuple must not be visible to the current user because other backends are still working on it. If this tuple is NOT considered in progress, then PG should continue with the rest of visibility check rules.

3.4 Check the Commit Log (CLOG)

This next check involves the PG to take tuple’s xmin to check against the CLOG to see if this tuple has been committed or aborted. CLOG is like an array of transaction IDs and each array element stores a commit status. PG has a formula to convert a transaction ID into a CLOG block plus an offset to precisely access the right CLOG element. This CLOG data structure is regularly flushed to disk with checkpoint process and it is located in these 3 directories: pg_xact, pg_multixact and pg_subtrans.

If CLOG says the given tuple is committed, then PG will continue to check the tuple’s xmax value to ensure that it is invalid. Invalid means it has not been deleted and therefore it is visible to user. At the same time, PG will also update the hintbit value of this tuple to COMMITTED such that in the next visibility checking, PG will not access CLOG again, which is rather expensive.

If CLOG says the given tuple is aborted or invalid, then this tuple is not visible to the current user. At the same time, it will also update the hintbit value of this tuple to INVALID

Summary

The above processes are the very basic visibility rules within PG according to my understanding. Of course there are other complex checking not mentioned here, such as the involvement of sub transactions and multi transactions, but they do follow somewhat similar pattern to what’s mentioned today. I hope this blog would be helpful for you.

Your browser is out-of-date!

Update your browser to view this website correctly. Update my browser now

×