SRE, DBRE

SQL Server: Accelerated Database Recovery

Have you ever thought that your SQL server takes too long to start up? Or when using SQL Availability Group and switching to another node for maintenance, did it also cause lengthy downtime? I’ve been there! So, in this post, I want to explore what’s causing these problems.

When the SQL engine restarts after a crash, it tries to recover the database to its state at the time of the crash. This recovery process also happens during a failover. Let’s first understand the recovery process of the SQL server.

sql recovery process

Well, there are so many definitions here that need to be understood before going deeper. Let’s break it down!

  • Logging: Every transaction that hits the SQL server will be logged into the transaction log file before saving into the database file.
  • Committed transaction: It’s just as simple as it sounds. BEGIN TRANSACTION … COMMIT TRANSACTION.
  • Uncommitted transaction: Yeah, simple again. BEGIN TRANSACTION … (and then the crash or failover happened).
  • Dirty data/page/checkpoint: The SQL engine (or any database engine) typically does not interact with the data on disk directly. Instead, it uses a buffer cache in memory to hold frequently accessed data pages. When it needs to read or modify data, it will first try to find the required data in the buffer cache. If the data is already in memory (cached), it will read or modify it there, and this data in memory is referred to as “dirty data”. If, the required data is not present in the buffer cache (cache missed), the SQL engine will then fetch the data from the disk into the buffer cache. After that, it will read or modify the data in the buffer cache as before.
  • Checkpoint: The changes data in the buffer cache (dirty data) are not immediately written back to the disk. Instead, most database systems implement a mechanism known as “Write-ahead logging (WAL)”. This means that any changes made to the data in memory are first written to a log file on disk. Periodically, or when certain conditions are met, the database engine will write all the dirty data from the buffer cache back to the disk.

Now, here is a summary of the recovery process:

  1. Analysis: This is the process where the SQL server will analyze the current status of the server/database to decide the appropriate next steps. What does it do?
    • First, from the last checkpoint, it will read forward to determine the state of each transaction:
      • Committed transactions not yet written to disk => redo (persist to disk).
      • Uncommitted transactions => undo (revert the changes).
  2. Redo: Starting from the oldest uncommitted transaction, it reads forward to the end of the log and commits any transactions that were committed to the log but not the database file.
  3. Undo: It starts from the end of the log, reads backward, and rolls back any transactions that were still open or weren’t committed to the log.

hmm, So what is the limitation of this process ? Well, continue reading. As you might imagine, the time needed to restore databases to their state when the system crashed relies heavily on the server’s workload during that time. In simpler terms, the recovery duration is linked to the amount of work done by ongoing transactions and the time they were active. Consider a scenario where there are large transactions running. If a failover occurs (either you initiate it or someone else in your team does), it can lead to application downtime because the recovery process might take a significant amount of time (eg: 10 minutes! ) to complete.

Well, now we will see how the Accelerated Database Recovery (ADR) helps in this case.

How does ADR works?

sql recovery process with ADR What does ADR bring to the table ?

  • Avoid having to scan the log from/to the beginning of the oldest active transaction => recovery time is not impacted by the long running transactions.
  • The transaction log can be truncated aggressively. Because ADR does not need to process the log for the whole transaction.
  • Reduce contention in the tempdb DB. Because the row verion will be stored on the user databases.

Sounds interesting!, Let’s get farmiliar with new concept in ADR first.

  • Persistent Version Store (PVS): The purpose here is simple: the SQL server aims to eliminate the need to read too much data inside the transaction log file. However, how does the system redo/undo the operations if it doesn’t have a history of the operations executed on the system? That’s where the PVS comes into play.
    • When a transaction modifies a row in a table, the SQL server generates a versioned copy of the modified row and stores it in the PVS. This version is then stored in the user database.
    • By having versioned rows, when the SQL server needs to redo/undo certain transactions, it does not need to read through large transaction logs.
  • Logical Revert:
    • When a database crash occurs, Logical Revert uses the versions of the rows in the PVS to undo the changes made by the transactions that were active at the time of the crash. This approach allows the database to be recovered much faster than by reading through the transaction log.
    • But… if we have multiple versions of a row, how does the SQL server track and determine which is the correct version to be used when the user needs it? This is where a component named Aborted Transaction Map comes into play. The SQL server holds information about aborted transactions in this map. When a query needs to read a row, it checks whether the row was modified by an aborted transaction. Based on the the result, it decides whether this version is visible or not.
    • This component can also be used to undo changes made by any transaction, especially helpful in reducing the time needed to rollback long-running transactions.
  • SLOG: There are two types of operations in SQL Server: operations that change the data are called versioned operations, and other operations that do not change data (such as metadata cache invalidation, lock acquisitions, etc.) are called non-versioned operations. In the traditional recovery process, all operations are logged in the transaction log. This includes both versioned and non-versioned operations. When a database is recovered, all the transaction log records are processed, regardless of whether they are for versioned or non-versioned operations.
    • ADR introduces a new concept called SLOG to log all non-versioned operations. This SLOG is of low volume, in-memory, and will be persisted to disk during the checkpointing process.
    • Because the SLOG only holds the non-versioned operation records, it’s small. Reading the small files is quicker than reading the large transaction log file; that’s the point.
  • Cleaner
    • This is an asynchronous process that periodically removes old, unused versions in the PVS.

Now, let’s see how of those component works together in the new recovery process.