A-Z of In-Memory OLTP : Checkpoint – Part 1
Posted by blakhani on December 19, 2013
In today’s advance world, RAM is still volatile. Once you pull the power plug out, all the content of RAM is flushed. Since we are talking about in-memory tables, what would happen to data in tables? If we have defined a table as durable (SCHEMA_AND_DATA) in SQL Server, durability has to be maintained. During recovery of the database (also known as crash recovery), all the content should come back. To achieve this, SQL Server does logging into transaction log and writes checkpoint for in-memory tables.
If you are a SQL Server DBA and worked with earlier versions of SQL, checkpoint is definitely known to you. You can read more about earlier version of checkpoint in books online. In short, checkpoint is a process which writes dirty pages from memory to data file. This is done to reduce the recovery time of the database during crash recovery. Checkpoint can be done by system thread based on recovery interval or triggered on demand when system is low on memory, or other operations such as log backup etc.
“Similar” concept of checkpoint exists for In-Memory tables as well. I quoted similar because mechanism is different than the known checkpoint process for disk-based tables. To maintain durability of data for in-memory tables, SQL Server does logging of changes in transaction log (which is also different than disk based tables) and writes information during checkpoint. In this blog, we would try to understand the various things which happens during checkpoint. We should remember that transaction logging is sufficient to recover the data but if there is no checkpoint then we need to read complete transaction log during crash recovery, which would be inefficient. So, periodically checkpoint is fired and data is dumped into checkpoint files. Now, during recovery, we recover from checkpoint and replay only that portion of transaction log which is after checkpoint.
Checkpoint files are created into filegroup MEMORY_OPTIMIZED_DATA which is container for these files. During checkpoint, we try to use log buffer in memory to write information into checkpoint files. There are chances that log buffer is not having sufficient log blocks and then we would end-up in reading transaction log file. SQL Server uses FILESTREAM as underlying storage mechanism for checkpoint files. The checkpoint information is written into the disk by checkpoint streams, which comes in two flavors:
- DATA streams:
- These file are approximately 128 MB in size.
- They contain only inserted rows of the table in the given timestamp range.
- DELTA streams:
- These are paired (1:1) with DATA files.
- Size of these file is not constant.
- Store the deleted rows with IDs.
Note that both files are “append only” files which means the IO would always be sequential. As soon as table is created, DATA files are pre-allocated with 128 MB size. DATA and DELTA files form a pair for a timestamp range. Once current DATA file becomes full, SQL Server would switch to new DATA file.
Enough of theory? Let’s see a checkpoint in action.
For demo purpose, I am creating table in the database which is having wide row. Here is the complete script.
-- Create database having container C:\IMO_Database\SQLSeverHelp_IMO_dir CREATE DATABASE SQLSeverHelp_IMO ON PRIMARY ( NAME = [SQLSeverHelp_IMO_data] ,FILENAME = 'C:\IMO_Database\SQLSeverHelp_IMO.mdf' ) ,FILEGROUP [SQLSeverHelp_IMO_FG] CONTAINS MEMORY_OPTIMIZED_DATA ( NAME = [SQLSeverHelp_IMO_dir] ,FILENAME = 'C:\IMO_Database\SQLSeverHelp_IMO_dir' ) LOG ON ( NAME = [SQLSeverHelp_IMO_log] ,Filename = 'C:\IMO_Database\SQLSeverHelp_IMO_log.ldf' ) GO -- Create In-Memory table Use SQLSeverHelp_IMO GO CREATE TABLE MyFirstMemporyOptimizedTable ( iID INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000), vFName CHAR(4000) NOT NULL, vLName CHAR(4000) NOT NULL ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA) GO
As soon as table is created, we would see DATA and DELTA file in the container. We can use DMV sys.dm_db_xtp_checkpoint_files to see the information. Let’s first run a very basic query to understand the files.
SELECT file_id, pair_file_id, file_type_desc, is_active FROM sys.dm_db_xtp_checkpoint_files ORDER BY is_active DESC;
Here is the output with some highlights. I have put the same color for the matching file_id with pair_file_id column for clarity.
As we can see in output, DATA and DELTA files are having pair. I have highlighted only two pairs but if you look carefully, all files are paired. Note that the number of files created is a function of number of CPU cores on the system. My machine has 8 logical CPU so I am seeing 9 pairs. Is_Active column represents whether the files is currently in use or not. Another way to find out paired files is to use column called internal_storage_slot. Only those files which are having not null slot would be used by checkpoint.
SELECT internal_storage_slot, file_id, pair_file_id, file_type_desc FROM sys.dm_db_xtp_checkpoint_files WHERE internal_storage_slot is NOT NULL
Here is the output. Two files having same slot form a pair of DATA-DELTA (slot 3 in below screenshot).
Here is how the physical files on operating system look like:
Now, you may ask – how would I associate the file in the OS with the file_id and pair_file_Id column in sys.dm_db_xtp_checkpoint_files? Let’s use other columns of DMV to explore that information.
I have tried to make the picture self-explanatory. Highlight with the same color are the matching texts.
Now, lets try and check what happens when we insert some rows in the table. As we discussed earlier, DATA file would contain the rows which are inserted and DELTA file would have entries of deleted rows. I have used below script to load ~10K rows into the table.
DECLARE @counter INT SET @counter = 0 WHILE @counter <= 10000 BEGIN INSERT INTO myfirstmemporyoptimizedtable VALUES (@counter, 'SQLServer-Help.com', 'Balmukund Lakhani'), (@counter + 1, 'ExtremeExperts.com', 'Vinod Kumar M'), (@counter + 2, 'SQLAuthority.com', 'Pinal Dave') SET @counter = @counter + 3 END
Now, Let’s go ahead and query the DMV again to see if we have something interesting! But Wait! We don’t know if checkpoint has automatically fired or not. So before running the query, I fired manual checkpoint via “CHECKPOINT” command in management studio.
SELECT internal_storage_slot, file_id, pair_file_id, file_type_desc, file_size_in_bytes / 1024.0 / 1024.0 [File Size in MB], file_size_used_in_bytes / 1024.0 / 1024.0 [File Used in MB], total_bytes / 1024.0 / 1024.0 [Total MB], inserted_row_count FROM sys.dm_db_xtp_checkpoint_files WHERE total_bytes > 0 -- non zero size and internal_storage_slot is not null
As we can see that at this point only one DATA file is having non-zero size and number of rows shown are 10002, which were inserted earlier.
As we insert more and more rows, data would be accumulated in DATA file and once its full, next file would be opened used and current file becomes read-only. Please remember that 128 MB is NOT the limit of the checkpoint DATA file. A transaction can’t span across two DATA files, so a bigger transaction could cause a file to go beyond 128 MB size. For demo purposes, instead of 10K rows, I have inserted ~40K rows in one transaction.
If you are wondering why we are not seeing any DELTA file in any of the query output then recall that DELTA file would contain the DELETEed rows identifier and so far I have been doing only INSERTs. To give you some time to grasp information in this blog, I would stop here and we would discuss other aspects of checkpoint in next blog for this A-Z series.
In case you have any question, post them in comment section and I would be happy to answer them.