Help: SQL Server

Sharing my knowledge about SQL Server troubleshooting skills

Archive for December, 2013

A-Z of In-Memory OLTP : Database Startup and Recovery Process

Posted by blakhani on December 31, 2013


“If you are a seasoned driver you have a number of parameters to check before you start your vehicle. It can range from a simple task like if all the doors are locked and there is no indicator, or if the safety seatbelt signs are off or even more critical can be activities like if the fuel tank readings are showing us values that is satisfactory keeping our destination in mind. If reaching a destination using your vehicle can be this strenuous task that we do it with such ease, the process of SQL Server startup also goes through a simple steps of readings that we can start assuming. Adding an In-Memory dimension to an already complex process is always going to be a learning by itself.”  These words are from Vinod (b|t) when I asked him for a real-life correlation to the database startup process. Hats off to you, Vinod!

Yes, in this blog post we will discuss database startup and recovery for in-memory tables. Database startup is also known as recovery of the database. Whenever SQL Server (re)starts, all databases go through a start-up subroutine. Below are the situations where database startup or recovery is performed.

  • Restart of the SQL Server service
  • Restore from a backup
  • Failover to a database mirror or an AlwaysOn secondary replica
  • ALTER DATABASE SET ONLINE
  • Failover of SQL Server in a cluster

In traditional databases having only disk-based tables, database startup/recovery has three phases: analysis, roll-forward (redo), and rollback (undo). These three phases are well documented in many places: Ref1, Ref2. In this blog, we will talk about the startup of databases which have in-memory tables.

Since we are talking about in-memory tables, one of the steps during database startup is to load the data of all in-memory tables into memory (RAM). In an earlier post, we talked about the DLLs of in-memory tables and learned that they are generated during database startup. We also discussed here that index information is not logged in the transaction log or the data files, hence indexes need to be rebuilt while the data is loaded. So, to summarize, here are the major steps of recovery for a database having in-memory tables.

  • Generate DLLs for the in-memory tables
  • Load data into memory using the checkpoint files (DATA & DELTA files) and the database's transaction log file
  • Create the indexes defined on in-memory tables

If any of the above steps fails, database recovery fails and the database would not come online. Let's verify a few of these points via a demo.

Generate DLLs of in-memory tables

To demonstrate this, let's take the database offline using the command below:

Use master
go
Alter database SQLSeverHelp_IMO set offline with rollback immediate
go

Once the database is offline, let's look at the folder which had the DLLs for the in-memory tables. On this SQL instance, there is no other database having in-memory tables.

[Screenshot: the XTP DLL folder is empty]

As we learned earlier, this folder generally has a sub-folder per database ID, and each of those holds the DLLs for the objects of the respective database. On my SQL instance, at this point there is no online database having in-memory objects, so this folder is empty. Now, let's bring the database online.

Alter database SQLSeverHelp_IMO set online
go

The database ID for SQLSeverHelp_IMO is 8 on my SQL instance, and here is what I see once I bring the database online.

[Screenshot: the DLL folder now contains a sub-folder for database ID 8]

You may ask: what about the DLLs of natively compiled stored procedures? Well, those are generated when the stored procedure is first executed (compiled automatically).
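Instead of browsing the file system, the loaded In-Memory OLTP DLLs can also be listed from within SQL Server. Here is a small sketch, assuming the sys.dm_os_loaded_modules DMV of SQL Server 2014:

```sql
-- List the XTP (Hekaton) DLLs currently loaded by this instance
SELECT name, description
FROM sys.dm_os_loaded_modules
WHERE description = 'XTP Native DLL'
GO
```

If the database is online, we would expect one entry here per generated table DLL, matching the folder contents.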

Load data into the memory

As explained earlier, this step is also different as compared to traditional disk-based tables. Once the database is started, we need to load the data using the checkpoint files and the transaction log of the database. In my previous blogs on checkpoint (part 1 and part 2), I explained the DATA and DELTA files.

Here is a pictorial representation of the data loading process. Image reference: http://msdn.microsoft.com/en-us/library/dn133174(v=sql.120).aspx

[Image: pictorial representation of the data loading process]

There are three phases, which we can call analysis, data load, and redo. Note that there is no undo phase, because in-memory tables don't log uncommitted data in the transaction log (refer here). During the analysis phase, the transaction log is read and the last checkpoint is located. In the data load phase, we first load the DELTA files into memory; these contain pointers to the rows which were deleted, creating a "delta map" for each delta file. Then the DATA files are loaded into memory and the delta map is applied as a filter: only those rows which are not deleted are loaded into memory. The loading of data happens in parallel (that's why there are multiple 1's and 2's in the image). You can imagine that with a slow disk subsystem, the time taken by the database to come online would be longer.
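Once the database is online, we can verify that the data actually landed in memory. As a sketch, assuming the sys.dm_db_xtp_table_memory_stats DMV of SQL Server 2014:

```sql
-- Memory consumed by in-memory tables after the data load phase
Use SQLSeverHelp_IMO
go
SELECT object_name(object_id)     AS TableName,
       memory_used_by_table_kb,
       memory_used_by_indexes_kb
FROM   sys.dm_db_xtp_table_memory_stats
go
```

Non-zero values for our table would confirm that the data load phase completed.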

Create the indexes defined on in-memory tables

As we discussed earlier, only the index definition is persisted in the physical files. Indexes exist only in memory, and that's why no transaction information about them is logged in the physical files. Once the data is loaded and the DLLs are generated, the next step is to generate the indexes on the data which was loaded into memory. We will talk about indexes in a later part of this series.
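Because hash indexes are rebuilt from scratch at startup, their bucket distribution can be inspected right after the database comes online. A sketch, assuming the sys.dm_db_xtp_hash_index_stats DMV:

```sql
-- Bucket usage of the hash indexes rebuilt during database startup
Use SQLSeverHelp_IMO
go
SELECT object_name(object_id) AS TableName,
       total_bucket_count,
       empty_bucket_count,
       avg_chain_length
FROM   sys.dm_db_xtp_hash_index_stats
go
```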

Once all three steps are completed, the database is made available. If there is a system problem and we are unable to load the data into memory, or fail to generate the DLLs, database recovery fails and the database goes to the "Recovery Pending" state. It's worth mentioning that if our database has in-memory tables as well as disk-based tables and database recovery fails, we would not be able to access the disk-based tables either.
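The state of the database after recovery can be checked from sys.databases:

```sql
-- Check the current state of the database
SELECT name, state_desc
FROM   sys.databases
WHERE  name = 'SQLSeverHelp_IMO'
go
-- state_desc is ONLINE when all three steps succeed,
-- RECOVERY_PENDING when any of them fails
```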

In my next blog, I will take up scenarios of recovery failure and show you steps to troubleshoot them. Stay tuned!

By the way, this is going to be the last post of year 2013, so I wish you a happy new year and see you next year in another blog post of the A-Z of In-Memory OLTP series.

  • Cheers,
  • Balmukund Lakhani
  • Twitter @blakhani
  • Author: SQL Server 2012 AlwaysOn (Paperback, Kindle)
  • Posted in A - Z Series, In Memory OLTP, In-Memory OLTP, SQL Server 2014 | 6 Comments »

    A-Z of In-Memory OLTP : How In-Memory Table is made durable – Transaction Logging Concepts

    Posted by blakhani on December 26, 2013


    If you ever visit a bank and do a financial transaction, you will notice that the bank employee notes down information about the transaction in some format (paper based or electronic). In the same way, SQL Server needs to log transactions; hence the transaction log (also known as the T-Log) file is an integral part of a database. For the RDBMS to be durable, logging of a transaction is done before making any modification. If an in-memory table is defined as "durable", then logging has to be done for that table. Transaction logging is done differently for in-memory tables, and that's why I am dedicating a blog post in the A-Z series to just this topic.

    There are a few major differences between transaction logging for disk-based tables versus in-memory tables. I will list them out here:

    • Multiple changes to a table are merged into a single log record.
    • Log records are not generated for any modification of an index. The indexes on the table are rebuilt during startup of the database.
    • Transaction log records are written only if the transaction on the in-memory table is committed. This means that no UNDO information is logged.

    Due to all of the above, the amount of information written to the transaction log is smaller and more efficient as compared to disk-based tables. This makes In-Memory OLTP a scalable, high-performance engine.

    Let’s understand each point by demo(s).

    Multiple changes to a table are merged into a single log record

    To check this behavior, we can create two identical tables: a disk-based table and an in-memory table. Here is the script which can be used; it's the same script you might have seen earlier on this blog.
    -- Create database with IMO Filegroup, If exists drop it.
    Use Master
    go
    If db_id('SQLSeverHelp_IMO') is NOT NULL
    drop database SQLSeverHelp_IMO
    GO
    CREATE DATABASE SQLSeverHelp_IMO 
        ON PRIMARY (
        NAME = [SQLSeverHelp_IMO_data]
        ,FILENAME = 'C:\IMO_Database\SQLSeverHelp_IMO.mdf'
        )
        
        ,FILEGROUP [SQLSeverHelp_IMO_FG] CONTAINS MEMORY_OPTIMIZED_DATA ( 
        NAME = [SQLSeverHelp_IMO_dir]
        ,FILENAME = 'C:\IMO_Database\SQLSeverHelp_IMO_dir'
        ) 
        
        LOG ON (
        NAME = [SQLSeverHelp_IMO_log]
        ,Filename = 'C:\IMO_Database\SQLSeverHelp_IMO_log.ldf'
        ,SIZE = 102400KB 
        ) 
    GO
    
    
    -- Create table in database
    
    -- Normal (disk based) table
    Use SQLSeverHelp_IMO 
    go
    CREATE TABLE DiskBasedTable
    (
    iID INT NOT NULL PRIMARY KEY NONCLUSTERED,
    vFName CHAR(20) NOT NULL,
    vLName CHAR(20) NOT NULL
    )
    GO
    
    -- In-Memory table
    Use SQLSeverHelp_IMO
    GO
    CREATE TABLE MyFirstMemporyOptimizedTable
    (
    iID INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000),
    vFName CHAR(20) NOT NULL,
    vLName CHAR(20) NOT NULL
    ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
    GO
    

    Now, I am going to insert 100 rows into the disk-based table.

    -- Insert into Disk Based Table
    begin tran
    DECLARE @counter INT 
    SET @counter = 1 
    WHILE @counter <= 100
      BEGIN 
          INSERT INTO DiskBasedTable 
          VALUES      (@counter, 'SQLServer-Help.com','Balmukund Lakhani')
          SET @counter = @counter + 1 
      END
     commit tran
    

    Let's examine the transaction log using the well-known, undocumented (and unsupported) command fn_dblog.

    SELECT *
    FROM sys.fn_dblog(NULL, NULL)
    WHERE PartitionId IN (
            SELECT partition_id
            FROM sys.partitions
            WHERE object_id = object_id('DiskBasedTable')
            )
    ORDER BY [Current LSN] ASC;
    GO
    

    As shown below, we can see 200 rows for our table (DiskBasedTable). The LOP_INSERT_ROWS operation is logged for the HEAP and for the INDEX. Since we inserted 100 rows, this caused 200 log records, i.e. two per row. In the Operation column, LOP stands for "logical operation".

    [Screenshot: 200 LOP_INSERT_ROWS log records for DiskBasedTable]
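The 200 records can also be summarized with a grouped variant of the same fn_dblog query, which makes the two-records-per-row pattern easy to see:

```sql
-- Count the log records per operation and context for DiskBasedTable
SELECT Operation, Context, COUNT(*) AS RecordCount
FROM sys.fn_dblog(NULL, NULL)
WHERE PartitionId IN (
        SELECT partition_id
        FROM sys.partitions
        WHERE object_id = object_id('DiskBasedTable')
        )
GROUP BY Operation, Context
GO
```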

    -- Total size of ALL Log Records 
    SELECT sum([Log Record Length])
    FROM sys.fn_dblog(NULL, NULL)
    WHERE PartitionId IN (
            SELECT partition_id
            FROM sys.partitions
            WHERE object_id = object_id('DiskBasedTable')
            )
    

    The above query returns 26000, i.e. ~25 KB. Now, we will do the same insert operation for the in-memory table (MyFirstMemporyOptimizedTable) and look at the log records.

    begin tran
    DECLARE @counter INT 
    SET @counter = 1
    WHILE @counter <= 100
      BEGIN 
          INSERT INTO MyFirstMemporyOptimizedTable
          VALUES      (@counter, 'SQLServer-Help.com','Balmukund Lakhani')
          SET @counter = @counter + 1 
      END
     commit tran
    
    Let's examine the log records for this operation. I am showing only the top 3 records, which are the relevant ones for our discussion.
    SELECT *
    FROM sys.fn_dblog(NULL, NULL)
    ORDER BY [Current LSN] ASC
    GO
    
     
    [Screenshot: three log records, including a ~6 KB LOP_HK record]

    As we can see above, we have a total of three log records for the 100 rows inserted. Have a look at the length of the LOP_HK log record: it's ~6 KB. We can use another undocumented function, fn_dblog_xtp, to crack open that record. "HK" in the operation column is an artifact of the feature's code name, Hekaton.

    -- Let's break apart the LOP_HK record
    SELECT [current lsn], [transaction id], operation,
    operation_desc, tx_end_timestamp, total_size,
    object_name(table_id) AS TableName
    FROM sys.fn_dblog_xtp(null, null)
    WHERE [Current LSN] = '0000001f:0000011c:0002'  -- this is same as seen in earlier image
    

    Here is the output in SSMS:

    [Screenshot: fn_dblog_xtp output showing 102 rows]

    The top and bottom rows (Row # 1 and Row # 102) are HK_LOP_BEGIN_TX and HK_LOP_COMMIT_TX; HK_LOP_INSERT_ROW appears 100 times. This proves the point that the amount of data logged for an in-memory table is smaller as compared to disk-based tables.
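To put a number against the ~25 KB we measured for the disk-based insert, the size of the in-memory transaction's log records can be summed as well. The LSN below is the one from the earlier output; yours would differ:

```sql
-- Total size of the log records for the in-memory insert transaction
SELECT SUM(total_size) AS TotalBytes
FROM sys.fn_dblog_xtp(NULL, NULL)
WHERE [Current LSN] = '0000001f:0000011c:0002'
GO
```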

    Log records are not generated for any modification of an index

    In the last screenshot, we can see that we don't have any log records containing information about index modification, as opposed to the disk-based table where we saw records logged in the LCX_INDEX_LEAF context.

    Transaction log records are written only if the transaction on in-memory table is committed

    For In-Memory OLTP transactions, log records are generated only at commit time. As opposed to disk-based tables, In-Memory OLTP doesn't follow the write-ahead logging (WAL) protocol, which makes it highly efficient. To prove this point, we will create a transaction and roll it back. We will again use fn_dblog and count the number of rows.

    Select count(*) from fn_dblog(null, null)
    go
    -- 1406 Rows
    begin tran
    update DiskBasedTable 
    set vFName = 'ExtremeExperts.com'
    where iID = 10
    rollback tran
    go
    Select count(*) from fn_dblog(null, null)
    go
    Select * from fn_dblog(null, null) order by [Current LSN] desc

    Below is the update which was rolled back; it generated 4 log records.

    [Screenshot: 4 log records, from LOP_BEGIN_XACT to LOP_ABORT_XACT]

    LOP_BEGIN_XACT stands for begin transaction and LOP_ABORT_XACT stands for rollback/abort of transaction. Now, let’s do the same operation for in-memory table.
    Select count(*) from fn_dblog(null, null)
    go
    -- 1410 Rows
    begin tran
    update MyFirstMemporyOptimizedTable WITH (SNAPSHOT)
    set vFName = 'ExtremeExperts.com'
    where iID = 10
    rollback tran
    go
    Select count(*) from fn_dblog(null, null)
    go
    --1410 Rows
    

    [Screenshot: fn_dblog row count unchanged after the rollback]

    The row counts in the transaction log before and after are the same (i.e. 1410), which means no additional log records are generated for an uncommitted transaction when it is done on an in-memory table.

    I hope that the difference in logging is now clear. In summary, we can say that transaction logging for in-memory tables is much more efficient as compared to the transaction logging of disk-based tables. Stay tuned for more on Hekaton.

  • Cheers,
  • Balmukund Lakhani
  • Twitter @blakhani
  • Author: SQL Server 2012 AlwaysOn (Paperback, Kindle)
  • Posted in A - Z Series, In Memory OLTP, In-Memory OLTP, SQL Server 2014 | 13 Comments »

    A-Z of In-Memory OLTP : Checkpoint – Part 2

    Posted by blakhani on December 24, 2013


    If you are reading this post before part 1, I would like you to read part 1 first so that you have the context for what you are reading. In part 1, we saw demos and the usage of DATA files. In this blog I am going to talk about DELTA files.

    As we saw earlier, each DELTA file is paired with a DATA file. I must call out that DATA and DELTA files are not tied to a single table; they cover a transaction range (in fact, this was a question asked by one of the readers, Manoj (Blog), on my last post). In the in-memory world, when a record is deleted from a table, it is not removed from the DATA file; instead, an identifier is inserted into the DELTA file. This tells SQL Server not to load that particular row into memory during data loading (database startup), as it is deleted. Now, imagine what happens when we update a row.

    Let's see this theory in action. I have used the same script to populate the data, and here is the position where we left off in the last blog. Here is the query:

    Select count(*) from MyFirstMemporyOptimizedTable
    go
    SELECT internal_storage_slot,
           file_id, 
           pair_file_id,    
           file_type_desc,
           file_size_in_bytes / 1024.0 / 1024.0      [File Size in MB], 
           file_size_used_in_bytes / 1024.0 / 1024.0 [File Used in MB], 
           total_bytes / 1024.0 / 1024.0             [Total MB], 
           inserted_row_count ,
           deleted_row_count
    FROM   sys.dm_db_xtp_checkpoint_files 
    WHERE  total_bytes > 0        -- non zero size
    and       internal_storage_slot is not null 
    order by 1 desc
    

     

    [Screenshot: sys.dm_db_xtp_checkpoint_files output before the delete]

    Now, I am going to delete one third of the rows and make sure to fire a manual checkpoint.

    Delete from MyFirstMemporyOptimizedTable
    where iID%3 = 0
    go
    checkpoint
    go
    
    ----------------------
    (3334 row(s) affected)

    If we run the old query again.

    [Screenshot: checkpoint files after the delete, showing the DELTA file pair]

    Here come the DELTA files. Note that the internal_storage_slot is zero for one pair and one for the new DATA file. After some time, if we run the exact same query again, we would see a different output.

    [Screenshot: checkpoint files after the automatic checkpoint]

    If we look at the file_id, it has changed from the previous screenshot. It looks like a checkpoint has fired by itself and has removed the reference to the DELTA file. The current DATA file shows 6668 rows, which is exactly the number of rows in the table; there is no need for a DELTA file here.
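The automatic merge activity hinted at above can also be observed through a DMV. A sketch, assuming the sys.dm_db_xtp_merge_requests DMV of SQL Server 2014:

```sql
-- Merge requests generated by the background checkpoint/merge threads
SELECT request_state_desc,
       lower_bound_tsn,
       upper_bound_tsn
FROM   sys.dm_db_xtp_merge_requests
go
```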

    Now, I am going to fire an update statement and modify all the rows and see the effect.

    update MyFirstMemporyOptimizedTable
    set vFName = 'www.ExtremeExperts.com'
    where vFName = 'ExtremeExperts.com'
    go
    checkpoint
    go
    


    ----------------------
    (3334 row(s) affected)

    [Screenshot: checkpoint files "before update"]

    [Screenshot: checkpoint files "after update"]

    This matches the theory that an UPDATE is split into a DELETE and an INSERT. We updated 3334 rows, which caused an append of 3334 rows to the DELTA file and an append of 3334 rows to the DATA file.

    In the "after update" output above, we can see three DATA files and one DELTA file. Let's fire one more query to understand the transaction range. Here is the query:

    SELECT internal_storage_slot,
           file_id, 
           pair_file_id,    
           file_type_desc,
           inserted_row_count ,
           deleted_row_count,
           transaction_id_lower_bound,
           transaction_id_upper_bound
    FROM   sys.dm_db_xtp_checkpoint_files 
    WHERE  total_bytes > 0        -- non zero size
    and       internal_storage_slot is not null 
    order by 1 desc
    

     

    [Screenshot: transaction_id lower and upper bounds per storage slot]

    Each internal_storage_slot covers a range of transactions. The upper bound of one slot matches the lower bound of the next (consecutive) slot.

    It is important to recall that writing (appending, to be precise) to the DELTA and DATA files is NOT done by any user transaction but by the background checkpoint thread. In our tests we are firing manual checkpoints to see the immediate effect for learning purposes, which would not be the case on a production server. The background thread scans the log buffers (and, if needed, the transaction log file) to append to those files. As per the Hekaton whitepaper by Kalen, below are the steps done by a checkpoint:

    1. All buffered writes are flushed to the data and delta files.

    2. A checkpoint inventory is constructed that includes descriptors for all files from the previous checkpoint plus any files added by the current checkpoint. The inventory is hardened to the transaction log.

    3. The location of the inventory is stored in the transaction log so that it is available at recovery time.

    While I was writing the last few lines, another system process ran, and here is the final output.

    [Screenshot: final state of the checkpoint files]

    We will discuss the merge process in a later part of the series.

    I hope you have already downloaded SQL Server 2014 CTP2 and are learning In-Memory OLTP (a.k.a. Hekaton) with me. If you haven't yet, be sure to check out SQL Server 2014 CTP2 so that you can follow this series and get your hands dirty.

  • Cheers,
  • Balmukund Lakhani
  • Twitter @blakhani
  • Author: SQL Server 2012 AlwaysOn (Paperback, Kindle)
  • Posted in A - Z Series, Hekaton Series, In Memory OLTP, In-Memory OLTP, SQL Server 2014 | 3 Comments »