SQL 2014 Learning Series # 13 – New Feature – Managed Lock Priority (Part 1)
Posted by blakhani on June 17, 2014
Imagine a situation where a DBA wants to perform maintenance operations such as an online index rebuild [using ALTER INDEX…REBUILD (ONLINE=ON)] or a partition switch [using ALTER TABLE… SWITCH PARTITION] on a busy OLTP system. Because of the locks these activities take, blocking is very likely. If the DDL starts first, it blocks the user workload. If the user workload is already running when we start the DDL, our operation is blocked. This might not be a problem for a partition switch because it is normally quick, but it can be a real problem during an online index rebuild, as you might have seen.
Let’s look at these operations in a little more detail to understand the reason for this feature.
Partition switch is a metadata operation, which means it has to update some system tables and set up the appropriate links in them so that SQL Server knows that you have brought a staging table into an existing table. To achieve this, SQL Server needs to take a Schema Modification (SCH-M) lock on both the source and the target table. Since a SCH-M lock is incompatible with every other lock mode, including the Shared (S) lock, this can block production users' activity. The converse is also true: if we have a very heavy OLTP system and the table is very hot in terms of workload, we may not get a window to acquire the lock, and the partition switch has to wait.
The Online Index Rebuild (OIR) process requires a short table-level S lock at the beginning, in the build phase, where it needs to take a consistent snapshot of the table and create a version of it. The version store is used to manage and create the new index. After the index is built, in the last phase, the process has to make metadata changes (just like partition switch), for which it needs a SCH-M lock. This request can be blocked by, or can block, concurrent user workload depending on who came first. For the OIR DDL statement to complete, all active blocking transactions on that table must finish first. Here also, the converse is true.
In earlier versions of SQL Server, there was no option to set the priority of locks taken by a particular operation (other than deadlock_priority). Now we can state our choice for online index rebuild and partition switch. There are three options:
Kill all blockers – When we execute the DDL (either ALTER TABLE… SWITCH PARTITION or ALTER INDEX… REBUILD (ONLINE=ON)), we can ask SQL Server to kill all user sessions that are blocking the activity and then start the DDL. This is typically a business decision that depends on the priority of the user workload and the time of day. If we don’t want to kill the blockers immediately, we can ask the operation to wait for a certain time, specified with the “MAX_DURATION” switch. The number provided after MAX_DURATION is the number of minutes the DDL should wait before killing the sessions. At the end of that duration, if the DDL still cannot get the lock, it goes ahead and kills all blocking user transactions. This might be intuitive, but I must point it out: if we specify a value for MAX_DURATION (let’s say 5 minutes) and there is no concurrent workload taking conflicting locks, the request is processed immediately (it does not wait for 5 minutes). The maximum value is 71,582 minutes (about 49 days).
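As a minimal sketch of this option, the statement below rebuilds an index online, waits up to 5 minutes at low priority, and then kills the blockers. The table dbo.Orders and the index IX_Orders_CustomerID are hypothetical names used only for illustration; they are not from this series.

```sql
-- Hypothetical objects: dbo.Orders and IX_Orders_CustomerID are illustrative names.
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders
REBUILD WITH
(
    ONLINE = ON
    (
        WAIT_AT_LOW_PRIORITY
        (
            MAX_DURATION = 5 MINUTES,     -- wait up to 5 minutes for the lock
            ABORT_AFTER_WAIT = BLOCKERS   -- then kill the sessions blocking the rebuild
        )
    )
);
```

Note that killing blockers requires the ALTER ANY CONNECTION permission, so this choice may not be available to every login that is allowed to rebuild indexes.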
Switch to normal queue – This matches the behavior in SQL Server 2012, where the DDL operation waits in the same lock queue as other transactions waiting for a lock. The SQL Server lock manager follows a First In First Out (FIFO) model, which means that a lock request is not granted until the earlier requesters have been granted their locks and have released them. With this option, if our DDL has not acquired the lock by the time MAX_DURATION has elapsed, it switches to the normal queue and waits behind existing user requests. This is the default behavior.
Exit DDL after wait – This option is the opposite of the first one (kill all blockers). If the business says that the user workload is more important than these maintenance operations, the DBA would pick this choice. As the name implies, the DDL (SWITCH/OIR) waits until MAX_DURATION has elapsed, and if the locks for the maintenance operation have still not been acquired, the DDL aborts itself and the person running it gets a “Timeout Expired” message.
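A sketch of the “exit DDL after wait” choice on a partition switch could look like the following. The tables dbo.Orders_Staging and dbo.Orders and the partition number are assumptions made up for illustration. Replacing SELF with NONE would give the “switch to normal queue” behavior described earlier.

```sql
-- Hypothetical objects: dbo.Orders_Staging, dbo.Orders and partition 3 are illustrative.
ALTER TABLE dbo.Orders_Staging
SWITCH TO dbo.Orders PARTITION 3
WITH
(
    WAIT_AT_LOW_PRIORITY
    (
        MAX_DURATION = 2 MINUTES,   -- wait up to 2 minutes in the low priority queue
        ABORT_AFTER_WAIT = SELF     -- then give up and abort the switch itself
    )
);
```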
Using the above three options, a DBA should be able to manage partition switch and online index rebuild operations. All of this is possible because of a new lock queue introduced in SQL Server 2014, called the “Low Priority Lock Queue”, which is why the waiting DDL doesn’t interfere with the regular user workload.
The image below shows all SPIDs waiting in the same queue. SPID 53 is a maintenance activity that requests a conflicting lock and is therefore waiting.
Below is the situation with the new lock queue. As we can see, SPID 54 is NOT waiting for SPID 53 (compare with the regular queue in the earlier image).
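One way to observe the low priority queue on a live system is to look at the request_status column of sys.dm_tran_locks, which in SQL Server 2014 can report low priority states. The query below is only a sketch; run it from a separate session while a DDL with WAIT_AT_LOW_PRIORITY is waiting.

```sql
-- List lock requests that are currently waiting in the low priority queue
-- (or that are in the process of aborting their blockers).
SELECT request_session_id,
       resource_type,
       resource_database_id,
       resource_associated_entity_id,
       request_mode,
       request_status
FROM sys.dm_tran_locks
WHERE request_status IN (N'LOW_PRIORITY_WAIT', N'LOW_PRIORITY_CONVERT', N'ABORT_BLOCKERS');
```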
Here is the syntax as shown in books online.
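The low priority clause is the same for both statements. The outline below is a simplified paraphrase of the documented grammar, written as comments rather than runnable T-SQL, so treat it as a reminder and check Books Online for the authoritative version.

```sql
-- Shared clause (simplified):
--   WAIT_AT_LOW_PRIORITY ( MAX_DURATION = <time> [ MINUTES ],
--                          ABORT_AFTER_WAIT = { NONE | SELF | BLOCKERS } )
--
-- Where it attaches:
--   ALTER INDEX ... REBUILD WITH ( ONLINE = ON ( WAIT_AT_LOW_PRIORITY ( ... ) ) )
--   ALTER TABLE ... SWITCH ... TO ... WITH ( WAIT_AT_LOW_PRIORITY ( ... ) )
--
-- ABORT_AFTER_WAIT values:
--   NONE     = switch to normal queue
--   SELF     = exit DDL after wait
--   BLOCKERS = kill all blockers
```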
A few points I must highlight:
- MAX_DURATION is not the duration of the operation itself. It’s the maximum wait duration.
- Specifying a very high value for MAX_DURATION is not advisable because it prevents the transaction log from being truncated from the point the DDL was submitted until it is executed. The maximum is 71,582 minutes (1,193 hours, or about 49 days); unless you have the transaction log drive space to hold 49 days’ worth of transactions, you might want to keep MAX_DURATION a bit lower.
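If you do run a DDL with a long MAX_DURATION, it may be worth watching the log while it waits. The sketch below uses two standard checks: the log reuse wait reason for the current database and the log space usage across databases. What a waiting low priority DDL reports as the reuse wait reason is not covered in this post, so treat the output as something to investigate rather than a definitive indicator.

```sql
-- Why can't the log of the current database be truncated right now?
SELECT name, log_reuse_wait_desc
FROM sys.databases
WHERE name = DB_NAME();

-- How full is each transaction log?
DBCC SQLPERF(LOGSPACE);
```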
In the next part of this series, we will look at an example and demo of the MLP feature.
To look at complete list of blog on SQL 2014 Learning Series, please visit here