Help: SQL Server

Sharing my knowlege about SQL Server Troubleshooting Skills

Archive for the ‘Tips and Tricks’ Category

Tips and Tricks: Useful Parameters of Get-ClusterLog

Posted by blakhani on May 20, 2016


Working with Root Cause Analysis (RCA) is also part of my work at Microsoft. In case of cluster failover RCA, it is very important to get cluster log. Sometimes there are situation where we want to generate cluster log for last few minutes for quicker analysis of live issues. This blog explains some common parameters which I used in my day-to-day troubleshooting.

I have 4 nodes cluster in my lab named SRV1, SRV2, SRV3, SRV4.

  • Default command – generates Cluster.log file on ALL nodes in C:\Windows\Cluster\Reports folder. File name would be Cluster.log

Get-ClusterLog 

  • if we want the cluster log to be generated for specific node(s) then we can use –Node parameter. We can put comma separated node names as shown below.

Get-ClusterLog -Node SRV1, SRV3

  • You might know that the time shown in cluster log is UTC be default. Sometimes its difficult to translate UTC time to local time, specially for time-zones which has daylight saving. Luckily, cluster log can be generated in local time using parameter UseLocalTime . Here is the sample code.

Get-ClusterLog –UseLocalTime

  • Another useful parameter is to copy the files to specific location. This command would generate logs and also dump on specified location. in below example, I am dumping logs from all nodes to C:\Temp folder.

Get-ClusterLog –Destination “C:\Temp”

  • TimeSpan is another parameter which can generate cluster log for last number of minutes specified. By default it would generate Cluster.log for complete time. I find it useful when I repro’ed a problem and I want to look at cluster log for last 2 to 3 minutes. Here is the command to generate log for last 3 minutes.

Get-ClusterLog –TimeSpan 3

So, this is my favorite command after reproducing cluster issue on local node.

Get-ClusterLog -Node SRV1 -TimeSpan 2 -UseLocalTime -Destination C:\

Hopefully it would be useful.

Cheers,
Balmukund

Advertisements

Posted in Cluster, Tips and Tricks, Troubleshooting | Tagged: , , | 4 Comments »

Help : Unable to start SQL Service after installing Service Pack or Cumulative Upgrade or Hotfix

Posted by blakhani on March 10, 2015


This is one of the common issue which a DBA might have seen. Let’s first understand what happens whenever an upgrade is performed for SQL Server. During installation of patch the binaries are updated, resource database is replaced. During the restart, after successful completion, the upgrade scripts are run. This is the time when Login would fail with below error.

2015-01-05 10:45:03.23 Logon       Error: 18401, Severity: 14, State: 1.
2015-01-05 10:45:03.23 Logon       Login failed for user ‘NT AUTHORITY\SYSTEM’. Reason: Server is in script upgrade mode. Only administrator can connect at this time. [CLIENT: <local machine>]

This error is normal for sometime till upgrade scripts have completed execution. It has been observed that sometimes, due to system configuration, the script don’t run successfully. If upgrade script has not run, SQL would shutdown automatically and below would be the message in SQL ERRORLOGs.

2015-01-05 10:45:07.94 spid9s      Error: 3417, Severity: 21, State: 3.
2015-01-05 10:45:07.94 spid9s      Cannot recover the master database. SQL Server is unable to run. Restore master from a full backup, repair it, or rebuild it. For more information about how to rebuild the master database, see SQL Server Books Online.

Above message is very scary and suggest you to do something which you may not want. Now, If you are into situation where upgrade is not complete, there are few things you should try. Very first thing is look at ERRORLOG and locate a place where you would find some error during script upgrade. Here are few sample errors which you might see.

Error 1 – CREATE DATABASE failed

2015-01-05 10:45:02.55 spid9s      Database ‘master’ is upgrading script ‘sqlagent90_sysdbupg.sql’ from level 0 to level 2.
2015-01-05 10:45:07.94 spid9s      Error: 5133, Severity: 16, State: 1.
2015-01-05 10:45:07.94 spid9s      Directory lookup for the file "F:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\temp_MS_AgentSigningCertificate_database.mdf" failed with the operating system error 3(The system cannot find the path specified.).
2015-01-05 10:45:07.94 spid9s      Error: 1802, Severity: 16, State: 1.
2015-01-05 10:45:07.94 spid9s      CREATE DATABASE failed. Some file names listed could not be created. Check related errors.
2015-01-05 10:45:07.94 spid9s      Error: 912, Severity: 21, State: 2.
2015-01-05 10:45:07.94 spid9s      Script level upgrade for database ‘master’ failed because upgrade step ‘sqlagent90_sysdbupg.sql’ encountered error 598, state 1, severity 25. This is a serious error condition which might interfere with regular operation and the database will be taken offline. If the error happened during upgrade of the ‘master’ database, it will prevent the entire SQL Server instance from starting. Examine the previous errorlog entries for errors, take the appropriate corrective actions and re-start the database so that the script upgrade steps run to completion.
2015-01-05 10:45:07.94 spid9s      Error: 3417, Severity: 21, State: 3.
2015-01-05 10:45:07.94 spid9s      Cannot recover the master database. SQL Server is unable to run. Restore master from a full backup, repair it, or rebuild it. For more information about how to rebuild the master database, see SQL Server Books Online.

Okay. Problem here is that F:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA folder is missing as we are getting error “The system cannot find the path specified”. The database default path is stored in registry and this is explained here. All we need to do is change the registry to correct value and start SQL Services.

Error 2 – affinity mask related error

2015-01-01 00:00:06.79 spid7s      Database ‘master’ is upgrading script ‘sqlagent100_msdb_upgrade.sql’ from level 0 to level 3.
2015-01-01 00:00:06.79 spid7s      —————————————-
2015-01-01 00:00:06.79 spid7s      Starting execution of PREINSTMSDB100.SQL
2015-01-01 00:00:06.79 spid7s      —————————————-
..
2015-01-01 00:00:06.97 spid7s      Setting database option COMPATIBILITY_LEVEL to 100 for database msdb.
2015-01-01 00:00:07.01 spid7s      Configuration option ‘allow updates’ changed from 1 to 1. Run the RECONFIGURE statement to install.
2015-01-01 00:00:07.01 spid7s      Configuration option ‘allow updates’ changed from 1 to 1. Run the RECONFIGURE statement to install.
2015-01-01 00:00:07.01 spid7s      Error: 5833, Severity: 16, State: 1.
2015-01-01 00:00:07.01 spid7s      The affinity mask specified is greater than the number of CPUs supported or licensed on this edition of SQL Server.
2015-01-01 00:00:07.01 spid7s      Error: 912, Severity: 21, State: 2.
2015-01-01 00:00:07.01 spid7s      Script level upgrade for database ‘master’ failed because upgrade step ‘sqlagent100_msdb_upgrade.sql’ encountered error 5833, state 1, severity 16. This is a serious error condition which might interfere with regular operation and the database will be taken offline. If the error happened during upgrade of the ‘master’ database, it will prevent the entire SQL Server instance from starting. Examine the previous errorlog entries for errors, take the appropriate corrective actions and re-start the database so that the script upgrade steps run to completion.
2015-01-01 00:00:07.02 spid7s      Error: 3417, Severity: 21, State: 3.
2015-01-01 00:00:07.02 spid7s      Cannot recover the master database. SQL Server is unable to run. Restore master from a full backup, repair it, or rebuild it. For more information about how to rebuild the master database, see SQL Server Books Online.

To fix above problem, we can start the Instance by skipping the post upgrade script by adding trace flag –T902 to the startup parameters. Steps to enable trace flag can be found in this KB Once SQL is started and we should be able to connect to SQL Server normally. Then connect via SSMS and check the checkbox “Automatically set processor affinity mask for all processors” under instance properties. This could have also been archived using the below TSQL.

ALTER SERVER CONFIGURATION SET PROCESS AFFINITY CPU = AUTO
GO

Once configuration change as been done, we need to remove the trace flag and then start SQL normally. Post restart, we need to check SQL Server ERRORLOG and make sure we have below line.

Recovery is complete. This is an informational message only. No user action is required

This message during upgrade scenario means that post upgrade script have been executed successfully.

If you have encountered any error during upgrade, please comment and let me know.

  • Cheers,
  • Balmukund Lakhani
  • Twitter @blakhani
  • Author: SQL Server 2012 AlwaysOnPaperback, Kindle
  • Posted in SQL Server, Tips and Tricks, Trace Flag | Tagged: | Leave a Comment »

    Tips and Tricks : Creating Linked Server to an Availability Group Listener with ReadOnly routing

    Posted by blakhani on September 30, 2014


    In my recent interaction with customer, I was asked if linked server can be created to an availability group listener? My immediate response was “Yes”. Next question was immediately thrown at me that if they can redirect this listener to secondary replica? This time my little delayed response was – Yes. Provided we use proper connection string. Here is the linked server creation script

     EXEC sp_addlinkedserver @server = N'MyLinkedServer'
        ,@provider = N'SQLNCLI'
        ,@srvproduct = N'SqlServer'
        ,@datasrc = N'MyListener'
        ,@provstr = N'ApplicationIntent=ReadOnly'
    
    
    

    …and it was not working as they expected.

    Whenever we want to use routing of read-only connection via listener and connections are not going to secondary replica then below checklist should help.

    Here is the checklist which I have created.

    • Verify that we are connecting to Listener Name.
    • Verify that the Read-Only routing list (READ_ONLY_ROUTING_LIST) is defined.
    • Verify that the Routing URL (READ_ONLY_ROUTING_URL) of each instance has a proper FQDN and port combination.
    • Verify that ApplicationIntent is specified in the connection string.
    • Verify that the Sync_State is SYNCHRONIZED or SYNCHORNIZING for the secondary replica
    • Verify that the secondary replicas are set to allow connections
    • Verity that the initial catalog is provided in connection string.

    If complete checklist is followed, it should resolve most of the routing problems. Most common mistake which I have seen is to miss creating routing URL and routing list. This may be due to the fact that there is no User Interface available and it has to be done by T-SQL.

    Coming back to original question – I asked them to use proper connection string while creating listener. They followed the checklist and they have missed last item the initial catalog. Here is my version of the script.

    USE [master]
    GO
    EXEC master.dbo.sp_addlinkedserver @server = N'MyLinkedServer'
        ,@srvproduct = N'SQL'
        ,@provider = N'SQLNCLI11'
        ,@datasrc = N'MyListener'
        ,@provstr = N'Integrated Security=SSPI;Initial Catalog=Production;Data Source=MyListener;ApplicationIntent=ReadOnly'
        ,@catalog = N'Production'
    GO

    How would you test whether the connection is working fine or not? That’s simple.

    select * from openquery(MyLinkedServer,'select @@servername')
    

     

    If Server name is returned as secondary replica then routing is working perfectly fine. If you are getting primary replica then routing has a problem. The best way to troubleshoot further would be to use SQLCMD and follow the checklist provided earlier in this blog.

    sqlcmd -S MyListener -E -d Production -K ReadOnly

    Hope this helps!

    P.S. Checklist is explained in detail in my book SQL Server 2012 AlwaysOnPaperback, Kindle

  • Cheers,
  • Balmukund Lakhani
  • Twitter @blakhani
  • Posted in Tips and Tricks | Tagged: , , , , , | 5 Comments »