Posted by blakhani on May 20, 2016
Root Cause Analysis (RCA) is part of my work at Microsoft. For a cluster failover RCA, it is very important to get the cluster log. Sometimes we want to generate the cluster log for only the last few minutes for quicker analysis of live issues. This blog explains some common parameters of Get-ClusterLog which I use in my day-to-day troubleshooting.
I have a 4-node cluster in my lab with nodes named SRV1, SRV2, SRV3 and SRV4.
- Default command: generates the cluster log on ALL nodes in the C:\Windows\Cluster\Reports folder. The file name would be Cluster.log.
Get-ClusterLog
- If we want the cluster log to be generated for specific node(s), we can use the -Node parameter. We can pass comma-separated node names as shown below.
Get-ClusterLog -Node SRV1, SRV3
- You might know that the time shown in the cluster log is UTC by default. Sometimes it's difficult to translate UTC time to local time, especially for time zones which have daylight saving. Luckily, the cluster log can be generated in local time using the UseLocalTime parameter. Here is the sample code.
Get-ClusterLog -UseLocalTime
- Another useful parameter is Destination, which copies the files to a specific location. This command generates the logs and also copies them to the specified location. In the example below, I am dumping logs from all nodes to the C:\Temp folder.
Get-ClusterLog -Destination "C:\Temp"
- TimeSpan is another parameter, which generates the cluster log for the last specified number of minutes. By default, Cluster.log is generated for the complete time. I find it useful when I have reproduced a problem and want to look at the cluster log for the last 2 to 3 minutes. Here is the command to generate the log for the last 3 minutes.
Get-ClusterLog -TimeSpan 3
So, this is my favorite command after reproducing a cluster issue on the local node.
Get-ClusterLog -Node SRV1 -TimeSpan 2 -UseLocalTime -Destination C:\
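The same idea can be wrapped in a small script so that each run lands in its own folder. This is only a sketch: the C:\Temp base folder, the folder naming pattern and the 5-minute window are assumptions for illustration, and it assumes the FailoverClusters module is available in an elevated session on a cluster node.

```powershell
# Sketch: collect a short, local-time cluster log from the current node.
# Assumes the FailoverClusters module is installed and the session is elevated.
Import-Module FailoverClusters

# Hypothetical destination: a timestamped folder so repeated runs do not overwrite each other.
$destination = Join-Path 'C:\Temp' ("ClusterLog_{0:yyyyMMdd_HHmmss}" -f (Get-Date))
New-Item -ItemType Directory -Path $destination -Force | Out-Null

# Last 5 minutes, local time, current node only.
Get-ClusterLog -Node $env:COMPUTERNAME -TimeSpan 5 -UseLocalTime -Destination $destination

# Show what was written.
Get-ChildItem -Path $destination
```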
Hopefully it would be useful.
Posted in Cluster, Tips and Tricks, Troubleshooting | Tagged: get-clusterlog, parameters, Tips and Tricks
Posted by blakhani on May 19, 2015
My job revolves around troubleshooting and fixing broken things. Here is one situation which I ran into recently and for which I was unable to find a solution on the internet. It is my responsibility to provide a self-assist option to the SQL community so that they can find the problem and fix it by themselves.
I was trying to install SQL Server 2012 on a 2-node Windows cluster. The installation failed with the error in the subject line ("Element not found", HRESULT 0x80070490). At first look it sounds like some COM+ error but, as always, the setup logs are the first place I look for errors. Here is the MSDN link which explains the various files created by setup:
https://msdn.microsoft.com/en-us/library/ms143702(v=sql.110).aspx (View and Read SQL Server Setup Log Files)
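As a rough sketch, the latest Detail.txt can be located and scanned for exceptions from PowerShell. It assumes the default SQL Server 2012 bootstrap log folder (version folder 110) under Program Files; the search pattern is only an illustration based on the error text in this post.

```powershell
# Sketch: find the most recent SQL Server 2012 setup run and scan Detail.txt for errors.
# Assumes the default bootstrap log location for SQL Server 2012 (version folder 110).
$logRoot = Join-Path $env:ProgramFiles 'Microsoft SQL Server\110\Setup Bootstrap\Log'

# Each setup run writes to its own timestamped subfolder; pick the newest one.
$latestRun = Get-ChildItem -Path $logRoot -Directory |
    Sort-Object LastWriteTime -Descending |
    Select-Object -First 1

# Show lines that mention an exception, followed by five lines of context.
$detail = Join-Path $latestRun.FullName 'Detail.txt'
Select-String -Path $detail -Pattern 'threw an exception|Exception from HRESULT' -Context 0,5
```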
The information I saw in the setup logs was pretty interesting. In particular, I looked into the Detail.txt file, which is the parent file of all MSI logs. (I have removed the date and time for better reading.)
Error: Action "Microsoft.SqlServer.Configuration.SetupExtension.ValidateFeatureSettingsAction" threw an exception during execution.
Microsoft.SqlServer.Setup.Chainer.Workflow.ActionExecutionException: Element not found. (Exception from HRESULT: 0x80070490) ---> System.Runtime.InteropServices.COMException (0x80070490): Element not found. (Exception from HRESULT: 0x80070490)
at Microsoft.SqlServer.Configuration.ClusterConfiguration.ClusterDiskPublicConfigObject.IsPathOnSharedDisk(String path)
at Microsoft.SqlServer.Configuration.SetupExtension.SlpInputSettings.ValidateNotOnSharedDisk(ValidationState vs, String directoryName, String bindingKey, String errorMessage)
at Microsoft.SqlServer.Configuration.SetupExtension.SlpInputSettings.Validate_InstallSharedDir(ValidationState vs)
at Microsoft.SqlServer.Configuration.SetupExtension.ValidateFeatureSettingsAction.ExecuteAction(String actionId)
at Microsoft.SqlServer.Chainer.Infrastructure.Action.Execute(String actionId, TextWriter errorStream)
at Microsoft.SqlServer.Setup.Chainer.Workflow.ActionInvocation.ExecuteActionHelper(TextWriter statusStream, ISequencedAction actionToRun)
The stack reads from bottom to top. Reading the function calls (IsPathOnSharedDisk, ValidateNotOnSharedDisk), anyone can conclude that something is happening with the cluster disks. That's a good hint. So, I went back to Failover Cluster Manager and looked at Disks under "Storage". There was a disk in a failed state. I was not able to bring it online, and that's THE problem! SQL Setup enumerates the disks to find eligible ones which can be used, and it was not able to get details about that disk. Here was the error when I attempted to bring it online.
The resource ‘Cluster Disk 1’ did not come online.
The desired state change for ‘Cluster Disk 1’ did not occur before the timeout expired.
I realized that I had played with iSCSI and messed up the disk which was presented.
Solution: Delete the disks which are not able to come online under “Storage > Disks” or “Available Storage” in the Failover Cluster Manager interface.
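The same check can be sketched in PowerShell with the FailoverClusters module, assuming it is run in an elevated session on a cluster node. The removal line uses -WhatIf so it only previews the action; which disks to delete is a judgment call for your own environment.

```powershell
# Sketch: find cluster disks that are not online, then preview their removal.
Import-Module FailoverClusters

# List physical disk resources in any state other than Online (for example Failed or Offline).
$badDisks = Get-ClusterResource |
    Where-Object { [string]$_.ResourceType -eq 'Physical Disk' -and $_.State -ne 'Online' }
$badDisks | Format-Table Name, State, OwnerGroup, OwnerNode -AutoSize

# After confirming a disk is truly unwanted, it can be removed from the cluster.
# -WhatIf only previews the action; drop it to perform the removal.
$badDisks | Remove-ClusterResource -WhatIf
```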
Hope this helps.
Author: SQL Server 2012 AlwaysOn – Paperback, Kindle
Posted in Cluster, Installation, Setup, SQL Server 2012, SQL Server 2014 | Tagged: 0x80070490, Cluster, Element not found, setup, sql server