Posted by blakhani on May 19, 2015
My job revolves around troubleshooting and fixing the broken thing. Here is one of the situation which I ran into recently and was unable to find solution on internet. It my responsibility to provide self-assist option to the SQL community so that they can find the problem and fix by themselves.
I was trying to install SQL Server 2012 on a 2 nodes Windows cluster. When I tried installing it, it failed with error in subject line. At this first look it sounds like some COM+ error but as always, setup logs are my first place to find the errors. Here is the MSDN link which explains the various files created by setup
https://msdn.microsoft.com/en-us/library/ms143702(v=sql.110).aspx (View and Read SQL Server Setup Log Files)
The information which I saw in setup logs was pretty interesting. In particular, I looked into Detail.txt file which is the parent file of all MSI logs. (I have removed date and time for better reading)
Error: Action "Microsoft.SqlServer.Configuration.SetupExtension.ValidateFeatureSettingsAction" threw an exception during execution.
Microsoft.SqlServer.Setup.Chainer.Workflow.ActionExecutionException: Element not found. (Exception from HRESULT: 0x80070490) —> System.Runtime.InteropServices.COMException (0x80070490): Element not found. (Exception from HRESULT: 0x80070490)
at Microsoft.SqlServer.Configuration.ClusterConfiguration.ClusterDiskPublicConfigObject.IsPathOnSharedDisk(String path)
at Microsoft.SqlServer.Configuration.SetupExtension.SlpInputSettings.ValidateNotOnSharedDisk(ValidationState vs, String directoryName, String bindingKey, String errorMessage)
at Microsoft.SqlServer.Configuration.SetupExtension.SlpInputSettings.Validate_InstallSharedDir(ValidationState vs)
at Microsoft.SqlServer.Configuration.SetupExtension.ValidateFeatureSettingsAction.ExecuteAction(String actionId)
at Microsoft.SqlServer.Chainer.Infrastructure.Action.Execute(String actionId, TextWriter errorStream)
at Microsoft.SqlServer.Setup.Chainer.Workflow.ActionInvocation.ExecuteActionHelper(TextWriter statusStream, ISequencedAction actionToRun)
The stack goes from bottom to Top. If we read the function calls made, anyone can conclude that there is something happening with cluster disks. That’s a good hint. So, I went back to failover cluster manager and looked into Disks under “Storage”. There was a disk which was in failed state. I was not able to bring this online and that’s THE problem! SQL Setup would enumerate the disks to find eligible disks which can be used and it’s not able to find details about that disk. Here was the error when I attempted to bring it online.
The resource ‘Cluster Disk 1’ did not come online.
The desired state change for ‘Cluster Disk 1’ did not occur before the timeout expired.
I realized that I have played with iSCSI and messed up the disk which was presented.
Solution: Delete the disks which are not able to come online under “Storage > Disks” or “Available Storage” in failover cluster manager interface.
Hope this helps.
Author: SQL Server 2012 AlwaysOn – Paperback, Kindle
Posted in Cluster, Installation, Setup, SQL Server 2012, SQL Server 2014 | Tagged: 0x80070490, Cluster, Element not found, setup, sql server | 3 Comments »
Posted by blakhani on April 14, 2015
This is one of the common question asked. This blog has list of resources which can be useful in getting answer. First lets look at books online.
Prerequisites, Restrictions, and Recommendations for AlwaysOn Availability Groups (SQL Server)
Maximum number of availability groups and availability databases per computer: The actual number of databases and availability groups you can put on a computer (VM or physical) depends on the hardware and workload, but there is no enforced limit. Microsoft has extensively tested with 10 AGs and 100 DBs per physical machine. Signs of overloaded systems can include, but are not limited to, worker thread exhaustion, slow response times for AlwaysOn system views and DMVs, and/or stalled dispatcher system dumps. Please make sure to thoroughly test your environment with a production-like workload to ensure it can handle peak workload capacity within your application SLAs. When considering SLAs be sure to consider load under failure conditions as well as expected response times.
In general, the more databases that are replicated and the more secondary replicas that exist – the more worker threads and more memory that will be consumed just to have the AlwaysOn infrastructure. As the text above indicates, there is no enforced limit, but the more you have the more worker threads and memory will be needed. If there are insufficient worker threads you will probably see error messages in the SQL Error log similar to:
“The thread pool for AlwaysOn Availability Groups was unable to start a new worker thread because there are not enough available worker threads. This may degrade AlwaysOn Availability Groups performance. Use the "max worker threads" configuration option to increase number of allowable threads.”
If starved for memory, you could see many different error messages – that may or may not look like they relate to AlwaysOn. One possible message could be:
“Could not start the AlwaysOn Availability Groups transport manager. This failure probably occurred because a low memory condition existed when the message dispatcher started up. If so, other internal tasks might also have experienced errors. Check the SQL Server error log and the Windows error log for additional error messages. If a low memory condition exists, investigate and correct its cause.”
Here are other blogs which explain the number of threads in worker pool to support availability group.
AlwaysOn – HADRON Learning Series: Worker Pool Usage for HADRON enabled databases
Monitoring SQL Server 2012 AlwaysOn Availability Groups Worker Thread Consumption
Hope this helps.
Posted in AlwaysOn, SQL Server 2012, SQL Server 2014 | Tagged: AG, AlwaysOn, Availability Group, sql server | Leave a Comment »
Posted by blakhani on March 25, 2015
Recently someone posted on Facebook group about a problem. He informed that whenever the backups are taken in SQL Server 2014 instance, it’s failing with error as below.
Msg 3013, Level 16, State 1, Line 4
BACKUP DATABASE is terminating abnormally.
When we open ERRORLOG we saw below
***Stack Dump being sent to D:\MSSQL\LOG\SQLDump0089.txt
SqlDumpExceptionHandler: Process 1144 generated fatal exception c0000005 EXCEPTION_ACCESS_VIOLATION. SQL Server is terminating this process.
* BEGIN STACK DUMP:
* 01/01/15 18:05:25 spid 1144
* Private server build.
* Exception Address = 00007FF85B652D45 Module(sqlmin+00000000009D2D45)
* Exception Code = c0000005 EXCEPTION_ACCESS_VIOLATION
* Access Violation occurred reading address 00007FFA0C208980
* Input Buffer 116 bytes –
* backup database test to disk=’c:\temp\test.bak’
As we can see that Access Violation is occurring while running backup database command. Whenever there is a dump generated, there is a MDMP file is also generated. Here is the stack when we analyze the dump. Yon can refer http://mssqlwiki.com/2012/10/16/sql-server-exception_access_violation-and-sql-server-assertion link to know more about identifying stack.
sqlmin!BackupFileDesc::InitPerfCounters+0xc <<Performance Counters Initialization.
sqlmin!SubprocEntrypoint+0xa7f << This is the child thread of main backup thread.
Here is the version of SQL Server.
Microsoft SQL Server 2014 – 12.0.2000.8 (X64)
Feb 20 2014 20:04:26
Copyright (c) Microsoft Corporation
Enterprise Edition: Core-based Licensing (64-bit) on Windows NT 6.3 <X64> (Build 9600: ) (Hypervisor)
This is identified as an issue with SQL Server 2014 RTM version. Here is the KB article – http://support.microsoft.com/en-us/kb/2973444 (FIX: "SQL Server performance counters are disabled" when you move the SQL Server resource in SQL Server 2014)
Fix: Cumulative Update 2 for SQL Server 2014 or any later version of SQL Server 2014.
If you are running higher version and above fix then you need to find out why performance monitor is not showing SQL Server related counters. The way to verify it would be to use DMV
select * from sys.dm_os_performance_counters
You should see many counters specifically look for “Backup/Restore Throughput/sec”. In SQL 2014 this is a new performance counter added to get backup/restore speed and that missing counter (Backup/Restore Throughput/sec) which is causing backup to fail.
Author: SQL Server 2012 AlwaysOn – Paperback, Kindle
Posted in EXCEPTION_ACCESS_VIOLATION, SQL Server, SQL Server 2014 | Tagged: backup, EXCEPTION_ACCESS_VIOLATION, SQL Server 2014 | 13 Comments »