Help: SQL Server

Sharing my knowlege about SQL Server Troubleshooting Skills

Archive for the ‘SQL Server 2012’ Category

SQL Cluster Setup Error – System.Runtime.InteropServices.COMException (0x80070490): Element not found

Posted by blakhani on May 19, 2015


My job revolves around troubleshooting and fixing the broken thing. Here is one of the situation which I ran into recently and was unable to find solution on internet. It my responsibility to provide self-assist option to the SQL community so that they can find the problem and fix by themselves.

I was trying to install SQL Server 2012 on a 2 nodes Windows cluster. When I tried installing it, it failed with error in subject line. At this first look it sounds like some COM+ error but as always, setup logs are my first place to find the errors. Here is the MSDN link which explains the various files created by setup

https://msdn.microsoft.com/en-us/library/ms143702(v=sql.110).aspx (View and Read SQL Server Setup Log Files)

The information which I saw in setup logs was pretty interesting. In particular, I looked into Detail.txt file which is the parent file of all MSI logs. (I have removed date and time for better reading)

Error: Action "Microsoft.SqlServer.Configuration.SetupExtension.ValidateFeatureSettingsAction" threw an exception during execution.
Microsoft.SqlServer.Setup.Chainer.Workflow.ActionExecutionException: Element not found. (Exception from HRESULT: 0x80070490) —> System.Runtime.InteropServices.COMException (0x80070490): Element not found. (Exception from HRESULT: 0x80070490)
   at Microsoft.SqlServer.Interop.MSClusterLib.ISClusResource.get_Disk()
   at Microsoft.SqlServer.Configuration.Cluster.ClusterPhysicalDisk.get_Partitions()
   at Microsoft.SqlServer.Configuration.ClusterConfiguration.ClusterDiskPublicConfigObject.IsPathOnSharedDisk(String path)
   at Microsoft.SqlServer.Configuration.SetupExtension.SlpInputSettings.ValidateNotOnSharedDisk(ValidationState vs, String directoryName, String bindingKey, String errorMessage)
   at Microsoft.SqlServer.Configuration.SetupExtension.SlpInputSettings.Validate_InstallSharedDir(ValidationState vs)
   at Microsoft.SqlServer.Configuration.SetupExtension.SlpInputSettings.ValidateSettings()
   at Microsoft.SqlServer.Configuration.SetupExtension.ValidateFeatureSettingsAction.ExecuteAction(String actionId)
   at Microsoft.SqlServer.Chainer.Infrastructure.Action.Execute(String actionId, TextWriter errorStream)
   at Microsoft.SqlServer.Setup.Chainer.Workflow.ActionInvocation.ExecuteActionHelper(TextWriter statusStream, ISequencedAction actionToRun)

 

The stack goes from bottom to Top. If we read the function calls made, anyone can conclude that there is something happening with cluster disks. That’s a good hint. So, I went back to failover cluster manager and looked into Disks under “Storage”. There was a disk which was in failed state. I was not able to bring this online and that’s THE problem! SQL Setup would enumerate the disks to find eligible disks which can be used and it’s not able to find details about that disk. Here was the error when I attempted to bring it online.

The resource ‘Cluster Disk 1’ did not come online.
The desired state change for ‘Cluster Disk 1’ did not occur before the timeout expired.

I realized that I have played with iSCSI and messed up the disk which was presented.

Solution: Delete the disks which are not able to come online under “Storage > Disks” or “Available Storage” in failover cluster manager interface.

Hope this helps.

  • Cheers,
  • Balmukund Lakhani
  • Twitter @blakhani
  • Author: SQL Server 2012 AlwaysOnPaperback, Kindle
  • Advertisements

    Posted in Cluster, Installation, Setup, SQL Server 2012, SQL Server 2014 | Tagged: , , , , | 3 Comments »

    AlwaysOn – How many databases can be added in Availability Group? Any hard limit?

    Posted by blakhani on April 14, 2015


    This is one of the common question asked. This blog has list of resources which can be useful in getting answer. First lets look at books online.

    Prerequisites, Restrictions, and Recommendations for AlwaysOn Availability Groups (SQL Server)
    http://msdn.microsoft.com/en-us/library/ff878487.aspx

    Maximum number of availability groups and availability databases per computer: The actual number of databases and availability groups you can put on a computer (VM or physical) depends on the hardware and workload, but there is no enforced limit. Microsoft has extensively tested with 10 AGs and 100 DBs per physical machine. Signs of overloaded systems can include, but are not limited to, worker thread exhaustion, slow response times for AlwaysOn system views and DMVs, and/or stalled dispatcher system dumps. Please make sure to thoroughly test your environment with a production-like workload to ensure it can handle peak workload capacity within your application SLAs. When considering SLAs be sure to consider load under failure conditions as well as expected response times.

    In general, the more databases that are replicated and the more secondary replicas that exist – the more worker threads and more memory that will be consumed just to have the AlwaysOn infrastructure.  As the text above indicates, there is no enforced limit, but the more you have the more worker threads and memory will be needed.   If there are insufficient worker threads you will probably see error messages in the SQL Error log similar to:

    “The thread pool for AlwaysOn Availability Groups was unable to start a new worker thread because there are not enough available worker threads.  This may degrade AlwaysOn Availability Groups performance.  Use the "max worker threads" configuration option to increase number of allowable threads.”

    If starved for memory, you could see many different error messages – that may or may not look like they relate to AlwaysOn. One possible message could be:

    “Could not start the AlwaysOn Availability Groups transport manager. This failure probably occurred because a low memory condition existed when the message dispatcher started up. If so, other internal tasks might also have experienced errors. Check the SQL Server error log and the Windows error log for additional error messages. If a low memory condition exists, investigate and correct its cause.”

    Here are other blogs which explain the number of threads in worker pool to support availability group.

    AlwaysOn – HADRON Learning Series:  Worker Pool Usage for HADRON enabled databases
    http://blogs.msdn.com/b/psssql/archive/2012/05/17/alwayson-hadron-learning-series-worker-pool-usage-for-hadron-enabled-databases.aspx

    Monitoring SQL Server 2012 AlwaysOn Availability Groups Worker Thread Consumption
    http://blogs.msdn.com/b/sql_pfe_blog/archive/2013/07/15/monitoring-sql-server-2012-alwayson-availability-groups-worker-thread-consumption.aspx

    Hope this helps.

  • Cheers,
  • Balmukund Lakhani
  • Twitter @blakhani
  • Posted in AlwaysOn, SQL Server 2012, SQL Server 2014 | Tagged: , , , | Leave a Comment »

    Solution : Suspect Database – SQL Server detected a DTC/KTM in-doubt transaction with UOW .Please resolve it following the guideline for Troubleshooting DTC Transactions.

    Posted by blakhani on January 27, 2015


    While testing application which uses distributed transaction, I caused a failure of SQL Server Database. When I performed failover I found that database didn’t recover and failed to come online. It was shown as “Suspect” in SQL Server Management Studio. I checked ERRORLOG and found below messages.

    2015-01-22 22:50:33.570    spid15s    Recovery of database ‘MyBadDatabase’ (5) is 0% complete (approximately 24 seconds remain). Phase 1 of 3. This is an informational message only. No user action is required.
    2015-01-22 22:50:33.990    spid15s    Recovery of database ‘MyBadDatabase’ (5) is 2% complete (approximately 24 seconds remain). Phase 1 of 3. This is an informational message only. No user action is required.
    2015-01-22 22:50:34.080    spid15s    Recovery of database ‘MyBadDatabase’ (5) is 0% complete (approximately 130 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
    2015-01-22 22:50:34.180    spid15s    Recovery of database ‘MyBadDatabase’ (5) is 11% complete (approximately 5 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
    2015-01-22 22:50:35.480    spid15s    Recovery of database ‘MyBadDatabase’ (5) is 44% complete (approximately 2 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
    2015-01-22 22:50:37.720    spid15s    10704 transactions rolled forward in database ‘MyBadDatabase’ (5). This is an informational message only. No user action is required.
    2015-01-22 22:50:37.820    spid7s    Attempting to initialize Microsoft Distributed Transaction Coordinator (MS DTC). This is an informational message only. No user action is required.
    2015-01-22 22:50:37.850    spid15s    CHECKDB for database ‘MyBadDatabase’ finished without errors on 2012-01-08 23:41:18.113 (local time). This is an informational message only; no user action is required.
    2015-01-22 22:50:38.870    spid7s    Attempting to recover in-doubt distributed transactions involving Microsoft Distributed Transaction Coordinator (MS DTC). This is an informational message only. No user action is required.
    2015-01-22 22:52:22.990    spid7s    SQL Server detected a DTC/KTM in-doubt transaction with UOW  {6BAC37B8-6515-4FC1-972B-C059D1D5133E}.Please resolve it following the guideline for Troubleshooting DTC Transactions.
    2015-01-22 22:52:22.990    spid7s    Error: 3437, Severity: 21, State: 3.
    2015-01-22 22:52:22.990    spid7s    An error occurred while recovering database ‘MyBadDatabase’. Unable to connect to Microsoft Distributed Transaction Coordinator (MS DTC) to check the completion status of transaction (0:743395420). Fix MS DTC, and run recovery again.
    2015-01-22 22:52:22.990    spid7s    Error: 3414, Severity: 21, State: 2.
    2015-01-22 22:52:22.990    spid7s    An error occurred during recovery, preventing the database ‘MyBadDatabase’ (database ID 5) from restarting. Diagnose the recovery errors and fix them, or restore from a known good backup. If errors are not corrected or expected, contact Technical Support.

     

    As we can see that database was trying to recover. Phase 2 of roll forward is complete and Phase 3 (Rollback phase) encountered an error. The error is cause due to the fact that my bad application uses MSDTC and I caused things to break in MSDTC itself. Ideal way to fix the problem would be to follow KB http://support.microsoft.com/default.aspx?scid=kb;EN-US;306366 (PRB: Errors 3437, 3414, and "Could Not Connect to DTC" Occur When You Start SQL Server)

    In my case, I have restarted DTC and I was not able to find UOW anywhere which I need to kill. When I checked sys.dm_tran_locks for column request_owner_guid for any GUID that is non ZERO. But I didn’t find any non Zero (6BAC37B8-6515-4FC1-972B-C059D1D5133E). Here are the steps I did to fix my database.

    sp_configure 'show advanced options', 1 
    GO 
    RECONFIGURE 
    GO 
    sp_configure 'in-doubt xact resolution', 2 --- presume abort 
    GO 
    RECONFIGURE 
    GO 
    
    

    Error log showed the change.

    2015-01-23 00:45:53.850 spid86       Configuration option ‘in-doubt xact resolution’ changed from 0 to 2. Run the RECONFIGURE statement to install.

    In your case, you need to check with application vendor/team to find what needs to be done with In-doubt transactions. Please use appropriate value (I have used 2) and then perform recovery of database.

    dbcc dbrecover(MyBadDatabase)
    

     

    Once above steps are performed, I was able to get database recovered. Here is the snippet from errorlog.

    2015-01-23 00:46:10.090 spid57       Bypassing recovery for database ID 5. This is an informational message only. No user action is required.

    2015-01-23 00:46:10.100 spid57       Starting up database ‘MyBadDatabase’.

    2015-01-23 00:47:28.570 spid57       Recovery of database ‘MyBadDatabase’ (5) is 0% complete (approximately 130 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.

    2015-01-23 00:47:28.670 spid57       Recovery of database ‘MyBadDatabase’ (5) is 1% complete (approximately 7 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.

    2015-01-23 00:47:29.630 spid57       Recovery of database ‘MyBadDatabase’ (5) is 29% complete (approximately 2 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.

    2015-01-23 00:47:34.800 spid57       1 transactions rolled back in database ‘MyBadDatabase’ (5). This is an informational message only. No user action is required. 2015-01-23 00:47:34.800 spid57       Recovery is writing a checkpoint in database ‘MyBadDatabase’ (5). This is an informational message only. No user action is required. 2015-01-23 00:47:34.820 spid57       Recovery completed for database MyBadDatabase (database ID 5) in 6 second(s) (analysis 20 ms, redo 3142 ms, undo 3056 ms.) This is an informational message only. No user action is required.

    2015-01-23 00:47:34.860 spid57       CHECKDB for database ‘MyBadDatabase’ finished without errors on 2012-01-08 23:41:18.113 (local time). This is an informational message only; no user action is required.

    Hopefully this would bring database online if it was caused due to DTC problem. There might be various causes for suspect database and this blog just covers one of them.

  • Cheers,
  • Balmukund Lakhani
  • Twitter @blakhani
  • Author: SQL Server 2012 AlwaysOnPaperback, Kindle
  • Posted in SQL Server, SQL Server 2012, SQL Server 2014 | Leave a Comment »