Help: SQL Server

Sharing my knowledge about SQL Server Troubleshooting Skills

Archive for the ‘AlwaysOn’ Category

AlwaysOn – How many databases can be added in Availability Group? Any hard limit?

Posted by blakhani on April 14, 2015


This is one of the most commonly asked questions. This post lists resources that help answer it. First, let's look at Books Online.

Prerequisites, Restrictions, and Recommendations for AlwaysOn Availability Groups (SQL Server)
http://msdn.microsoft.com/en-us/library/ff878487.aspx

Maximum number of availability groups and availability databases per computer: The actual number of databases and availability groups you can put on a computer (VM or physical) depends on the hardware and workload, but there is no enforced limit. Microsoft has extensively tested with 10 AGs and 100 DBs per physical machine. Signs of overloaded systems can include, but are not limited to, worker thread exhaustion, slow response times for AlwaysOn system views and DMVs, and/or stalled dispatcher system dumps. Please make sure to thoroughly test your environment with a production-like workload to ensure it can handle peak workload capacity within your application SLAs. When considering SLAs be sure to consider load under failure conditions as well as expected response times.

In general, the more databases that are replicated and the more secondary replicas that exist, the more worker threads and memory are consumed just to run the AlwaysOn infrastructure. As the text above indicates, there is no enforced limit; the practical limit is set by the worker threads and memory available on the instance. If there are insufficient worker threads, you will probably see messages in the SQL Server error log similar to:

“The thread pool for AlwaysOn Availability Groups was unable to start a new worker thread because there are not enough available worker threads.  This may degrade AlwaysOn Availability Groups performance.  Use the "max worker threads" configuration option to increase number of allowable threads.”
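A minimal sketch of how one might check worker thread headroom on a replica, using standard system views. Nothing here is specific to the environment in this post, and what counts as "too many" waiting tasks is an assumption you need to settle for your own workload.

```sql
-- Configured value of "max worker threads"; 0 means SQL Server computes it at startup.
SELECT name, value_in_use
FROM sys.configurations
WHERE name = N'max worker threads';

-- Effective ceiling actually in force on this instance.
SELECT max_workers_count
FROM sys.dm_os_sys_info;

-- Workers that exist right now, and tasks queued because no worker was free.
SELECT SUM(current_workers_count) AS current_workers,
       SUM(active_workers_count)  AS active_workers,
       SUM(work_queue_count)      AS tasks_waiting_for_worker
FROM sys.dm_os_schedulers
WHERE status = N'VISIBLE ONLINE';
```

If current_workers stays close to max_workers_count while tasks_waiting_for_worker is above zero, the instance is likely near the condition described by the error message above.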

If the instance is starved for memory, you could see many different error messages, which may or may not look related to AlwaysOn. One possible message is:

“Could not start the AlwaysOn Availability Groups transport manager. This failure probably occurred because a low memory condition existed when the message dispatcher started up. If so, other internal tasks might also have experienced errors. Check the SQL Server error log and the Windows error log for additional error messages. If a low memory condition exists, investigate and correct its cause.”

Here are other blog posts that explain how many threads the worker pool needs to support availability groups.

AlwaysOn – HADRON Learning Series:  Worker Pool Usage for HADRON enabled databases
http://blogs.msdn.com/b/psssql/archive/2012/05/17/alwayson-hadron-learning-series-worker-pool-usage-for-hadron-enabled-databases.aspx

Monitoring SQL Server 2012 AlwaysOn Availability Groups Worker Thread Consumption
http://blogs.msdn.com/b/sql_pfe_blog/archive/2013/07/15/monitoring-sql-server-2012-alwayson-availability-groups-worker-thread-consumption.aspx

Hope this helps.

  • Cheers,
  • Balmukund Lakhani
  • Twitter @blakhani
  • Posted in AlwaysOn, SQL Server 2012, SQL Server 2014

    AlwaysOn : Better Together – Using AlwaysOn Availability Group and Log Shipping

    Posted by blakhani on July 3, 2014


    In the recent past, one of my friends came to me with a requirement: she wanted to use log shipping along with an AlwaysOn Availability Group. I asked, “Are you not happy with 4 secondary replicas in SQL 2012 and 8 secondary replicas in SQL 2014? Do you want more?” She said that it was not about the number of secondary replicas; it was about controlling the data movement and the restore delay on the secondary database in log shipping. This made me think more broadly, and here is a list of reasons why someone would use an AlwaysOn Availability Group and log shipping together.

    • Delayed Recovery – In log shipping we can set a definite restore delay on the secondary database. This safeguards the DBA from “oops!” and “was that the production server?” situations. Log shipping can control the delay of transaction log restores, while an asynchronous secondary replica cannot.
    • Single DR Server – A single server can be the destination for multiple log shipping pairs.
    • Infrastructure – The server at the DR site can’t be part of the current cluster due to infrastructure limitations.
    • Technical – The secondary server is already part of a different Windows failover cluster. The replicas of an availability group can’t span two Windows clusters.

    Keeping all of the above in mind, someone might want to use AlwaysOn Availability Group and Log Shipping together.
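As a rough illustration of the delayed recovery point, the restore delay of an existing log shipping secondary can be set with the standard log shipping procedure shown below. The database name `Production` and the 240-minute delay are hypothetical values chosen for this sketch, not settings from the topology in this post.

```sql
-- Run on the log shipping secondary server.
-- 'Production' and 240 are illustrative values; substitute your own.
EXEC master.dbo.sp_change_log_shipping_secondary_database
     @secondary_database = N'Production',
     @restore_delay      = 240;   -- minutes to wait before a log backup is restored

-- Confirm the setting that the restore job will use.
SELECT secondary_database, restore_delay
FROM msdb.dbo.log_shipping_secondary_databases;
```

With a delay like this, a mistaken change on the primary has not yet been applied to the log shipping secondary for the length of the delay, which is the window the DBA has to react.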

    Here is a typical topology which can be deployed.

    image

    The number of replicas depends on your own environment. A few important points to note:

    • We need to configure both servers in the availability group as log shipping primaries with the same destination. This can be done by script or through the UI, based on your choice.
    • Log shipping is aware of the backup preference and backup priority of the availability group. This means that we can offload log backups to a secondary replica.
    • Backups should be written to a shared location so that the copy job always picks up files from the same place, even after a failover or role change in the availability group.
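To see which replica the backup preference currently points to, something like the following read-only queries can be used. This is only a sketch; the database name `Production` is borrowed from the example later on this page and is an assumption here.

```sql
-- Backup preference configured for each availability group.
SELECT name, automated_backup_preference_desc
FROM sys.availability_groups;

-- Backup priority of each replica (higher value is preferred).
SELECT replica_server_name, backup_priority
FROM sys.availability_replicas;

-- Returns 1 when the local replica is the preferred place to back up this database.
-- 'Production' is an illustrative database name.
SELECT sys.fn_hadr_backup_is_preferred_replica(N'Production') AS is_preferred_backup_replica;
```

Running the last query on each replica should return 1 on exactly one of them at a time, which is where the log backup is expected to happen.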

    In my next blog post, I will show a step-by-step deployment guide for this configuration.

  • Author: SQL Server 2012 AlwaysOn (Paperback, Kindle)
  • Posted in AlwaysOn

    Solution : The connection to the primary replica is not active. The command cannot be processed.

    Posted by blakhani on July 1, 2014


    It has been close to a year since I published my first book (SQL Server 2012 AlwaysOn: Paperback, Kindle), and since then many DBAs have contacted me to troubleshoot various issues related to AlwaysOn Availability Groups. One of the most common errors I have seen is shown below.

    Msg 35250, Level 16, State 7, Line 1
    The connection to the primary replica is not active. The command cannot be processed.

    This error mostly appears when we try to join a database to an availability group, whether through the UI, T-SQL, or PowerShell.

    SSMS UI:

    While creating a new availability group, we might receive the error below, and the “join” step would fail.

    image

    Here is the message in text format.

    TITLE: Microsoft SQL Server Management Studio
    ——————————
    Joining database on secondary replica resulted in an error.  (Microsoft.SqlServer.Management.HadrTasks)
    ——————————
    ADDITIONAL INFORMATION:
    Failed to join the database ‘Production’ to the availability group ‘ProductionAG’ on the availability replica ‘SRV2’. (Microsoft.SqlServer.Smo)
    For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft+SQL+Server&ProdVer=11.0.2100.60+((SQL11_RTM).120210-1917+)&EvtSrc=Microsoft.SqlServer.Management.Smo.ExceptionTemplates.FailedOperationExceptionText&LinkId=20476
    ——————————
    An exception occurred while executing a Transact-SQL statement or batch. (Microsoft.SqlServer.ConnectionInfo)
    ——————————
    The connection to the primary replica is not active.  The command cannot be processed. (Microsoft SQL Server, Error: 35250)
    For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft%20SQL%20Server&ProdVer=11.00.2100&EvtSrc=MSSQLServer&EvtID=35250&LinkId=20476
    ——————————
    BUTTONS:
    OK
    ——————————

    T-SQL:

    image

    Msg 35250, Level 16, State 7, Line 1
    The connection to the primary replica is not active. The command cannot be processed.
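The screenshot is not reproduced here. As an assumption about what it showed, the usual T-SQL statement for joining a secondary database looks like the sketch below, using the database and availability group names from the SSMS message above.

```sql
-- Run on the secondary replica (SRV2 in this post), after the database
-- has been restored there WITH NORECOVERY.
ALTER DATABASE [Production] SET HADR AVAILABILITY GROUP = [ProductionAG];
```

When the secondary cannot reach the primary's endpoint, this statement is expected to fail with Msg 35250.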

    PowerShell:

    Add-SqlAvailabilityDatabase -Path "SQLSERVER:\SQL\SRV2\DEFAULT\AvailabilityGroups\ProductionAG" -Database "Production"

    **********************************
    Add-SqlAvailabilityDatabase : The connection to the primary replica is not active.  The command cannot be processed.
    At line:1 char:28
    + Add-SqlAvailabilityDatabase <<<<  -Path "SQLSERVER:\SQL\SRV2\DEFAULT\AvailabilityGroups\ProductionAG" -Database "Production"
         + CategoryInfo          : InvalidOperation: (:) [Add-SqlAvailabilityDatabase], SqlException
         + FullyQualifiedErrorId : ExecutionFailed,Microsoft.SqlServer.Management.PowerShell.Hadr.AddSqlAvailabilityGroupDatabaseCommand
    **********************************

    Solution

    I always suggest that they start with the SQL Server error log. Here is the error that most DBAs have found there.

    2014-06-30 17:29:33.500 Logon        Database Mirroring login attempt by user ‘HADOMAIN\SRV1$.’ failed with error: ‘Connection handshake failed. The login ‘HADOMAIN\SRV1$’ does not have CONNECT permission on the endpoint. State 84.’.  [CLIENT: 192.168.1.11]

    In the above message, HADOMAIN is my domain name and SRV1 is the host name of the SQL Server instance hosting the primary replica.

    Here is what has solved the issue for them.

    • Change the SQL Server service account to a domain account and grant it CONNECT permission on the endpoint of each instance. If the replicas use different domain accounts, add the service accounts of all secondary replicas as logins on the primary replica.
    • If we are using a non-domain account (like LocalSystem or NT Service\MSSQLServer) as the service account and we can’t change it to a domain account, then we need to create the machine account as a login and grant it CONNECT permission. In our case the machine name is SRV1, so the machine account is HADOMAIN\SRV1$ (the trailing $ denotes a computer account).

       

      create login [HADOMAIN\SRV1$] from windows;
      go
      grant connect on endpoint::Mirroring to [HADOMAIN\SRV1$];
      go

    Note: The endpoint name might be different on your instance. Pick the name shown in the image below. If you configured the availability group through the UI, it should be Hadr_endpoint.

    image
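If the image is not available, the endpoint name and the logins that already hold permissions on it can be read from the catalog views. This is a sketch that uses only standard system views; nothing in it is specific to this environment.

```sql
-- Name and state of the database mirroring endpoint used by availability groups.
SELECT name, state_desc, role_desc, connection_auth_desc
FROM sys.database_mirroring_endpoints;

-- Logins that have been granted (or denied) permissions on that endpoint.
SELECT e.name             AS endpoint_name,
       p.name             AS grantee,
       sp.permission_name,
       sp.state_desc
FROM sys.server_permissions AS sp
INNER JOIN sys.endpoints AS e
        ON e.endpoint_id = sp.major_id
INNER JOIN sys.server_principals AS p
        ON p.principal_id = sp.grantee_principal_id
WHERE sp.class_desc = N'ENDPOINT'
  AND e.type_desc   = N'DATABASE_MIRRORING';
```

If the account named in the error log message does not appear in the second result with CONNECT and GRANT, the missing permission is the likely cause of the handshake failure.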

    If you are running a firewall, make sure that the port used by the availability group endpoint is not blocked. We can easily find the port using the query below:

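    -- List TCP endpoints and their ports; availability groups use the DATABASE_MIRRORING endpoint.
    SELECT
        e.name AS [EndpointName],
        e.type_desc AS [EndpointType],
        te.port AS [ListenerPort],
        te.is_dynamic_port AS [IsDynamicPort],
        ISNULL(te.ip_address, '') AS [ListenerIPAddress],
        CAST(CASE WHEN te.endpoint_id < 65536 THEN 1 ELSE 0 END AS bit) AS [IsSystemObject]
    FROM sys.endpoints AS e
    INNER JOIN sys.tcp_endpoints AS te ON te.endpoint_id = e.endpoint_id;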
    
    image

    Make sure that you have added an exception for this port in the firewall.
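After the permission and firewall changes, one way to confirm that the replicas can talk to each other is to look at the connection state reported by the availability group DMVs. This is a sketch that uses standard views only, and the primary replica generally gives the most complete picture.

```sql
-- Connection state of each replica, with the last connection error if any.
SELECT ar.replica_server_name,
       ars.role_desc,
       ars.connected_state_desc,
       ars.last_connect_error_number,
       ars.last_connect_error_description
FROM sys.dm_hadr_availability_replica_states AS ars
INNER JOIN sys.availability_replicas AS ar
        ON ar.replica_id = ars.replica_id;
```

A replica that still shows DISCONNECTED here, along with a populated last_connect_error_description, suggests that the endpoint permission or the port is still not sorted out.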

    This is already documented in Books Online:

    {

    If any server instances that are hosting the availability replicas for an availability group run as different accounts, the login each account must be created in master on the other server instance. Then, that login must be granted CONNECT permissions to connect to the database mirroring endpoint of that server instance.

    }

    Hope this helps.

  • Posted in AlwaysOn, Troubleshooting