Help: SQL Server

Sharing my knowlege about SQL Server Troubleshooting Skills

  • Blog Stats

    • 1,387,443 hits
  • Select GETDATE()

    May 2018
    M T W T F S S
    « Apr    
  • Advertisements

Posts Tagged ‘AlwaysOn’

Help: SQL Server Always On Availability Group Listener not working in Azure SQL VMs (Using Internal Load Balancer – ILB). How to troubleshoot?

Posted by blakhani on April 17, 2018

As many of you might know that currently I have been supporting SQL Server in Azure Virtual Machines. Most of the concepts remains same but one of the common issue which was reported by many customer is about Always On listener connectivity. In this blog we would talk about the possible causes of SQL Server Always On Availability Group Listener not working in Azure SQL VMs.

The most common issue is that the listener got created successfully but it connects only from the server which is primary replica. Note that this issue will not occur in on-premise because there is no ILB needed.

Possible Cause # 1

Many of our customers have come to Microsoft via Azure support ticket, after following below article. (This is the most common issue)

Configure a load balancer for an Always On availability group in Azure

Is there something wrong with the article? No, there is nothing wrong there. Its mostly a oversight when customer miss running below script.

$ClusterNetworkName = "<MyClusterNetworkName>" # the cluster network name (Use Get-ClusterNetwork on Windows Server 2012 of higher to find the name)
$IPResourceName = "<IPResourceName>" # the IP Address resource name
$ILBIP = "<n.n.n.n>" # the IP Address of the Internal Load Balancer (ILB). This is the static IP address for the load balancer you configured in the Azure portal.
[int]$ProbePort = <nnnnn>

Import-Module FailoverClusters

Get-ClusterResource $IPResourceName | Set-ClusterParameter -Multiple @{"Address"="$ILBIP";"ProbePort"=$ProbePort;"SubnetMask"="";"Network"="$ClusterNetworkName";"EnableDhcp"=0}

The parameters are explained in the article which I mentioned in the beginning. This is the script which maps probe port and IP which you have defined in the portal. Here is the PowerShell script which I use to figure out above (Thanks to my team member Ghufran)

#--  Probe Check Script:
Get-ClusterResource |`
Where-Object {$_.ResourceType.Name -like "IP Address"} |`
Get-ClusterParameter |`
Where-Object {($_.Name -like "Network") -or ($_.Name -like "Address") -or ($_.Name -like "ProbePort") -or ($_.Name -like "SubnetMask")}

Above script shows us the output which can be used to fill in first PowerShell script.

Possible Cause # 2

Second possible cause if that Load Balancing rules are not configured correctly. This information is also mentioned in the article.


Most of the time two values are set incorrectly “Session persistence” and/or “Floating IP (direct server return)”. If we choose any value other than what mentioned in above image, you would get connectivity issue from passive/other nodes.

Possible Cause # 3

In few customer cases I have seen that they run PowerShell for “Cluster IP Address” also. There is no real need to

Possible Cause # 4

In few cases, we have also seen that probe port was not opened in firewall. You should enable 1433 (SQL Server listening Port), 5022 (Always On Endpoint port), 59999 (Probe Port). If you are using any non-default port then take care of them in firewall.

Possible Cause # 5

If you are using NSG in Azure then make sure ports mentioned in #4 is are open.

This is the list I have compiled based on support cases reported. I will keep adding more items as and when they are reported.

  • Cheers,
  • Balmukund Lakhani
  • Twitter @blakhani
  • Author: SQL Server 2012 AlwaysOnPaperback, Kindle
  • Advertisements

    Posted in Always On, AlwaysOn, Azure ILB, listener, Microsoft Azure | Tagged: , , , , , | 1 Comment »

    AlwaysOn – How many databases can be added in Availability Group? Any hard limit?

    Posted by blakhani on April 14, 2015

    This is one of the common question asked. This blog has list of resources which can be useful in getting answer. First lets look at books online.

    Prerequisites, Restrictions, and Recommendations for AlwaysOn Availability Groups (SQL Server)

    Maximum number of availability groups and availability databases per computer: The actual number of databases and availability groups you can put on a computer (VM or physical) depends on the hardware and workload, but there is no enforced limit. Microsoft has extensively tested with 10 AGs and 100 DBs per physical machine. Signs of overloaded systems can include, but are not limited to, worker thread exhaustion, slow response times for AlwaysOn system views and DMVs, and/or stalled dispatcher system dumps. Please make sure to thoroughly test your environment with a production-like workload to ensure it can handle peak workload capacity within your application SLAs. When considering SLAs be sure to consider load under failure conditions as well as expected response times.

    In general, the more databases that are replicated and the more secondary replicas that exist – the more worker threads and more memory that will be consumed just to have the AlwaysOn infrastructure.  As the text above indicates, there is no enforced limit, but the more you have the more worker threads and memory will be needed.   If there are insufficient worker threads you will probably see error messages in the SQL Error log similar to:

    “The thread pool for AlwaysOn Availability Groups was unable to start a new worker thread because there are not enough available worker threads.  This may degrade AlwaysOn Availability Groups performance.  Use the "max worker threads" configuration option to increase number of allowable threads.”

    If starved for memory, you could see many different error messages – that may or may not look like they relate to AlwaysOn. One possible message could be:

    “Could not start the AlwaysOn Availability Groups transport manager. This failure probably occurred because a low memory condition existed when the message dispatcher started up. If so, other internal tasks might also have experienced errors. Check the SQL Server error log and the Windows error log for additional error messages. If a low memory condition exists, investigate and correct its cause.”

    Here are other blogs which explain the number of threads in worker pool to support availability group.

    AlwaysOn – HADRON Learning Series:  Worker Pool Usage for HADRON enabled databases

    Monitoring SQL Server 2012 AlwaysOn Availability Groups Worker Thread Consumption

    Hope this helps.

  • Cheers,
  • Balmukund Lakhani
  • Twitter @blakhani
  • Posted in AlwaysOn, SQL Server 2012, SQL Server 2014 | Tagged: , , , | Leave a Comment »

    Tips and Tricks : Creating Linked Server to an Availability Group Listener with ReadOnly routing

    Posted by blakhani on September 30, 2014

    In my recent interaction with customer, I was asked if linked server can be created to an availability group listener? My immediate response was “Yes”. Next question was immediately thrown at me that if they can redirect this listener to secondary replica? This time my little delayed response was – Yes. Provided we use proper connection string. Here is the linked server creation script

     EXEC sp_addlinkedserver @server = N'MyLinkedServer'
        ,@provider = N'SQLNCLI'
        ,@srvproduct = N'SqlServer'
        ,@datasrc = N'MyListener'
        ,@provstr = N'ApplicationIntent=ReadOnly'

    …and it was not working as they expected.

    Whenever we want to use routing of read-only connection via listener and connections are not going to secondary replica then below checklist should help.

    Here is the checklist which I have created.

    • Verify that we are connecting to Listener Name.
    • Verify that the Read-Only routing list (READ_ONLY_ROUTING_LIST) is defined.
    • Verify that the Routing URL (READ_ONLY_ROUTING_URL) of each instance has a proper FQDN and port combination.
    • Verify that ApplicationIntent is specified in the connection string.
    • Verify that the Sync_State is SYNCHRONIZED or SYNCHORNIZING for the secondary replica
    • Verify that the secondary replicas are set to allow connections
    • Verity that the initial catalog is provided in connection string.

    If complete checklist is followed, it should resolve most of the routing problems. Most common mistake which I have seen is to miss creating routing URL and routing list. This may be due to the fact that there is no User Interface available and it has to be done by T-SQL.

    Coming back to original question – I asked them to use proper connection string while creating listener. They followed the checklist and they have missed last item the initial catalog. Here is my version of the script.

    USE [master]
    EXEC master.dbo.sp_addlinkedserver @server = N'MyLinkedServer'
        ,@srvproduct = N'SQL'
        ,@provider = N'SQLNCLI11'
        ,@datasrc = N'MyListener'
        ,@provstr = N'Integrated Security=SSPI;Initial Catalog=Production;Data Source=MyListener;ApplicationIntent=ReadOnly'
        ,@catalog = N'Production'

    How would you test whether the connection is working fine or not? That’s simple.

    select * from openquery(MyLinkedServer,'select @@servername')


    If Server name is returned as secondary replica then routing is working perfectly fine. If you are getting primary replica then routing has a problem. The best way to troubleshoot further would be to use SQLCMD and follow the checklist provided earlier in this blog.

    sqlcmd -S MyListener -E -d Production -K ReadOnly

    Hope this helps!

    P.S. Checklist is explained in detail in my book SQL Server 2012 AlwaysOnPaperback, Kindle

  • Cheers,
  • Balmukund Lakhani
  • Twitter @blakhani
  • Posted in Tips and Tricks | Tagged: , , , , , | 6 Comments »