As many of you might know that currently I have been supporting SQL Server in Azure Virtual Machines. Most of the concepts remains same but one of the common issue which was reported by many customer is about Always On listener connectivity. In this blog we would talk about the possible causes of SQL Server Always On Availability Group Listener not working in Azure SQL VMs.
The most common issue is that the listener got created successfully but it connects only from the server which is primary replica. Note that this issue will not occur in on-premise because there is no ILB needed.
Possible Cause # 1
Many of our customers have come to Microsoft via Azure support ticket, after following below article. (This is the most common issue)
Configure a load balancer for an Always On availability group in Azure
Is there something wrong with the article? No, there is nothing wrong there. Its mostly a oversight when customer miss running below script.
$ClusterNetworkName = "<MyClusterNetworkName>" # the cluster network name (Use Get-ClusterNetwork on Windows Server 2012 of higher to find the name) $IPResourceName = "<IPResourceName>" # the IP Address resource name $ILBIP = "<n.n.n.n>" # the IP Address of the Internal Load Balancer (ILB). This is the static IP address for the load balancer you configured in the Azure portal. [int]$ProbePort = <nnnnn> Import-Module FailoverClusters Get-ClusterResource $IPResourceName | Set-ClusterParameter -Multiple @{"Address"="$ILBIP";"ProbePort"=$ProbePort;"SubnetMask"="255.255.255.255";"Network"="$ClusterNetworkName";"EnableDhcp"=0}
The parameters are explained in the article which I mentioned in the beginning. This is the script which maps probe port and IP which you have defined in the portal. Here is the PowerShell script which I use to figure out above (Thanks to my team member Ghufran)
#-- Probe Check Script: #-------------------------------------------------------------- Clear-Host Get-ClusterResource |` Where-Object {$_.ResourceType.Name -like "IP Address"} |` Get-ClusterParameter |` Where-Object {($_.Name -like "Network") -or ($_.Name -like "Address") -or ($_.Name -like "ProbePort") -or ($_.Name -like "SubnetMask")}
Above script shows us the output which can be used to fill in first PowerShell script.
Possible Cause # 2
Second possible cause if that Load Balancing rules are not configured correctly. This information is also mentioned in the article.
Most of the time two values are set incorrectly “Session persistence” and/or “Floating IP (direct server return)”. If we choose any value other than what mentioned in above image, you would get connectivity issue from passive/other nodes.
Possible Cause # 3
In few customer cases I have seen that they run PowerShell for “Cluster IP Address” also. There is no real need to
Possible Cause # 4
In few cases, we have also seen that probe port was not opened in firewall. You should enable 1433 (SQL Server listening Port), 5022 (Always On Endpoint port), 59999 (Probe Port). If you are using any non-default port then take care of them in firewall.
Possible Cause # 5
If you are using NSG in Azure then make sure ports mentioned in #4 is are open.
This is the list I have compiled based on support cases reported. I will keep adding more items as and when they are reported.