Patching SQL AlwaysOn with Orchestrator is a bit different than the patching Hyper-V clusters with Orchestrator. If you’ve looked up the community runbook for cluster patching from Neil Peterson found here you’ll notice he does a great job of outlining the solution. He breaks down the processes to into manageable runbooks.
Before we get to the SQL patching additions to this process, you may want to add logging to the solution that Neil provided and to quick migrate the VM’s off if the Live Migrations aka cluster drain fails. During my testing of his solution I found that my cluster drain would randomly fail from time to time. I tested the solution on a 5-node 2012 and a 5-node 2012 R2 Hyper-v cluster. While the clusters were completely healthy and would appear to be functioning correctly the occasional drain would fail, thus causing the patching process to fail. Keep in mind the quick migrate will pause the VM prior to moving it to another node, which is different than the live migration which occurs without service interruption.
Here is my addition to the suspend cluster RB.
PowerShell for the quick migration:
Ok on to the SQL AlwaysOn Patching. First, if you’re unfamiliar with SQL AlwaysOn, read this for an overview. Additional information on performing a rolling upgrade for SQL availability groups found here. Now because we’re SQL Server Failover Cluster Instance (FCI) and Availability groups, we wanted to move the primary owner of the AO Group to another node of the FCI prior to patching the node; easy right? We’ll just drain the node…
Draining the cluster node is not the way to move the SQL AOGroup. You do not want to manage SQL AlwaysOn Groups from Cluster Manager, here’s why:
Do not use the Failover Cluster Manager to manipulate availability groups, for example:
- Do not use the Failover Cluster Manager to move availability groups to different nodes or to fail over availability groups. The Failover Cluster Manager is not aware of the synchronization status of the availability replicas, and doing so can lead to extended downtime. You must use Transact-SQL or SQL Server Management Studio.
- More info found here.
Luckily there’s some PowerShell we can use to accomplish this task. First you want to find the current Primary Owner node of the AO Group, then find somewhere to place the AO Group. The way I accomplished this was to use the PowerShell script listed below to output the results of the AO Group Name, the current owner Node of the AO Group, and another active Node to move the AO Group while we patch. All of my results are in a delimited list so I then use the data manipulation integration pack to parse the results.
Get Info Script:
The RunBook used to get the AO Group information and write to a database.
Because we have returned data on the data bus regarding the AO Group info, you can use this script/runbook later in the solution to move the AO group to a different node.
Switch Availability Group Script:
Runbook leveraging the script:
Now you can use Neil’s runbook for the actual patching process.