In this post I would like to share an interesting case from late 2015: a RECONFIGURE statement that hung on the HADR_AG_MUTEX wait type on an Always On availability group. One of our clients wanted us to change a few settings on a few servers: Max Degree of Parallelism, Max Server Memory, enabling the DAC, and enabling Optimize for Ad hoc Workloads. The settings were applied successfully on all servers except one, and that one was the primary node of an Always On availability group.
-- Enable Advanced Options
EXEC sp_configure 'show advanced options', 1
GO
RECONFIGURE
GO

-- Set Max Degree of Parallelism to 3
EXEC sp_configure 'max degree of parallelism', 3
-- Set Max Server Memory to 12GB
EXEC sp_configure 'max server memory', 12288
-- Enable Optimize for Ad-hoc Workloads
EXEC sp_configure 'optimize for ad hoc workloads', 1
-- Enable DAC connection
EXEC sp_configure 'remote admin connections', 1
RECONFIGURE
GO
How we resolved the RECONFIGURE statement hanging on the HADR_AG_MUTEX wait type
When I executed these statements, they kept running and did not complete. I waited for 2 minutes before reviewing what was happening. The SPID running the statements was in the suspended state, and when I looked at that SPID I found it was waiting on HADR_AG_MUTEX. According to an excerpt from the SQLSkills article, this wait type is seen when a thread is waiting to access a critical section of code for examining or changing configuration, and that was the case here as well. I killed the session after collecting all the data I would need, and then worked with a colleague to establish the RCA.
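The check on the suspended SPID can be sketched with a query like the one below. This is only an illustration, not the exact query used at the time: it reads the standard sys.dm_exec_requests DMV, and the session ID 57 is a hypothetical placeholder for whichever SPID is running RECONFIGURE.

```sql
-- Hypothetical session ID; replace with the SPID that is running RECONFIGURE
DECLARE @spid int = 57;

-- Show the current state and wait information for that session
SELECT r.session_id,
       r.status,               -- 'suspended' while the request is waiting
       r.command,
       r.wait_type,            -- HADR_AG_MUTEX in the case described here
       r.wait_time,            -- time spent in the current wait, in milliseconds
       r.last_wait_type,
       r.blocking_session_id
FROM sys.dm_exec_requests AS r
WHERE r.session_id = @spid;
```

Run it from a second connection, since the original session is stuck waiting.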
Here is what we found: this server was built on top of Windows Server 2008 R2, and it was hitting the bug reported in KB 2777201 and KB 2699013. We reproduced the issue and took a manual stack dump. Reviewing the dump showed that the thread was waiting in the function sqlmin!SOS_RecursiveMutex::Wait+0x78, which is why the RECONFIGURE statement hung on the HADR_AG_MUTEX wait type.
Our suggestion was to apply the Windows Server 2008 R2 fix and SP2 for SQL Server 2012. After the fix was applied, we ran the configuration change again and it completed successfully. That is how we resolved the RECONFIGURE statement hanging on the HADR_AG_MUTEX wait type.
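To confirm that the changes really took effect, one option is to compare the configured and running values in sys.configurations. This is a minimal sketch rather than a step from the original engagement; the names in the filter are the full option names as that catalog view lists them.

```sql
-- value        = the configured value set by sp_configure
-- value_in_use = the value currently in effect (matches value once RECONFIGURE completes)
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name IN (N'max degree of parallelism',
               N'max server memory (MB)',
               N'optimize for ad hoc workloads',
               N'remote admin connections');
```

If value and value_in_use differ for any row, the RECONFIGURE for that option has not completed.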
Let me know if you have seen something similar and how you resolved it.