How I discovered that my clusters had issues: I have two SQL Server VMs, and on each of them one of the drives showed up without any statistics (no size, no free space, nothing). When I opened Computer Management, I saw that the drive was in the RAW state.

enter image description here

The machine also has some lingering checkpoints:

enter image description here

VM with multiple lingering checkpoints
  • Pta-K2Prd-DB1-V (Node 2)
      o The VM has multiple lingering checkpoints. It can't be migrated to any other cluster node until the lingering checkpoints are merged/cleared.

Because this VM is stuck on Node 2, I could not apply patches or firmware updates to that node. It is a critical VM, as it is the SQL Server of our K2 system. There is a second SQL VM for K2, Pta-K2Prd-DB2-V. If the two are properly configured for HA, we could perhaps ask the K2 system owners to let us work on Pta-K2Prd-DB1-V while DB2 stays online.

I finished investigating the SQL cluster (PTA-SQL-CL) at 16:30 on Sunday, so I didn't have time to investigate the Prod cluster (PTA-VM-CL).

The move to another cluster node failed: enter image description here

**Solution:** The fix was to restart the server (a restart, not a shutdown followed by a start), which brought the disk out of the RAW state. During the restart, I could see that a patch was being installed, which I found odd, because I had not installed any patches.

Then, when I decided to patch all the nodes, I found that I couldn't drain two of them. Eventually they did drain, and I could install the patches on them.

When I tried draining Node 2, there was one VM that I could not drain or migrate to another node. In that VM's settings, the hard drive looks strange:

enter image description here

Kindly advise.

Gift