Took Down Virtual Environment for Storage Maintenance


An alarm on my storage array started screaming today because it had lost power to one of its redundant power supplies and was also showing a slightly high temperature reading on CPU #1. I found that the unpowered supply was plugged into one of the UPSes I use for my workstations rather than my servers, so I moved it to the proper UPS and that cleared the power alarm.

That power supply had not been connected to the battery-protected outlets of that UPS. I will look into why the UPS switched modes on me later; it only serves workstations and nothing else was plugged into the unprotected side, so it can wait.

I decided to take the storage arrays down and blow the dust out of them, since they have had a very long uptime of several months. One of the fiber cables hung up and I had to use a small screwdriver to disconnect it. Putting the arrays back in, I had to realign the rails of my rack so they would slide in without knocking the cover of my main storage chassis back and causing an intrusion alert.

I found it really useful to log in to IPMI on my Supermicro motherboard and view the sensor data before and after cleaning the chassis. Looking at it now, there is still a wide disparity between my CPU #1 and CPU #2 temperatures, so I will plan a maintenance window to make sure the heat sinks are seated properly and to reapply heat sink compound. The fans look fine, and I blew out the fins of the heat sinks, so airflow should not be the problem.
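For anyone who wants to script that before-and-after comparison instead of eyeballing the IPMI web page, here is a minimal sketch in Python. It assumes the `ipmitool` CLI is installed and that its `sensor` subcommand prints the usual pipe-delimited table (name, reading, unit, status, ...). The function names and the sensor names in the usage note are my own placeholders, not anything from my actual board, so adjust them to whatever your BMC reports.

```python
import subprocess


def parse_temperatures(text):
    """Parse pipe-delimited sensor output into {sensor name: degrees C}.

    Assumes rows shaped like: "CPU1 Temp | 62.000 | degrees C | ok | ...".
    Rows that are not temperatures, or have no numeric reading, are skipped.
    """
    readings = {}
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 3 or fields[2] != "degrees C":
            continue
        try:
            readings[fields[0]] = float(fields[1])
        except ValueError:
            continue  # e.g. "na" when the sensor has no reading
    return readings


def temperature_change(before, after):
    """Return after-minus-before for every sensor present in both snapshots."""
    return {name: after[name] - before[name] for name in before if name in after}


def read_temperatures():
    """Take a snapshot from the local BMC (assumes ipmitool is on PATH)."""
    result = subprocess.run(
        ["ipmitool", "sensor"], capture_output=True, text=True, check=True
    )
    return parse_temperatures(result.stdout)
```

The idea would be to call `read_temperatures()` once before shutting down and once after the system has warmed back up under load, then pass both snapshots to `temperature_change()`. A large gap between the two CPU sensors that survives the cleaning points at seating or compound rather than dust.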

Once the storage array came back up, I checked that the connections were on the proper networks and that XCP-ng could see the paths to my storage, then brought my VMs back up, as you can see since this site is online again. My music stream stayed up throughout because the streaming server and the relay server are two separate physical servers. That's all for tonight!

The internals of my main storage array system.
