At times, virtualization management can feel like a nightmare.
We’ve all heard the horror stories that involve inexplicable downtime, patching problems and disgruntled end users. These virtualization management tales may not be as gruesome as cheesy horror flicks. (Leprechaun in the Hood, anyone?) But they are ghastly, nonetheless.
With Halloween in mind, we asked members of our Server Virtualization Advisory Board the following question:
What’s the scariest virtualization-related story that you’ve heard or experienced?
My scariest story is one that I’ve experienced first-hand and it centers around VMware’s new vSphere Storage Appliance.
The vSphere Storage Appliance (VSA) is intended for very small environments that want highly available virtual machines, but don’t have a storage area network (SAN). Limited to just a two- or three-server configuration, the VSA is intended as a green-field installation for inexperienced virtualization administrators.
While this small-environment solution might appear useful at first blush, be afraid. Be very afraid.
The VSA sports an array of jaw-dropping limitations, including no memory overcommit, no ability to add storage and a 75% storage-redundancy overhead. You are forced to wonder what VMware was thinking during this product’s development. In fact, this product’s limitations are so profoundly bad and its cost so stunningly high that most small shops would be better off just buying a SAN.
My advice: If you see the VSA ambling up your sidewalk on Halloween night, do yourself a favor and shut off the porch light.
There are a lot of scary things that can happen in your virtual infrastructure, and they almost always translate into money.
How much money, though, is a relative thing. If your database server that manages all your sales transactions is not available because the virtual infrastructure is down, then you are losing money. If your source code or continuous integration server is down and 1,000 developers cannot check their code, you’re losing money.
A good portion of these outages are caused by human error. I remember when a storage admin mistakenly deleted a data store holding more than 40 virtual machines (VMs), instead of just the snapshots. And just like that, the VMs were gone. Poof.
In this specific case, the restore was 100% successful. But it took a frantic restore from backup, careful prioritization of which VMs to bring back first and countless hours of waiting for the data to return.
So, your VMs will not go boo in the night, but they can give you a lot of grey hairs.
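Prioritizing restores is easier when the order is decided before disaster strikes. The following is only a minimal sketch of that idea, not the procedure used in the story: the `VM` record, its `priority` scale (1 is most critical) and the sample machines are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    priority: int  # hypothetical scale: 1 = most business-critical
    size_gb: int   # amount of data to pull back from backup


def restore_order(vms):
    """Most critical VMs first; among equals, smaller ones first
    so that more services come back sooner."""
    return sorted(vms, key=lambda vm: (vm.priority, vm.size_gb))


if __name__ == "__main__":
    # Hypothetical inventory of VMs lost with the data store.
    lost = [
        VM("build-server", 2, 400),
        VM("sales-db", 1, 250),
        VM("test-box", 3, 50),
        VM("mail", 1, 120),
    ]
    for vm in restore_order(lost):
        print(vm.name)
```

Sorting on a tuple keeps the rule explicit: business priority decides first, and size only breaks ties.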
How would you react if some of your VMware ESXi hosts were suddenly running Windows 7, instead? If you’re not careful with automation technologies, such as PXE-based boot for the installation of operating systems, it could happen.
Here’s a frightful virtualization management story that I was told: After remotely patching and rebooting a couple of ESXi hosts, the virtualization admins began to realize that something was wrong. Server boot times are always way too long, but this time, it seemed to take forever.
The Lights-Out-Management tool revealed that the servers finished booting, but they were running Windows 7! It turns out that the newly set up Windows Deployment Services was configured to automatically install Windows 7 on everything that would boot via the network. Of course, the intention was to reinstall Windows 7 on the client computers after a reboot. But the servers were also configured to boot via PXE, resulting in the unintentional installation of Windows 7 on the ESXi hosts.
Luckily, only a portion of the ESXi hosts were affected. And rebuilding an ESXi host is pretty straightforward, so the systems were quickly restored.
This goes to show that before enabling any sort of automation tool -- and, perhaps, any sort of automatic deployment -- you should make sure that it targets only the intended recipients. With the new vSphere 5 Auto Deploy feature, it’s particularly important to double-check that you don’t have conflicting services in your data center, or you may quickly end up in the same predicament!
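One way to picture "targets only the intended recipients" is an explicit allowlist: nothing gets installed unless the machine is on a list of known clients. The sketch below illustrates that check in isolation and is not how Windows Deployment Services or Auto Deploy is actually configured; the MAC addresses and the function names are hypothetical.

```python
# Hypothetical inventory: MAC addresses of client PCs that may be reimaged.
ALLOWED_CLIENT_MACS = {
    "00:1a:2b:3c:4d:5e",
    "00:1a:2b:3c:4d:5f",
}


def normalize_mac(mac: str) -> str:
    """Lowercase a MAC address and use colons as separators."""
    return mac.strip().lower().replace("-", ":")


def may_auto_install(mac: str, allowed=ALLOWED_CLIENT_MACS) -> bool:
    """Return True only for machines explicitly listed as deployment targets."""
    return normalize_mac(mac) in allowed


if __name__ == "__main__":
    # The second address stands in for a server that also boots via PXE.
    for mac in ["00-1A-2B-3C-4D-5E", "00:50:56:aa:bb:cc"]:
        action = "install" if may_auto_install(mac) else "ignore"
        print(mac, "->", action)
```

The design choice is deny by default: a host that nobody listed, such as a freshly rebooted ESXi server, is ignored rather than reinstalled.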
Greenpages Technology Solutions
Through the years, I’ve heard many stories of IT personnel causing data center disasters, and I’ve witnessed such disasters firsthand. Some of these mistakes were the result of data center operators who were not properly trained, or of an environment that wasn’t properly designed, implemented, managed or optimized. As more complex and mission-critical applications are virtualized, it’s more important than ever to plan properly to avoid these preventable errors.
Prior to virtualizing complex and revenue-producing applications, make sure that you properly assess, test and plan. If a company does not have extensive experience in advanced virtualization deployments (not just a test and development environment, for instance), it should always find a competent, experienced and trustworthy consultant. A consultant will also know how to introduce complementary technologies that will optimize and increase the robustness of an infrastructure.
An incompetent deployment may also have other ramifications for an infrastructure’s storage, backup systems, security and networking.
The bottom line: If you want to avoid a virtualization horror story, make sure that you either have the talent and experience on staff or find a partner that does.
This was first published in October 2011