"A virtualized environment has more managed entities -- all the same OSes (operating systems) as the original plus the VM hosts," said Andrew Hillier, co-founder and CTO of CiRBA, a data center intelligence software firm. This causes "an increase in the workload on administrators, not a decrease."
Changing your IT practices can help mitigate the increased management burden posed by VMs and blades, said Hillier in this interview. Failure to change can cause some big problems. SearchServerVirtualization.com caught up with Hillier before his participation in a panel discussion.
SearchServerVirtualization.com: Do IT managers need to rethink their data center management policies and practices when moving to virtualization?
Andrew Hillier: Virtualization can theoretically be done in such a way as to minimize the impact on existing data center management processes, but doing so often fails to take advantage of the true potential of the P2V (physical-to-virtual) paradigm shift.
For example, by treating VMs the same way you would physical systems, it is possible to provision and deploy them in a manner that largely resembles that of physical gear, once you get past the bare-metal steps. But this sells the technology short; it can be taken so much further. Many progressive environments are looking at the technology to completely revamp the dev/test cycle, code promotion, performance test and disaster recovery processes, making them more nimble and automated. By embracing the paradigm and using new-school thinking around data center management, there are much greater benefits that can be had.
What IT management systems are affected when you virtualize?
Hillier: Incident and problem management: common mode failures can cause a single fault -- e.g., a hardware problem -- to impact multiple virtual machines. If the event console in use is incapable of root cause analysis then this may even be reported as multiple faults. On the resolution side, maintenance windows become correlated as well, and if VMs are combined in certain ways it may be difficult to find a window to fix the problem.
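As an illustration of the common-mode failure point, the sketch below groups VM alerts by the host they run on, so that several simultaneous VM faults on one host are reported as a single suspected host fault. It is a minimal sketch, not how any particular event console works: the function name correlate_faults, the vm_to_host mapping and the threshold of two alerts are all assumptions made for the example.

```python
from collections import defaultdict


def correlate_faults(alerts, vm_to_host, threshold=2):
    """Collapse VM alerts that share a host into one suspected host incident.

    alerts      -- list of VM names currently raising a fault (hypothetical input)
    vm_to_host  -- dict mapping each VM name to the host it runs on
    threshold   -- assumed number of alerting VMs that makes the host the suspect
    """
    by_host = defaultdict(list)
    for vm in alerts:
        by_host[vm_to_host[vm]].append(vm)

    incidents = []
    for host, vms in by_host.items():
        if len(vms) >= threshold:
            # Several VMs on the same host failed together: report once.
            incidents.append(("host", host, sorted(vms)))
        else:
            # Isolated alerts stay as individual VM incidents.
            for vm in vms:
                incidents.append(("vm", vm, [vm]))
    return incidents


if __name__ == "__main__":
    placement = {"vm1": "hostA", "vm2": "hostA", "vm3": "hostA", "vm9": "hostB"}
    for incident in correlate_faults(["vm1", "vm2", "vm3", "vm9"], placement):
        print(incident)
```

Without a grouping step of this kind, the four alerts in the example would appear as four unrelated faults.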
In capacity planning, tools that are incapable of attributing utilization to the VMs and/or normalizing the utilization to a CPU-level measure become very difficult to use in virtual environments.
When measuring the percentage (of) usage of a VM, the first question that comes up is 'a percentage of what?' If the VM is only getting the resources it needs from the host system, then these numbers become irrelevant.
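The "percentage of what?" problem can be made concrete with a small calculation. The sketch below converts a VM's percent-busy figure into an absolute CPU measure (MHz consumed) so that VMs of different sizes can be compared. It assumes the percentage is reported relative to the VM's allocated virtual CPUs and that each virtual CPU corresponds to one host core at the stated clock speed; real hypervisors schedule differently, and the function and field names are hypothetical.

```python
def normalized_cpu_mhz(util_pct, vcpus, core_mhz):
    """Convert a VM-relative utilization percentage into MHz consumed.

    Assumes util_pct is measured against the VM's own allocation of
    `vcpus` virtual CPUs, each backed by a core running at `core_mhz`.
    """
    return util_pct * vcpus * core_mhz / 100


if __name__ == "__main__":
    vms = [
        {"name": "vm-a", "util_pct": 80, "vcpus": 1, "core_mhz": 2000},
        {"name": "vm-b", "util_pct": 40, "vcpus": 4, "core_mhz": 3000},
    ]
    for vm in vms:
        mhz = normalized_cpu_mhz(vm["util_pct"], vm["vcpus"], vm["core_mhz"])
        print(vm["name"], vm["util_pct"], "% busy ->", mhz, "MHz")
```

In this example the VM that looks half as busy in percentage terms (vm-b) is consuming three times as much CPU in absolute terms, which is why raw percentages are hard to use for capacity planning.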
In network provisioning, many virtualization technologies allow systems to be created and added to the network with no network provisioning steps. By sharing an existing NIC, [the VM] can come in under the radar, which can sometimes help make app deployment more responsive but more often simply serves to create a dangerous situation where there is little accountability for what is going on on the network.
What changes does virtualization bring to IT budgeting or financial issues?
Hillier: Financial management and chargeback become more difficult. Then there's the question of how to charge idle time; if the system is idle, then most chargeback models will incur no charges, but someone has to pay for the system.
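One possible way to handle the idle-time question is sketched below: the portion of the host's cost that corresponds to measured usage is billed by usage, and the remaining idle portion is split by each tenant's reserved share. This is only an illustrative model, not one described in the interview; the function name chargeback, the notion of "reserved" units and all figures are assumptions.

```python
def chargeback(host_cost, capacity, usage, reserved):
    """Split host_cost among tenants so that idle capacity is still paid for.

    host_cost -- total cost of the host for the billing period
    capacity  -- total capacity units of the host (e.g. CPU-hours)
    usage     -- dict of tenant -> units actually consumed
    reserved  -- dict of tenant -> units reserved (assumed basis for idle cost)
    """
    used = sum(usage.values())
    if used > capacity:
        raise ValueError("usage exceeds host capacity")
    total_reserved = sum(reserved.values())
    if total_reserved <= 0:
        raise ValueError("at least one tenant must hold a reservation")

    idle_cost = host_cost - host_cost * used / capacity
    bills = {}
    for tenant, share in reserved.items():
        usage_part = host_cost * usage.get(tenant, 0) / capacity
        idle_part = idle_cost * share / total_reserved
        bills[tenant] = usage_part + idle_part
    return bills


if __name__ == "__main__":
    print(chargeback(1000, 100, {"a": 30, "b": 10}, {"a": 50, "b": 50}))
    # A completely idle host is still fully charged, split by reservation.
    print(chargeback(1000, 100, {}, {"a": 50, "b": 50}))
```

Under a purely usage-based model the second call would bill nothing, which is the gap Hillier describes.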
Does virtualization make it absolutely necessary for IT staffs to have expertise in administering multiple operating systems?
Hillier: Virtualization can actually create problems since administrators start wearing too many hats. This can create complex situations where the administrators of systems also manage the VMs, and because the VM disk images are readable to the administrator, the lines of separation break down. Access control boundaries and compliance rules are easier to maintain if administrators deal with their boxes at the OS level and separate administrators manage the VMs.
Are you seeing virtualization adopters having trouble with documenting and finding their VMs?
Hillier: Because VMs have a low barrier to creation, there is a real danger of VM proliferation. Furthermore, loose processes in this area can mean that it is impossible to tell who created a VM, why it exists, and if it is being used. Stronger processes and naming conventions are helpful in combating this, and tools that can discover and audit VMs to account for their utilization, software inventory, etc. are indispensable when managing virtual environments.
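A simple audit along these lines might look like the sketch below, which flags VMs that break a naming convention, have no recorded owner or purpose, or appear unused. The naming pattern (team-environment-number), the inventory fields and the 5 percent idle threshold are all hypothetical choices for the example, not recommendations from the interview.

```python
import re

# Assumed convention for illustration: <team>-<dev|test|prod>-<two digits>
NAME_PATTERN = re.compile(r"^[a-z]+-(dev|test|prod)-\d{2}$")


def audit_vms(inventory, idle_threshold_pct=5.0):
    """Return a dict of VM name -> list of audit findings."""
    findings = {}
    for vm in inventory:
        issues = []
        if not NAME_PATTERN.match(vm["name"]):
            issues.append("name does not follow convention")
        if not vm.get("owner"):
            issues.append("no owner recorded")
        if not vm.get("purpose"):
            issues.append("no purpose recorded")
        if vm.get("avg_cpu_pct", 0.0) < idle_threshold_pct:
            issues.append("appears idle")
        if issues:
            findings[vm["name"]] = issues
    return findings


if __name__ == "__main__":
    inventory = [
        {"name": "billing-prod-01", "owner": "finance", "purpose": "invoicing", "avg_cpu_pct": 35.0},
        {"name": "test2", "owner": "", "purpose": "", "avg_cpu_pct": 0.4},
    ]
    for name, issues in audit_vms(inventory).items():
        print(name, "->", "; ".join(issues))
```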
Early blade adopters have told me about a lot of problems they've had with blades, particularly with ease of management. Why?
Hillier: Blades introduce an even higher level of sharing than pure virtualization, and with this comes more complexity in management. When sharing rack-level NICs, the analysis of workload patterns becomes even more important.
Correlated hardware maintenance windows are also a consideration. It is imperative that there be a time window when all the applications on the rack can be brought down to perform hardware maintenance.
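The maintenance-window constraint amounts to finding a time slot that every application on the shared hardware can tolerate. The sketch below intersects per-application windows, expressed as sets of hours, to find the slots available for hardware maintenance. The application names and hours are invented for the example, and representing windows as whole hours is a simplifying assumption.

```python
def common_window(windows):
    """Return the set of hours at which every application may be taken down.

    windows -- dict of application name -> iterable of permitted hours
    """
    sets = [set(hours) for hours in windows.values()]
    if not sets:
        return set()
    return set.intersection(*sets)


if __name__ == "__main__":
    rack = {
        "web": range(0, 6),    # may go down between 00:00 and 05:59
        "batch": range(4, 8),  # may go down between 04:00 and 07:59
        "db": range(2, 5),     # may go down between 02:00 and 04:59
    }
    print(sorted(common_window(rack)))

    # Adding one more application can eliminate the shared window entirely.
    rack["reporting"] = range(10, 12)
    print(sorted(common_window(rack)))
```

An empty result means there is no time at which the hardware can be serviced without violating at least one application's window, which is worth knowing before the applications are combined.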
Have management tools evolved to support blade servers?
Hillier: A big shift for the positive has been the introduction of more advanced analytical tools that allow the planning for and management of blade racks to take into account the commonalities and sharing paradigm, reducing the risk of using these technologies.
Many IT managers tell me that virtualizing on rack servers works just fine, and they have no desire to switch to blades. What are they gaining and missing in this approach?
Hillier: Virtualizing onto rack-mount servers decreases the level of sharing and thus simplifies management in some ways. NICs are shared by fewer images, hardware failures affect fewer systems, and the hardware spec can be tailored more specifically to each set of shared applications. The downside is the footprint and cost of these servers relative to blades, but that may not be a significant issue in all cases.
Have you worked on any virtualization projects that involved blades?
Hillier: We have been involved in analysis of (moving physical) servers onto blades, and have witnessed some very interesting situations.
In one case, the built-in rules in the CiRBA product detected token ring cards in a number of the targeted servers. This was an absolute show stopper, as that communication technology was not supported by the destination hardware spec, and even if it was, it was not supported by the virtualization technology.
In another case there were daughterboards related to specialized hardware (interactive voice response systems) that could not be virtualized. More commonly, we see cases where our NIC anti-affinity rule detects that a certain virtualization plan will combine systems that touch too many networks and would require the purchase of additional NICs in order to support. This [NIC anti-affinity] rule is also helpful in non-blade cases, as that situation can arise from running improperly-matched applications on any type of server.
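The idea behind a NIC anti-affinity check can be sketched as follows: for each target host in a consolidation plan, collect the distinct networks touched by the systems placed there and flag the host if that count exceeds its available NICs. This is not CiRBA's actual rule, only an illustration; it assumes one NIC per network (no VLAN trunking), and the function name and data layout are hypothetical.

```python
def nic_violations(plan, networks, nic_count):
    """Return hosts whose combined systems touch more networks than they have NICs.

    plan      -- dict of target host -> list of systems to be placed on it
    networks  -- dict of system -> list of networks that system connects to
    nic_count -- dict of target host -> number of NICs available
    Assumes each distinct network needs its own NIC.
    """
    violations = {}
    for host, systems in plan.items():
        needed = set()
        for system in systems:
            needed |= set(networks[system])
        if len(needed) > nic_count[host]:
            violations[host] = sorted(needed)
    return violations


if __name__ == "__main__":
    plan = {"blade1": ["app1", "app2", "app3"], "blade2": ["app4", "app5"]}
    networks = {
        "app1": ["prod", "backup"],
        "app2": ["dmz"],
        "app3": ["mgmt", "prod"],
        "app4": ["prod"],
        "app5": ["prod", "backup"],
    }
    nics = {"blade1": 2, "blade2": 2}
    print(nic_violations(plan, networks, nics))
```

In the example, blade1 would need four networks on two NICs and is flagged, while blade2 fits within its two NICs.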