
IT resource management and virtual server housekeeping tips

Without regular monitoring, sprawl and overallocation can waste expensive compute and storage resources

While improved consolidation is one of the most important advantages of server virtualization, that doesn't mean virtualized environments are always as resource efficient as they should be. Even with proper VM sizing techniques and IT resource management strategies, some level of resource waste is inevitable. For example, VMs that have been migrated from physical servers can bring unneeded overhead, and VMs that reach the end of their lifecycle can go unnoticed. Periodic housekeeping is an important part of an administrator's job to ensure compute and storage resources are not going to waste.

Storage cleanup

Servers tend to collect a lot of stale data, which is normally the administrator's fault. Often, admins copy service packs and installation files to the local server and never remove them. These files usually sit in dedicated folders on the C drive or, worse yet, on the desktop in the admin's local server profile. Remember that deleting the data simply moves it to the recycle bin, so be sure to empty the recycle bin for each profile.

While a few gigabytes per server doesn't seem like a lot, the total amount of wasted space across multiple servers can be staggering. Besides affecting your VM sizing, this storage dead weight can affect backup performance and capacity.
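As a rough illustration of the cleanup described above, the following Python sketch scans a folder for installer-like files that have not been modified in a while and totals their size. The extension list, the 90-day cutoff and the function names are assumptions made for this example, not a standard. Point it at staging folders or admin profile desktops rather than an entire system drive, because program directories legitimately contain .exe files.

```python
import os
import time
from pathlib import Path

# Hypothetical list for illustration: extensions that often indicate
# leftover service packs or installation media.
INSTALLER_EXTENSIONS = {".msi", ".msu", ".exe", ".iso", ".zip", ".cab"}


def find_stale_files(root, min_age_days=90, now=None):
    """Return (path, size_bytes) for installer-like files older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.suffix.lower() not in INSTALLER_EXTENSIONS:
                continue
            try:
                info = path.stat()
            except OSError:
                continue  # file is unreadable or vanished mid-scan
            if info.st_mtime < cutoff:
                stale.append((str(path), info.st_size))
    return stale


def total_gib(files):
    """Sum the sizes of (path, size_bytes) pairs and convert to GiB."""
    return sum(size for _path, size in files) / 2**30
```

Running the scan on each server and adding up the results of total_gib gives a quick estimate of how much dead weight is inflating VM sizes and backups. The sketch only reports candidates; review the list before deleting anything.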

CPU and memory cleanup

VMs built from templates are typically resource efficient, as long as the guest tools are installed and updated. Machines that arrive through a physical-to-virtual migration are a different story. As old machines are imported, they need to be cleaned up before being returned to production. You'll want to uninstall vendor-specific hardware drivers and applications, because any service or driver looking for hardware that is no longer present has the potential to consume CPU or memory resources. While you may find and uninstall vendor-specific items in the programs section, don't forget to look at the Windows services and identify any hardware-specific services. Key giveaways of services that should be removed include services that fail to start or ones that place notifications in the event logs.

Stay vigilant and use monitoring tools

The IT resource management tools that come with enterprise virtualization environments typically provide monitoring capabilities for metrics that more traditional server monitoring tools can't see. One of the key things here is the ability to see what's in use versus what's allocated.
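The comparison of in-use versus allocated resources can be sketched in a few lines of Python. The VmSample record, its field names and the 50% threshold are hypothetical; in practice the numbers would come from whatever export your virtualization platform's monitoring tool provides.

```python
from dataclasses import dataclass


@dataclass
class VmSample:
    """Hypothetical record: one VM's allocated and average in-use memory."""
    name: str
    allocated_gb: float
    used_gb: float


def utilization(vm):
    """Fraction of the allocation that is actually in use."""
    return vm.used_gb / vm.allocated_gb if vm.allocated_gb else 0.0


def overallocated(vms, threshold=0.5):
    """Return VMs using less than `threshold` of their allocation, worst first."""
    return sorted(
        (vm for vm in vms if utilization(vm) < threshold),
        key=utilization,
    )
```

The same calculation applies to vCPU or disk; only the units change.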

As you move more applications into the virtual space, are you continuing to use these traditional non-VM monitoring tools? Some environments may, in fact, monitor the same server with multiple tools because each group (e.g., networking, operations, server admins and so on) has unique needs. In reality, the tools are monitoring the same items via different means. With so many tools trying to do the same thing, it's even possible that an organization could inadvertently create a denial-of-service situation on its own servers. Keep in mind: Monitoring is a good thing; over-monitoring is not.

Addressing overallocation

Memory and CPU resource caps, combined with disk thin provisioning, have been used with good intentions, but the results are not always helpful. In fact, in many environments, overprovisioning masks the true problems.

This ability to over-allocate deceives application and server owners into thinking they have a lot more resources than they really do. While some would say this is harmless, what happens when they ask for another server? Application owners may continue to "go big" rather than ask for what is truly needed.

If you continue to support this type of overprovisioning, you could end up with an environment that is so overallocated you wouldn't know when you need additional resources. While offering standard-sized VM categories can help to trim some of these excessive requests, virtual admins and application owners need to cooperate to put a stop to these "tiny" lies.

If you find VMs that are overallocated -- and you will -- you will need to adjust them back to settings that fall within the categories you're creating. Keep in mind that removing resources from an application owner's VMs qualifies for hazard pay in most IT environments.

All of the charts and graphs you prepare won't necessarily help to persuade application owners and managers to let you remove resources, even if they could be added back if needed. This struggle requires manager-level support.

To bolster your argument, plot the distribution of your VMs (the bell curve) by what is being used versus what is allocated. A good monitoring tool will be able to show you this data. With that information, you will likely be able to show that a collection of servers is classified in the largest category even though actual usage demonstrates that those resources are unnecessary.
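One way to build that comparison is to bucket each VM twice: once by what it was allocated and once by what its peak usage would actually require. The sketch below does this for memory. The three category sizes and the 25% headroom factor are assumptions chosen for illustration, not recommended values.

```python
from collections import Counter

# Hypothetical standard-sized categories, expressed as memory in GB.
CATEGORIES = [("small", 4), ("medium", 8), ("large", 16)]


def smallest_fit(gb, headroom=1.25):
    """Return the smallest category that covers `gb` plus headroom."""
    need = gb * headroom
    for name, capacity in CATEGORIES:
        if need <= capacity:
            return name
    return "custom"  # larger than any standard category


def category_distribution(vms):
    """vms: iterable of (allocated_gb, peak_used_gb) pairs.

    Returns two Counters: VMs per category as allocated, and VMs per
    category as their peak usage would justify.
    """
    vms = list(vms)
    allocated = Counter(smallest_fit(a, headroom=1.0) for a, _ in vms)
    needed = Counter(smallest_fit(u) for _, u in vms)
    return allocated, needed
```

Placing the two distributions side by side shows how many VMs sit in a larger category than their usage supports, which is usually easier for a manager to absorb than per-VM charts.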

As you begin to adjust your VMs, base every change on an established baseline gathered over a defined monitoring window. Monitoring a single moment in time does not provide enough data points to make accurate adjustments. Gather five to six weeks of performance data on a VM before changing anything. This allows you to cover both the start and end of a month, times when spikes in resource demand would be expected.

As you evaluate the baselines for each VM, be sure to exclude any backup or antivirus performance spikes. Those should be looked at as the exception rather than the norm. Baselines are the foundation for possible changes to a production VM, so more data can only help.
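The baseline rules above (a window of at least five weeks, with backup and antivirus spikes excluded) can be expressed as a small Python sketch. The sample format, the daily exclusion window of 01:00 to 03:00 and the choice of a 95th-percentile baseline are all assumptions for this example; substitute your own scan schedule and statistic.

```python
import math
from datetime import timedelta

MIN_WINDOW = timedelta(weeks=5)  # shortest span of data worth acting on


def in_excluded_window(ts, windows):
    """windows: (start_hour, end_hour) daily ranges, end hour exclusive."""
    return any(start <= ts.hour < end for start, end in windows)


def baseline(samples, excluded_hours=((1, 3),), percentile=95):
    """samples: list of (datetime, value) pairs for one metric on one VM.

    Returns the nearest-rank percentile of the samples outside the excluded
    windows, or None if the data covers less than MIN_WINDOW.
    """
    if not samples:
        return None
    times = [ts for ts, _ in samples]
    if max(times) - min(times) < MIN_WINDOW:
        return None  # not enough history to trust
    values = sorted(
        value for ts, value in samples
        if not in_excluded_window(ts, excluded_hours)
    )
    if not values:
        return None
    rank = max(math.ceil(percentile / 100 * len(values)), 1)
    return values[rank - 1]
```

Returning None for short windows is deliberate: it forces the caller to wait for more data rather than resize a VM based on a few days of samples.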

Getting it done right

The reduction of resources is likely to be somewhat personal to the application owner, so you may need to take additional steps to ensure a smooth transition. Making a change is by far the most challenging aspect of sizing an environment.

The reduction of CPU, drive size or memory should be the only adjustment made during the downtime. While it may be tempting to apply patches or update the application itself, it's best to do those tasks in a separate outage window. Otherwise, by combining multiple adjustments, you're inviting users to blame any resulting performance problems on the reduction in resources. A software patch or upgrade could well be what hurts performance, but people will want to point to the resource adjustments. This perception will be difficult to overcome.

As you progress through the journey of right-sizing your virtual environment through the use of baselines, standard-sized VM categories and cleanup, another key task awaits you: the sprawled virtual machine. Cleaning up is great, but keep a sharp eye out for abandoned or excessive VMs. Right-sizing isn't simply about what is inside the VM; it can also be about whether the VM should exist at all.

Excessive and abandoned VMs are inevitable in a growing, changing organization, so this is a problem that will recur. It needs to be addressed continuously.
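Because the check needs to be repeated, it helps to automate the first pass. The sketch below flags VMs as abandonment candidates when they have been idle for a long time or have no recorded owner. The VmRecord fields and the 60-day idle threshold are hypothetical, and how "last activity" is measured (last login, last day with meaningful CPU or network use) depends on the data your tools expose.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class VmRecord:
    """Hypothetical inventory record for one VM."""
    name: str
    powered_on: bool
    last_activity: date
    owner: Optional[str]


def abandonment_candidates(vms, today, idle_days=60):
    """Return (name, reasons) for VMs that look abandoned.

    A VM is a candidate if it has been idle for `idle_days` or more, or if
    nobody is recorded as its owner. Power state is reported as context only.
    """
    candidates = []
    for vm in vms:
        idle = (today - vm.last_activity).days
        reasons = []
        if idle >= idle_days:
            reasons.append(f"idle {idle} days")
        if vm.owner is None:
            reasons.append("no owner")
        if reasons:
            if not vm.powered_on:
                reasons.append("powered off")
            candidates.append((vm.name, reasons))
    return candidates
```

The output is a list to review with application owners, not a deletion list; a VM that is idle for two months may still be a quarterly reporting server.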

Virtual environments have given us unlimited choices. Now it's time to pull in the reins a little and optimize what we have. Doing this means we're preparing the way for a cost-effective, agile and successful IT operation.
