Are there any virtual server tuning strategies to prevent crashes?

Are there configuration options that help stabilize server performance and allow it to degrade gracefully rather than crash outright?

Organizations pursue virtualization as an essential means of improving server hardware utilization and workload flexibility. But those benefits are lost when poor configuration choices and overlooked virtual server tuning opportunities conspire to compromise server performance. IT architects and system administrators must look for every means to streamline setups in order to maximize server utilization – while still optimizing the performance and reliability of each workload.

Virtual server tuning is a complex issue with many considerations and tradeoffs. System administrators generally configure servers to absorb modest or temporary spikes in utilization, but no single setup or configuration will prevent a server from crashing or becoming unavailable. For example, excess memory use leads to more page file swapping, and over-subscribing the available processors slows workload behavior. Both hurt performance, and both can cause subsequent requests to queue up and slow the system even further, a death spiral that eventually ends in a crash.

One way to help protect the server's stability is to limit the number of concurrent connections it is allowed to handle. Excess connections are queued or refused, but that is often preferable to a server crash. The trick is to set the connection limit high enough that the server is well utilized, but low enough that it is not overtaxed. It's a delicate judgment call that often takes some trial and error, and it is tied to the choice of multi-processing module (MPM), which determines how the server creates processes. As a starting point, multiply the maximum number of child processes by the amount of memory each process uses, and verify that the total does not exhaust available memory. Actual resource demands may be higher or lower depending on the workload the server is handling.
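That sizing arithmetic can be sketched in a few lines. The figures below are made-up placeholders; substitute the resident memory you actually observe per child process and the RAM genuinely available to the web server:

```python
# Rough sizing sketch (hypothetical numbers): cap the number of child
# processes so their combined memory cannot exhaust what is available.
available_mb = 4096   # RAM left for the web server after the OS and other services
per_child_mb = 25     # observed resident size of one child process (measure yours)
headroom = 0.8        # keep 20% spare for traffic spikes

max_children = int(available_mb * headroom // per_child_mb)
print(max_children)   # with these assumed numbers: 131
```

Whatever number this yields is only a ceiling for experimentation, not a recommended setting; verify it under realistic load before committing it to a production configuration.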

The configuration options that dictate process creation and termination can also generate significant overhead for some servers. For example, the MaxRequestsPerChild directive on an Apache HTTP Server workload defaults to zero, which allows child processes to run indefinitely. Setting it to a low number forces frequent process turnover, and each start and stop carries overhead. If you do put limits on this type of setting, start with a large number and adjust it downward gradually from there.
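As an illustration, a conservative Apache configuration for this directive might look like the following. Note that Apache 2.4 renamed MaxRequestsPerChild to MaxConnectionsPerChild; the value shown is purely illustrative, not a recommendation:

```apacheconf
# 0 (the default) lets each child process serve connections indefinitely.
# A nonzero value recycles children periodically, which can contain a
# slow memory leak at the cost of process start/stop overhead.
# Start high and work downward; a low value forces constant turnover.
MaxConnectionsPerChild 10000
```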

Also pay attention to any settings that remove idle server processes, especially on servers with erratic or variable traffic demands. Aggressive reclamation causes churn: idle processes are removed, only to be quickly recreated again. A simpler approach is to set the maximum number of processes and leave the idle ones alone rather than trying to recover their resources.
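On Apache's prefork MPM, these are the directives involved. The values below are illustrative assumptions, not tuned recommendations; the point is that a wide gap between the minimum and maximum spare-server settings reduces churn on bursty traffic:

```apacheconf
<IfModule mpm_prefork_module>
    StartServers          5     # children launched at startup
    MinSpareServers       5     # below this, spawn more idle children
    MaxSpareServers      20     # above this, idle children are killed
    MaxRequestWorkers   150     # hard cap on simultaneous children
</IfModule>
```

If MaxSpareServers is set too close to MinSpareServers, a brief lull followed by a burst of traffic repeatedly kills and respawns processes; widening that band, or simply relying on the MaxRequestWorkers cap, avoids the turnover.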

No single server configuration is right for every operating system, hypervisor, virtualized workload or management platform. System administrators should always refer to the documentation that accompanies critical applications for setup and virtual server tuning advice. Take the time to test any changes under realistic load conditions and document performance improvements before rolling any changes out into the production server environment.

