What are the risks for workloads on virtual servers? Is reliability more important for virtual servers than for traditional physical server platforms?
Virtualization has vastly improved server utilization, allowing more workloads to run on fewer physical platforms. Although this has been a significant benefit to businesses, it has also created vulnerabilities that IT professionals must consider and address within the data center. Running more workloads on fewer hardware platforms carries additional risk for the enterprise because each hardware failure now affects more workloads.
The principal benefit of server virtualization is improved resource usage; each physical server can run multiple virtual machines. This is fine under ideal circumstances, but the risk of a server fault or failure remains. Prior to virtualization, a server typically hosted a single application, meaning a server failure only affected that particular workload. When a virtualized server hosts five, 10, 15 or more virtual machines, a server fault can affect multiple workloads.
Workload recovery can take more time than administrators expect in a virtual environment. Each virtual machine starts working as soon as it is reloaded into memory, and from that point it demands a portion of the server's computing and networking resources. This leaves less server capacity and network bandwidth for restoring the VMs that follow. A server with many VMs may experience significant downtime before all of the VMs are successfully restored and relaunched.
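The slowdown described above can be illustrated with a rough model. The sketch below is only a back-of-the-envelope estimate: the function name, the VM sizes, the link speed and the share of bandwidth each running VM consumes are all hypothetical figures chosen for illustration, not measurements from any real environment.

```python
def total_restore_minutes(vm_sizes_gb, link_gb_per_min, share_per_running_vm):
    """Estimate the time to restore VMs one after another.

    Assumption (hypothetical model): every VM that is already running
    takes a fixed share of the restore bandwidth, so each later VM
    restores more slowly than the one before it.
    """
    elapsed = 0.0
    for running, size_gb in enumerate(vm_sizes_gb):
        # Bandwidth left for the restore; never let it fall below 10%.
        fraction_left = max(0.1, 1.0 - share_per_running_vm * running)
        elapsed += size_gb / (link_gb_per_min * fraction_left)
    return elapsed


# Hypothetical server: ten 40 GB VMs restored over an 8 GB/min link.
vm_sizes = [40] * 10
baseline = total_restore_minutes(vm_sizes, 8.0, 0.0)    # no contention
contended = total_restore_minutes(vm_sizes, 8.0, 0.05)  # 5% per running VM
print(f"no contention: {baseline:.0f} min, with contention: {contended:.0f} min")
```

Under these assumed figures, the contended restore takes roughly a third longer than the naive estimate, which is the kind of gap that can surprise administrators who plan around raw transfer speed alone.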
With the widespread use of virtualization, each physical server is now far more important to the enterprise because each is likely to be running several important applications. IT professionals must plan for server problems and contingencies. One strategy is to consider the workload distribution and stagger critical workloads across multiple physical servers. This prevents a single server fault from disrupting most (if not all) of the organization's critical applications.
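As a minimal sketch of the staggering idea, the snippet below deals critical workloads out across hosts in rotation before placing anything else. The function name, workload names and host names are hypothetical, and a real environment would more likely rely on the anti-affinity features of its virtualization platform than on a hand-written script.

```python
from collections import defaultdict


def stagger_critical(workloads, hosts):
    """Spread critical workloads across hosts, then place the rest.

    workloads: list of (name, is_critical) tuples (hypothetical input format)
    hosts: list of host names
    """
    placement = defaultdict(list)
    critical = [name for name, is_critical in workloads if is_critical]
    others = [name for name, is_critical in workloads if not is_critical]

    # Critical workloads go round-robin so no single host holds them all.
    for i, name in enumerate(critical):
        placement[hosts[i % len(hosts)]].append(name)

    # Remaining workloads go to whichever host currently holds the fewest.
    for name in others:
        target = min(hosts, key=lambda h: len(placement[h]))
        placement[target].append(name)

    return dict(placement)


workloads = [("erp", True), ("mail", True), ("db", True),
             ("wiki", False), ("test", False)]
hosts = ["host-a", "host-b", "host-c"]
plan = stagger_critical(workloads, hosts)
for host in hosts:
    print(host, plan[host])
```

With three critical workloads and three hosts, each host ends up with exactly one critical workload, so a single server fault takes down only one of them.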
Another important strategy involves failover and workload balancing. Rather than maximizing server use, administrators intentionally leave a portion of unused resources on each server so that VMs disrupted on one server can quickly be migrated to (or restarted on) another server. This allows the workloads of a troubled server to be moved to other servers while the afflicted system is serviced.
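One way to reason about how much headroom to leave is to check, for each server, whether its VMs would fit into the free memory on the remaining servers. The sketch below assumes memory is the only constraint and uses a simple greedy placement; the host names, capacities and VM sizes are hypothetical. Because it is a heuristic, a "no" answer means this particular placement order failed, not that no arrangement exists.

```python
def can_absorb_failure(hosts, failed):
    """Check whether the surviving hosts can take over a failed host's VMs.

    hosts: {host: {"capacity_gb": int, "vms": {vm_name: mem_gb}}}
    Assumption: memory is the only resource considered (simplified model).
    """
    # Free memory on every host except the failed one.
    free = {
        name: h["capacity_gb"] - sum(h["vms"].values())
        for name, h in hosts.items()
        if name != failed
    }
    # Place the largest VMs first, each on the host with the most free memory.
    for vm, mem_gb in sorted(hosts[failed]["vms"].items(), key=lambda kv: -kv[1]):
        target = max(free, key=free.get, default=None)
        if target is None or free[target] < mem_gb:
            return False
        free[target] -= mem_gb
    return True


cluster = {
    "host-a": {"capacity_gb": 64, "vms": {"erp": 16, "mail": 16, "wiki": 8}},
    "host-b": {"capacity_gb": 64, "vms": {"db": 32, "test": 8}},
    "host-c": {"capacity_gb": 64, "vms": {"web": 24, "ci": 16}},
}
for name in cluster:
    print(name, "can fail over:", can_absorb_failure(cluster, name))
```

In this made-up cluster, every host has the same 24 GB free, yet the 32 GB VM on host-b cannot be restarted anywhere. Total spare capacity is not enough on its own; the unused portion on each server has to be large enough for the biggest VM that might need to move.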
Over the long term, IT professionals want to prevent servers from failing in the first place. This normally involves selecting and upgrading to server hardware designed and built with more reliable components and features. For example, a business that chooses to increase its server consolidation level often acquires more powerful servers that have additional computing resources (e.g., CPU cores, memory, NIC ports and so on) along with high-availability features.
Related Q&A from Stephen J. Bigelow