This article can also be found in the Premium Editorial Download "Virtual Data Center: High availability and virtualization."
Data centers have long relied on high-availability
Although virtualization supports enormous consolidation in data centers, it also increases risk: when many workloads share one physical host, a single server failure can disrupt all of them. Companies that deploy server virtualization need to reassess their high-availability (HA) approach, drawing on traditional network architecture while exploiting the flexibility and features made possible by virtualization platforms.
The basics of high-availability solutions
To understand the changing opportunities of HA in a virtualized server environment, it's important to appreciate the characteristics and tradeoffs of traditional high-availability solutions in a nonvirtualized setting.
In their most basic form, high-availability solutions provide redundancy while eliminating single points of failure. An HA installation, for example, may include two or more identical servers connected to two independent Ethernet network switches. These servers, in turn, may be connected to two independent Fibre Channel storage area network (SAN) switches that are connected to two redundant storage devices.
Each set of devices is ideally powered by different electrical distribution circuits supported with independent uninterruptible power supply systems. Redundant servers are also connected to one another through redundant health monitoring -- or "heartbeat" -- signals.
The servers themselves typically incorporate their own resilient design characteristics, including multiple processor cores, extensive memory, redundant power supplies and redundant network/storage connectivity.
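To see why duplicating every stage matters, the layout described above can be modeled with standard reliability arithmetic. The following is a minimal sketch, not a vendor formula: it assumes component failures are independent and that failover is instantaneous, and the per-device availability figures are hypothetical values chosen only for illustration.

```python
from math import prod


def series(availabilities):
    """Availability of a chain in which every stage must be up."""
    return prod(availabilities)


def parallel(availabilities):
    """Availability of a redundant group that is up if any member is up."""
    return 1.0 - prod(1.0 - a for a in availabilities)


# Hypothetical per-device availabilities, chosen only for illustration.
SERVER, LAN_SWITCH, SAN_SWITCH, STORAGE = 0.99, 0.995, 0.995, 0.999
STAGES = [SERVER, LAN_SWITCH, SAN_SWITCH, STORAGE]

# One device per stage: any single failure takes the service down.
single_path = series(STAGES)

# Two identical devices per stage, as in the redundant layout described above.
redundant_path = series(parallel([a, a]) for a in STAGES)

print(f"single path:    {single_path:.5f}")
print(f"redundant path: {redundant_path:.5f}")
```

Under these assumptions, the unduplicated chain is less available than its weakest device, while the duplicated chain is more available than any single device in it. Real installations fall short of the idealized figure because failures are rarely fully independent and failover takes time.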
How high-availability solutions work
Each server hosts a duplicate operating system, applications and HA failover software. The failover software may be a platform-specific product, such as Windows Server 2003 Cluster Server, Solaris Cluster from Sun Microsystems Inc. (now owned by Oracle Corp.) or IBM PowerHA for AIX (HACMP), or a cross-platform offering, such as Symantec Corp.'s Veritas Cluster Server or IBM Tivoli System Automation.
When disruptions in a heartbeat indicate problems within a server, the failover software switches automatically to an alternate server, where SAN and LAN access continues with little, if any, interruption. Redundant LAN switching, redundant SAN switching and duplicated SAN storage ensure that traffic can find an alternate path around any disruption outside the servers.
Once the trouble is isolated and repaired, the system "fails back" to its original configuration. This arrangement, in which the second server stands by as a spare, is often called an "active/passive" configuration. In "active/active" setups, the second server operates in tandem with the first -- rather than as a spare -- to provide greater processing power for a workload. If one server fails, its companion continues operation at a diminished level. With countless variations on this basic approach, the cluster itself can include three, four or more servers.
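The heartbeat-driven failover and failback cycle of an active/passive pair can be sketched in a few lines. This is an illustrative model only, not how any of the products named above is implemented: the class name, the node names and the three-missed-heartbeats threshold are all hypothetical.

```python
class ActivePassivePair:
    """Toy model of an active/passive cluster driven by heartbeat signals."""

    def __init__(self, primary, standby, missed_limit=3):
        self.primary = primary
        self.standby = standby
        self.missed_limit = missed_limit  # hypothetical threshold
        self.active = primary             # node currently serving the workload
        self.missed = 0                   # consecutive missed primary heartbeats

    def tick(self, primary_heartbeat_received):
        """Process one heartbeat interval and return the active node."""
        if primary_heartbeat_received:
            self.missed = 0
        else:
            self.missed += 1
        # Fail over once the primary has been silent for too long.
        if self.active == self.primary and self.missed >= self.missed_limit:
            self.active = self.standby
        return self.active

    def fail_back(self):
        """Return to the original configuration if the primary is healthy."""
        if self.missed == 0:
            self.active = self.primary
        return self.active


pair = ActivePassivePair("node-a", "node-b")
for received in (True, False, False, False):
    print(received, "->", pair.tick(received))
pair.tick(True)            # primary repaired, heartbeat resumes
print("after fail back:", pair.fail_back())
```

Failback is modeled as an explicit step rather than an automatic one, which mirrors the text's point that the system returns to its original configuration only after the trouble is isolated and repaired.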
The cost of nonvirtualized high-availability solutions
Of course, the traditional, nonvirtualized approach to high availability also carries a high price tag. Redundant servers, LAN networking, SAN networking and storage, and OS/application software licensing dramatically bump up the cost of high-availability solutions within enterprises.
Defining availability requirements is the first order of business, according to Dave Sobel, the CEO of Evolve Technologies in Fairfax, Va. Sobel said that management needs to define uptime -- that is, the number of "nines" -- in relation to business needs and budget. This table shows what the number of nines can mean for an organization's downtime over the course of a week, month and year:
| Rated availability | Annual downtime | Monthly downtime | Weekly downtime |
|---|---|---|---|
| 90.0% | 876 hours (36.5 days) | 73 hours (3 days) | 16.8 hours (0.7 day) |
| 92.0% | 700.8 hours (29.2 days) | 58.4 hours (2.4 days) | 13.5 hours (0.6 day) |
| 95.0% | 438 hours (18.3 days) | 36.5 hours (1.5 days) | 8.4 hours (0.35 day) |
| 98.0% | 175.2 hours (7.3 days) | 14.6 hours (0.61 day) | 3.4 hours (0.14 day) |
| 99.0% | 87.6 hours (3.7 days) | 7.3 hours (0.3 day) | 1.7 hours (0.071 day) |
| 99.5% | 43.8 hours (1.83 days) | 3.7 hours (0.15 day) | 0.84 hour (50.5 mins) |
| 99.8% | 17.5 hours (0.73 days) | 1.46 hours (87.6 mins) | 0.34 hour (20.4 mins) |
| 99.9% (three 9s) | 8.8 hours (0.37 days) | 0.73 hours (43.8 mins) | 0.17 hour (10.2 mins) |
| 99.95% | 4.4 hours (0.18 days) | 0.37 hours (22.0 mins) | 0.085 hour (5.1 mins) |
| 99.99% (four 9s) | 0.88 hours (52.8 mins) | 0.073 hours (4.4 mins) | 0.017 hour (1.0 min) |
| 99.999% (five 9s) | 0.088 hours (5.3 mins) | 0.0073 hours (26.4 secs) | 0.0017 hour (6.1 secs) |
| 99.9999% (six 9s) | 0.0088 hours (31.6 secs) | negligible (2.63 secs) | negligible (0.61 sec) |
These calculations are approximations based on 8,760 hours in a year. The downtime figures represent "unplanned" downtime only; the regular downtime that all systems plan for is not figured into the table.
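The arithmetic behind such figures is simple enough to reproduce for any availability target. The sketch below uses the same 8,760-hour year; treating a month as one-twelfth and a week as one fifty-second of that year is an assumption made here, and the function names are illustrative.

```python
HOURS_PER_YEAR = 8760  # same basis as the note above

# Period lengths in hours; month and week are simple fractions of the year.
PERIOD_HOURS = {
    "year": HOURS_PER_YEAR,
    "month": HOURS_PER_YEAR / 12,
    "week": HOURS_PER_YEAR / 52,
}


def downtime_hours(availability_pct, period="year"):
    """Unplanned downtime, in hours, allowed by a rated availability."""
    return (1.0 - availability_pct / 100.0) * PERIOD_HOURS[period]


def humanize(hours):
    """Express a duration in the largest unit that keeps the value >= 1."""
    if hours >= 24:
        return f"{hours / 24:.1f} days"
    if hours >= 1:
        return f"{hours:.1f} hours"
    if hours * 60 >= 1:
        return f"{hours * 60:.1f} mins"
    return f"{hours * 3600:.1f} secs"


for pct in (99.0, 99.9, 99.99, 99.999):
    row = ", ".join(
        f"{period}: {humanize(downtime_hours(pct, period))}"
        for period in PERIOD_HOURS
    )
    print(f"{pct}% -> {row}")
```

For example, three nines works out to 8.76 hours per year, which the table rounds to 8.8 hours. Small differences from the published figures come from rounding at different steps.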
Business requirements must drive the technology, but the expense of high-availability solutions limits the number of applications that can be protected affordably. As a consequence, only a few critical applications receive HA protection, while other applications are relegated to periodic snapshots or backups.
About the Author
Stephen J. Bigelow, a senior technology writer at TechTarget, has more than 15 years of technical writing experience in the technology industry. He has written hundreds of articles and more than 15 feature books on computer troubleshooting, including Bigelow's PC Hardware Desk Reference and Bigelow's PC Hardware Annoyances. Contact him at sbigelow@techtarget.com.
This was first published in March 2010