Running databases on VMs: The time is now

Contrary to popular wisdom, virtual machines can handle the workload of production databases.

The notion that production databases are too demanding to thrive in a virtual machine (VM) just doesn’t hold up. In fact, shops that have learned how to harness the power of today’s multisocket, multicore commodity servers now use virtual environments on a daily basis. They’re reaping the savings of consolidation as well as the usual benefits of improved disaster recovery and performance management.

So it’s time to re-evaluate the possibility of virtualizing your databases. Not long ago, I predicted that enterprise operations centers with no virtualization plans would find themselves largely virtualized in three to four years anyway.

I was wrong. The pressure to virtualize has come on much faster than I had anticipated. In many Oracle shops, managers have focused on CPU underutilization. In a CPU-based licensing model such as Oracle Corp.’s, the database license can cost as much as or more than the rest of the system stack. Also, virtualization’s rapid cloning easily accommodates and can clearly improve the management of the development lifecycle, load balancing and disaster recovery (DR).

As a member of an Oracle team with hardware and OS specialties, I’ve had the opportunity to see virtualization technology in action with database implementations. While the Oracle story differs in its specifics from the situation for other databases, the general point is likely to be the same: Advances in virtualization software and server hardware, including embedded virtualization, have forced a reexamination of the conventional wisdom about virtualization targets. We have been able to successfully integrate Oracle on VMware Inc.’s technology for significant production database work.

To gain experience and confidence, we recommend starting with lower-risk databases. Consider this general sequence for choosing virtualization candidates:

  1. No-brainer virtualization candidates
    • Development databases
    • Production LDAP (Lightweight Directory Access Protocol) or other metadata databases
    • Small, low-I/O production databases
  2. Gray-area virtualization candidates
    • Large OLTP (online transaction processing) databases
  3. Down-the-road virtualization candidates
    • Single, very large databases on a dedicated node
    • Clustered databases

The only implementations that we should rule out immediately are database instances that require more virtual cores than your virtualization tools currently support.
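The sequence above can be sketched as a small triage function. This is only an illustration under stated assumptions: the `Database` fields, the `kind` labels and the `MAX_VCPUS_PER_VM` value are hypothetical placeholders, so substitute the virtual CPU limit your own hypervisor release documents.

```python
from dataclasses import dataclass

# Hypothetical placeholder: replace with the maximum number of virtual
# CPUs per VM that your virtualization tools currently support.
MAX_VCPUS_PER_VM = 4


@dataclass
class Database:
    name: str
    kind: str              # assumed labels: "development", "metadata", "oltp", "dedicated", "clustered"
    required_cores: int    # cores the instance needs at peak
    low_io: bool = False   # True for small, low-I/O production databases


def classify(db: Database, max_vcpus: int = MAX_VCPUS_PER_VM) -> str:
    """Place a database in one of the article's candidate tiers."""
    # Rule out anything that needs more virtual cores than a VM can have.
    if db.required_cores > max_vcpus:
        return "rule out for now"
    # Tier 1: development, metadata and small low-I/O databases.
    if db.kind in ("development", "metadata") or db.low_io:
        return "no-brainer"
    # Tier 2: large OLTP databases.
    if db.kind == "oltp":
        return "gray area"
    # Tier 3: very large dedicated-node and clustered databases.
    return "down the road"


if __name__ == "__main__":
    inventory = [
        Database("dev1", "development", 2),
        Database("erp", "oltp", 4),
        Database("warehouse", "dedicated", 16),
    ]
    for db in inventory:
        print(db.name, "->", classify(db))
```

Running a real inventory through something like this gives you a first-pass ordering, not a verdict. The tiers are a starting sequence, and the core limit moves with each hypervisor release.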

So why take a radical step like re-hosting database servers on virtual machines? Sure, the ROI, product lifecycle and business continuity advantages of virtualization have been well documented, but with databases, won’t performance ultimately kill you? We’ve certainly had our fill of hearing people say, “We tried virtualizing our database and the performance wasn’t good, so we backed out.”

By definition, a virtualized database is a more complex system stack than a database running native. But the approach is now out of the realm of a science experiment; even Oracle has finally endorsed virtualization. We have a name-plate food services industry customer with thousands of outlets, and for the past two years it has virtualized Oracle’s ERP E-Business Suite in production. This organization runs what I consider to be an unusually lean enterprise resource planning IT crew. I don’t think the company ever would have pulled it off had it not virtualized its stack.

The installation worked fine out of the box with no virtualization-specific tuning needed. Your mileage may vary, but for those of you who might have experienced initial performance problems—and that’s only a subset of the population—it’s high time to get specific about where the problems are in the system stack and to get busy tuning.

Optimal platforms for database virtualization

With an emerging technology like virtualization, at each layer it’s important to make choices that work well with one another. While the beauty of virtualization is that it lets you get so much out of commodity hardware, both the hardware and the software are new enough that they’re not all the same. In our case, we favor running Oracle on Linux and VMware, although the processor picture is more complex.

In the future, our portfolio of virtualization brands will expand fairly quickly, but as of yet we haven’t seen VMware’s competitors put the same combination of mature, solid, stable virtualization tools on the market. It appears that competition will come from Microsoft more than from any of the commercial Xen variants. Microsoft has suffered from an on-again, off-again virtualization strategy, but it seems to have established a direction with some staying power. So within a few years, you will probably have a choice of platforms to fit your environment.

Research firm Gartner Inc. indicates that 91% of all servers purchased in 2004 were x86 and x64. In support of that statistic, over the past three years, easily 90% of our team’s inbound implementation inquiries were Linux related, including from customers with very large RISC (reduced instruction set computing) installations. When it comes to Oracle, you may want to seriously consider choosing Linux over Windows if your corporate culture will allow it. Even in a native environment, Linux is generally leaner and somewhat faster than Windows. As most of our team’s work relates to Oracle core technology, it’s significant that 85% to 90% of our customers’ virtual machines run Linux guest operating systems. Such high rates for Linux adoption confirm the fact that Oracle is solidly behind Linux as its primary port. But for those who can’t run Linux for one reason or another, Oracle databases can run on Windows guests, and we have customers doing just that.

As for which Linux, SUSE has demonstrated leadership in co-engineering with VMware to further reduce the virtualization performance wedge. That’s why, all other considerations being equal, we encourage giving the SUSE distribution priority for now. VMware has recently published a white paper that describes a significant, reader-repeatable benchmark of paravirtualized operations in conjunction with a SUSE 32-bit experimental release. No doubt, similar news for 64-bit guests is on the way.

Because VMware now focuses the majority of its performance-engineering efforts on 64-bit guest operating systems, we recommend going that route. Obviously, if your application vendor supports only a 32-bit guest OS, you’re stuck with 32-bit.

The rules for generic resource allocation in virtual machines seem to be the inverse of native hardware sizing. With native you generally assume that more memory and CPU are better. But that’s not so with virtual machines. Allocating unnecessary cores to a virtual machine, for example, can induce some amount of unnecessary performance overhead. The same is true for allocating unnecessary memory. We encourage you not to get excited, for example, about VMware ESX Server 3.5’s capacity of up to 64 GB of RAM per VM unless you have a specific technical reason to scale into those higher-memory resource allocations. You can compare the performance of storage area network (SAN) caching and/or server-side file system caching as an alternative to increasing database-managed cache.

Keeping process counts low aids virtual machine performance. To that end, you may want to evaluate your database’s connection-aggregation capabilities, which could also save VM memory.

A performance best practice is to install guest operating systems from scratch rather than converting existing native machines. A fresh install eliminates processes once needed only for native environments. When such unnecessary processes wake up in an otherwise quiet virtual machine, they can cause unnecessary hardware context switches between virtual machines.

What’s the best processor for virtual databases?

Disclaimer: We don’t sell hardware or take kickbacks for anyone else’s hardware sales.

We’re waiting for the Barcelona quad-core chip from Advanced Micro Devices Inc. (AMD) for two reasons: hardware-assisted memory virtualization and socket-based software licensing.

Let’s start with hardware memory virtualization. More recent AMD Opteron processors and Intel Corp.’s Woodcrest quad-core both included hardware-assisted instruction set virtualization. That was a step in the right direction, but it will likely pale in comparison to the impact of the Barcelona chip’s second-generation hardware-assisted memory virtualization, which AMD calls nested page tables (NPT). ESX Server 3.5 autodetects the combination of Barcelona and a 64-bit guest OS and invokes NPT unless configured otherwise. Intel trails in its release of hardware-assisted memory virtualization, which it calls extended page tables.

Hardware-assisted memory virtualization should particularly benefit virtualized Oracle database performance, as the Oracle database kernel itself only reads from and writes to RAM. Whether or not you use hardware-assisted memory virtualization, your performance will probably benefit from configuring your 64-bit guests with large pages (see Oracle Metalink Note 361323.1).

As for the software licensing point, VMware licenses its products on a per-socket, not a per-core, basis. In February 2007, Oracle introduced socket-based licensing, but only in its Oracle Database Standard Edition. Your Oracle Database Standard Edition four-socket license doesn’t cost a penny more for 16 cores in 64-bit quad-core chips than it does for eight cores in 64-bit dual-core chips. Since Oracle bundled Real Application Clusters (RAC) in the 10g Standard Edition license, that can represent an impressive amount of cluster database computing scalability if configured properly.
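To make the socket-based licensing arithmetic concrete, here is a minimal sketch. The `LICENSE_COST` figure is a made-up placeholder rather than a real Oracle or VMware price, and the function names are ours for illustration. Only the four-socket, dual-core versus quad-core comparison comes from the text.

```python
def cores_per_license(sockets: int, cores_per_socket: int) -> int:
    """Total cores covered by a license that is priced per socket."""
    return sockets * cores_per_socket


def cost_per_core(license_cost: float, sockets: int, cores_per_socket: int) -> float:
    """Effective cost of each licensed core under per-socket pricing."""
    return license_cost / cores_per_license(sockets, cores_per_socket)


# Hypothetical placeholder price for a four-socket license (not a real quote).
LICENSE_COST = 100.0

dual_core = cost_per_core(LICENSE_COST, sockets=4, cores_per_socket=2)
quad_core = cost_per_core(LICENSE_COST, sockets=4, cores_per_socket=4)

if __name__ == "__main__":
    print("cores covered, dual-core:", cores_per_license(4, 2))
    print("cores covered, quad-core:", cores_per_license(4, 4))
    print("cost per core, dual-core:", dual_core)
    print("cost per core, quad-core:", quad_core)
```

Whatever the real price is, the ratio holds: the same four-socket license covers twice the cores on quad-core chips, so the effective cost per core is halved.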

The hot setup for virtual database servers

As virtualization brings memory into the spotlight, we recommend virtual hosts with a minimum memory capacity of 128 GB each. We prefer AMD’s Non-Uniform Memory Access (NUMA) model over Intel’s front-side bus for a memory-intensive workload like Oracle. The NUMA advantages don’t appear to be lost on Intel; it has announced plans to adopt NUMA in future chips.

For online transaction processing (OLTP) database performance in particular, you want to minimize the I/O virtualization wedge. VMware ESX Server 3.5 included significant I/O and network performance enhancements. The reduction in the I/O virtualization wedge benefits all database types: update and read-only, OLTP and Decision Support.

ESX Server 3.5 also introduced certification for InfiniBand. For both native and virtual environments, we strongly recommend InfiniBand because of its superior latency and offloading of I/O processing from the CPU. For optimal throughput and latency, we recommend InfiniBand cards in PCI Express sockets.

Keep your eyes on ESX Server 3i, VMware’s ultra-lightweight, landed-on-motherboard 32 MB hypervisor. Hewlett-Packard, Dell and IBM are tripping over themselves to get ESX 3i integrated into their servers.

Can you virtualize databases and consolidate?

Usually, the first impulse to virtualize is to eliminate underutilized servers. Should you mix production database workloads with nonproduction workloads on the same physical hosts? In a VMware environment, absolutely, as long as you don’t overcommit resources such that the production workload cannot get its resources. A few years ago, we frowned on grafting nonproduction workloads onto production boxes. But now, as long as you manage the resources, virtualization removes those restrictions.
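The overcommit rule can be expressed as a simple capacity check. The sketch below is a hypothetical planning model, not a VMware API. The `VM` fields, the helper names and the sample figures are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VM:
    name: str
    production: bool
    cpu_mhz: int     # CPU the VM needs at peak (illustrative figure)
    memory_mb: int   # memory the VM needs at peak (illustrative figure)


def production_fits(vms: List[VM], host_cpu_mhz: int, host_memory_mb: int) -> bool:
    """True if the production VMs alone fit within the host's physical capacity."""
    prod = [vm for vm in vms if vm.production]
    cpu = sum(vm.cpu_mhz for vm in prod)
    mem = sum(vm.memory_mb for vm in prod)
    return cpu <= host_cpu_mhz and mem <= host_memory_mb


def memory_overcommit_ratio(vms: List[VM], host_memory_mb: int) -> float:
    """Memory configured across all VMs divided by physical host memory."""
    return sum(vm.memory_mb for vm in vms) / host_memory_mb


if __name__ == "__main__":
    vms = [
        VM("erp-prod", True, 6000, 16384),
        VM("dev1", False, 2000, 8192),
        VM("dev2", False, 2000, 8192),
    ]
    print("production fits:", production_fits(vms, 8000, 32768))
    print("memory overcommit ratio:", memory_overcommit_ratio(vms, 16384))
```

A ratio above 1.0 is not necessarily a problem for the nonproduction guests. The point of the check is that production demand by itself must never exceed what the host physically has.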

It’s common for shops to try to populate a host with virtual machines representing heterogeneous resource demands. Typically these VMs have varying CPU, memory, network and I/O demands.

Oracle still insists that you license the entire physical server even if only some of its virtual machines host Oracle databases. So it may be cost effective to consolidate all licensed Oracle workloads onto hosts largely dedicated to Oracle. Manual placement of VMs with dissimilar resource utilization patterns on a given physical host can be beneficial. But you may find that the automatic hot-load balancing in VMware’s Distributed Resource Scheduler reduces the need for manual resource-based VM placement in your environment.

Layering clustered databases on virtualization

There are advantages to layering a database vendor’s clustering technology on top of virtualization:

  • Redundant high availability, or HA
  • Hot-load balancing application services without a disconnect and reconnect
  • Rapid horizontal scaling of additional database instances

In Oracle RAC, a single physical standby instance may not keep up with all primary site database instances. Virtualization and SAN-level asynchronous block-level transmission can provide a cost-effective alternative.

One of our customers is in the go-live stages with Oracle 10g Real Application Clusters on top of VMware ESX Server 3.5. It has four ESX Server hosts, three of which have a 1:1:1 ratio of hosts to virtual machines to RAC instances. The fourth host is a hot spare with no VM, waiting to catch the failover football if one of the other three fails.

Support: The ultimate barrier to database virtualization

Over the past three years, our team has encountered enterprises in the process of virtualizing just about everything other than their Oracle databases. Often the greatest challenges to database virtualization aren’t technical issues, but rather political ones. In native environments, CPU utilization continues to plummet. Many of our enterprise customers would probably be glad to get their servers’ average CPU utilization percentage into even the teens. If you pay a database license fee for all CPUs on your host, you may want to consider the merits of improving your license utilization by a factor of, say, three.
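A rough way to see where a factor of three might come from is the back-of-the-envelope sketch below. It assumes equally sized servers and ignores virtualization overhead and peak-load headroom, so treat the result as an upper bound. The 12-server, 10% and 30% inputs are illustrative assumptions, not customer data.

```python
def hosts_needed(server_count: int, avg_util_pct: int, target_util_pct: int) -> int:
    """Equally sized hosts needed to carry the same total work at a target utilization.

    Utilization figures are whole percentages so the arithmetic stays exact.
    """
    total_work = server_count * avg_util_pct
    return -(-total_work // target_util_pct)  # ceiling division on integers


def license_utilization_factor(server_count: int, avg_util_pct: int, target_util_pct: int) -> float:
    """How many times fewer licensed hosts you need after consolidation."""
    return server_count / hosts_needed(server_count, avg_util_pct, target_util_pct)


if __name__ == "__main__":
    # Illustrative inputs: 12 native servers averaging 10% CPU,
    # consolidated onto hosts run at an average of 30% CPU.
    print("hosts needed:", hosts_needed(12, 10, 30))
    print("improvement factor:", license_utilization_factor(12, 10, 30))
```

With these inputs, 12 licensed servers collapse to four licensed hosts, which is the factor of three. Real sizing needs measured peaks rather than averages.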

As for Oracle’s conditional support policy for VMware, we have never seen or even heard of a shop that had an Oracle bug or instability that turned out to be VMware related. We have heard of a shop that Oracle asked to prove that the problem also occurred on native hardware before proceeding with the support request (which is a highly unusual support stipulation), but then, of course, the problem had nothing to do with VMware.

Database virtualization remains a significant, largely untapped opportunity for data center optimization and business continuity. The virtualization tools and ROI are there today. It’s high time to get the planning ball moving on database virtualization.

About the Author

Dave Welch is a founding partner of House of Brick Technologies and an expert in Oracle performance, troubleshooting, hardware sizing and data integration. For 14 years, Welch worked with Oracle and is a 10g Oracle Certified Professional. He’s been a key facilitator in the ongoing right-sizing of millions of dollars of hardware and software even prior to adopting virtualization. Two years ago, Welch pioneered the use of virtualization for partner delivery of Oracle University Real Application Clusters classes.
