With the rise of server virtualization, host-based virtualization backup and recovery techniques overtook traditional agent-based backup tools. But new agent-based backup tools that work at the block level are challenging that conventional wisdom. Using smart agents inside the VM and volume subsystem filter drivers, these new technologies have begun to shift attention away from host-level backup tools and back toward agents.
Understanding the recent rise of block-level and agent-based backups requires a little background information.
In the beginning, there was agent-based backup
Before server virtualization, the Windows OS ran directly atop physical hardware, and the architecture of backups was simple, even if the everyday success of those backups wasn’t always guaranteed. Once a day, the backup agents scanned every Windows file system and gathered copies of any files whose time stamps had changed. This backup method was designed for the prevailing notion of the time, which was to focus on changes to individual files.
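To make the file-level approach concrete, here is a minimal sketch of that daily scan: walk a directory tree and collect every file whose modification time stamp is newer than the last backup. The function name files_changed_since and the last_backup_time parameter are illustrative assumptions, not part of any particular backup product.

```python
import os

def files_changed_since(root, last_backup_time):
    """Return sorted paths under root modified after last_backup_time.

    last_backup_time is a POSIX timestamp (seconds since the epoch).
    Hypothetical helper for illustration only.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.path.getmtime(path)
            except OSError:
                # File vanished or is unreadable; skip it in this sketch.
                continue
            if mtime > last_backup_time:
                changed.append(path)
    return sorted(changed)
```

Note that a one-byte change to a large file marks the whole file as changed, so the agent copies the entire file again.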
During that time, backup infrastructure designs were also relatively simple. You needed tapes, tape drives, a backup application and backup agents on every computer. After connecting all of those pieces, the rest was daily maintenance.
Along came virtualization and the rise of host-based backups
Then, virtualization became all the rage. Almost overnight, a sizeable portion of those OS instances shifted from physical computers to virtual machines. And this infrastructure brought all sorts of benefits to IT, such as increased availability and efficiency.
Yet, those benefits came with a cost: Suddenly, designing a backup infrastructure became far more complicated. By virtualizing Windows, IT added new layers to the data center stack. Each layer created a potential location for a backup agent (i.e., inside the VM, on the virtual host, at the storage layer, outside the virtual environment).
Placing a backup client at each of those layers has its own set of benefits and drawbacks. For example, at the host layer, virtualization backup tools can easily capture entire virtual machines, but at the cost of requiring extra steps to restore individual files and folders. Backup clients at the storage layer enjoy improved performance, but at the risk of inconsistent backups when the interconnections between the layers aren’t fully synced.
Enterprising software vendors quickly offered virtualization backup options at each new layer to capitalize on the benefits. And this desire to make backups easier was absolutely warranted. In the physical days, backing up a Windows OS required perfectly capturing all of the innumerable OS files. If you missed just one, you could be unable to resurrect your computer if it died.
In today’s marketplace for virtualization data protection, the average IT pro is faced with a litany of options. There are so many vendor products and approaches that it can be difficult to ascertain how they differ. For the virtualization layman, the key difference among these products is how the VM data is captured onto a backup medium.
Did we go wrong with host-based virtualization backup?
One of the popular backup methods involves recording all of a VM’s disk contents into a single file, which is known as a host-based backup, image-based backup or the single-file approach. This method can ensure that backups are captured correctly. To restore, you just copy the backup file to the proper location. Then, power on the VM, and the restore is complete.
At first blush, the host-based virtualization backup approach appears superior to other methods. But new technologies that use volume-level filter drivers -- as opposed to tools that operate at the file-system level -- are beginning to show their value as potential challengers to host-based backups.
Vendors of these tools still espouse the old agent-based, inside-the-OS approach. What’s different is where the data is captured. Rather than focusing on changed files and folders, the new approach tracks changes that occur to individual disk blocks. Positioned at the volume level, these virtualization backup products continuously send a tiny stream of changed blocks from each VM to the backup medium. This architecture provides great performance, and it makes it easier for virtual and physical machine backups to use a unified solution.
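The idea of tracking changed blocks can be sketched in a few lines. The toy below is not a real filter driver, which would run inside the OS storage stack. It simulates one in memory: every write is recorded against the blocks it touches, and a backup pass drains only those blocks. The class name TrackedVolume, its methods and the 4 KB block size are all assumptions made for illustration.

```python
BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration

class TrackedVolume:
    """In-memory stand-in for a volume with a change-tracking filter."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.dirty = set()  # indices of blocks written since the last drain

    def write(self, offset, payload):
        """Write payload at offset and mark every touched block as dirty."""
        end = offset + len(payload)
        if offset < 0 or end > len(self.data):
            raise ValueError("write outside volume bounds")
        self.data[offset:end] = payload
        if payload:
            first = offset // BLOCK_SIZE
            last = (end - 1) // BLOCK_SIZE
            self.dirty.update(range(first, last + 1))

    def drain_changed_blocks(self):
        """Return {block_index: block_bytes} for dirty blocks, then reset."""
        changed = {}
        for index in sorted(self.dirty):
            start = index * BLOCK_SIZE
            changed[index] = bytes(self.data[start:start + BLOCK_SIZE])
        self.dirty.clear()
        return changed
```

In this sketch, a four-byte write that straddles a block boundary yields two blocks to send, regardless of how large the file containing those bytes is. That is the contrast with the time-stamp scan, which would copy the whole file.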
These alternatives to host-based backups simplify restores, as well. Agent-based backups, for example, don’t have to restore an entire VM to access the data inside. Instead, they just restore the data using an automated process.
Other improvements to agent-based backup include application awareness and the ability to view backed-up data, which is useful for testing and verification purposes. Ultimately, the new agent-based backup tools can restore files, folders and application objects, all without having to work through the VM disk file.
The current state of virtualization backup products
In today’s virtualization backup marketplace, it feels like history is repeating itself. Early physical backup tools focused on recording files and folders at the expense of recovering the entire machine. In many ways, virtualization and the second generation of data-recovery products shifted this focus -- enabling far easier restores of entire machines at the cost of requiring extra effort to recover data inside the backup file.
While host-based virtualization backup tools are still popular, time will tell if agent-based and block-level products are the best method for data protection and recovery in the data center.