What are the data deduplication options in server virtualization?
Data deduplication for virtualized environments became a highly anticipated capability soon after VMs were adopted in the data center. There are numerous options in this space: all of the major storage manufacturers (EMC, Network Appliance, etc.) offer robust products, and some niche players (such as Data Domain) should be considered as well.
These deduplication engines are very capable, but consider how much deduplication is actually necessary and what consequences it will have for backup processes. Deduplication technologies also need to address situations where many servers attempt to access one file. For example, 1,000 virtual machines (VMs) all reading a single deduplicated copy of their operating system and application files will certainly impact performance. Is it feasible to go down to one copy, or should a few dozen copies be maintained so that the VMs can spread their access and avoid a performance bottleneck? This is a major consideration for anyone seeking to deploy data deduplication in a virtualized environment.
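The trade-off between one shared copy and a few dozen copies can be illustrated with a small sketch. It is a toy model, not how any vendor's product works: the class `CappedDedupStore`, the parameter `max_refs_per_copy`, and the helper `copies_needed` are hypothetical names introduced here, and the assumption is that identical data is detected by a SHA-256 hash and that each physical copy serves at most a fixed number of VMs.

```python
import hashlib
from collections import defaultdict


class CappedDedupStore:
    """Toy content-addressed store that caps how many VMs share one physical copy.

    Hypothetical illustration only; real deduplication engines work on blocks
    or segments and use their own placement logic.
    """

    def __init__(self, max_refs_per_copy):
        if max_refs_per_copy < 1:
            raise ValueError("max_refs_per_copy must be at least 1")
        self.max_refs_per_copy = max_refs_per_copy
        # digest -> list of reference counts, one entry per physical copy
        self.copies = defaultdict(list)

    def store(self, data):
        """Register one VM's file and return (digest, index of the copy used)."""
        digest = hashlib.sha256(data).hexdigest()
        refs = self.copies[digest]
        # Reuse the first existing copy that still has room for another VM.
        for index, count in enumerate(refs):
            if count < self.max_refs_per_copy:
                refs[index] += 1
                return digest, index
        # Every copy is at its cap (or none exists yet): keep a new copy.
        refs.append(1)
        return digest, len(refs) - 1

    def physical_copies(self):
        """Total number of physical copies kept across all distinct contents."""
        return sum(len(refs) for refs in self.copies.values())


def copies_needed(vm_count, max_refs_per_copy):
    """Store the same image once per VM and report how many copies remain."""
    store = CappedDedupStore(max_refs_per_copy)
    image = b"identical operating system image"
    for _ in range(vm_count):
        store.store(image)
    return store.physical_copies()


if __name__ == "__main__":
    for cap in (1000, 40, 1):
        print(cap, copies_needed(1000, cap))
```

In this model, 1,000 VMs with identical images collapse to a single copy when the cap is 1,000 and to 25 copies when the cap is 40. The cap is the knob that trades storage savings against contention on any one copy, and the right value depends on the storage platform and workload.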
This was first published in April 2008