ltstudiooo - Fotolia

Why would you need huge virtual disk files?

As more enterprises look to virtualize top-tier applications, many are seeing a growing need for huge virtual disk files.

Logical disk sizes have long been limited to 2 terabytes (TB) through the common use of logical block addressing where a 32-bit address accesses 512 byte sectors. This translates into 4,294,967,296 possible sectors at 512 bytes per sector; or (2^32 * 512) 2,199,023,255,552 bytes -- or simply 2.19 TB -- of addressable space on the logical disk. Although 2 TB of disk space is still plenty for many workloads, top-tier enterprise applications are gradually demanding more computing resources, and some virtual disk file sizes can eventually run up against the 2 TB limit.

IT old-timers may remember that logical block addressing (LBA) first appeared in the early 1990s as a means of overcoming the previous 504 MB disk size limit defined by antiquated cylinder/head/sector, or CHS, addressing.

Recent advances in hypervisors and the shift to 64-bit operating systems can overcome this LBA limit by using the GUID partition table (or GPT) partitioning scheme, which employs up to 64-bit addressing. This allows an astonishing theoretical limit of (2^64 * 512), 9.4 zettabytes (ZB) or 9.4 billion TB on a logical disk. In actual practice, this limit will be defined by the physical disk capacity itself -- there are no multi-zettabyte disks, yet -- but hypervisors like VMware ESXi 5.5 now allow up to 64-TB VMDK file sizes.

Of course, this is not the first foray into huge disk volumes. For example, in-guest volume manager software could connect multiple 2-TB virtual disks; physical-mode raw data mapping, or RDM, can support 64 TB volumes and up to 3 PB per VM; in-guest iSCSI can handle up to 16-TB devices; in-guest network file system, or NFS, can support large volumes within a storage array; and VMDirectPath I/O can assign host bus adapter or network interface card to a VM, allowing for extremely large volume sizes. However, each of these alternatives typically sacrifice some combination of virtualization capabilities, such as VM migration, snapshot support, API support, clustering or other functionality. So, while large VMDK files are certainly not a new idea, the alternatives have not been broadly deployed because the organization loses more than it gains. It is really the move to support huge VMDK files natively in ESXi 5.5 that promises to maintain broad functionality for the biggest workloads.

Beyond the sheer size needed to support the largest VMs (such as big data analytics engines), native hypervisor support offers other potential benefits. For example, eliminating third-party software, such as in-guest volume managers, simplifies management and support for the server -- there are fewer elements to pose potential interoperability and update problems. In addition, fewer (but larger) volumes are easier to manage and maintain because there are fewer elements to overlook or misconfigure. This often leads to a better use of storage space.

Dig Deeper on Virtual machine provisioning and configuration