As IT starts building hybrid clouds, many administrators have found that the system topology is more complex, and its tradeoffs different, than in traditional server structures. Many users report that they need more I/O performance than they initially planned for. Now that the industry is at a crossroads, replacing hard drives with much faster solid-state drives, let's look at the impact solid-state drives and related technologies have on the virtualized server cluster.
Debunking common SSD myths
Everyone agrees that even the slowest solid-state drives (SSDs) are much faster than hard disk drives (HDDs), but despite being on the market for eight years, the local server SSD has yet to achieve ubiquity. This is, in large part, because of two beliefs: that SSDs cost more than HDDs, and that their finite write life makes them wear out too soon.
These two beliefs are little more than urban legend. Comparing multilevel cell (MLC) SSDs against comparable enterprise HDDs sold through distribution, we find that SSDs are cheaper than HDDs of the same capacity. The only exception is if you're trapped into "locked" drives from some of the traditional vendors, in which case you'll pay through the nose compared with hard drives.
Google has finally discredited the write wear-out myth. SSD vendors have made great improvements in write durability, and a recent Google report suggests we should forget the issue entirely. Most drives today outlast their likely wear-out date by a good margin.
That leaves the complex question of where to deploy SSD and flash in the modern virtual data center or hybrid cloud. There are a number of separate use cases here. First, the boot process can be streamlined by adding a small local server SSD -- say 32 to 128 GB -- in a plug-in disk-on-module that connects to a Serial Advanced Technology Attachment (SATA) port. This looks just like a drive, but is much cheaper than the smallest hard drive and, more importantly, speeds up hypervisor or container host code booting dramatically. It can also serve as a local image cache, though that makes updating images a bit more complicated.
Avoiding local server SSD bottlenecks
Traditional views of the virtual cluster or cloud lean strongly toward servers being stateless from the perspective of tenant data. Everything is stored on network storage so that it is accessible to a new server instance if a failure occurs.
This can create two classes of bottleneck. First, loading instances requires a lot of network traffic. Workloads that create a lot of instances simultaneously can suffer from "boot storms," where the network, the network storage or both become saturated. One notorious example of this is the virtual desktop, where everyone starts work at the same time.
One solution for this is to use very fast, high-bandwidth all-flash array storage that can deliver millions of IOPS. Variations on the approach come with Fibre Channel SAN, iSCSI or network-attached storage connections. One useful feature in most of these appliances is data deduplication, which saves a great deal of space; for example, a single copy of Windows code files can deliver a Windows desktop to every user.
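To make the deduplication point concrete, here is a minimal sketch of a content-addressed block store in Python. It is a toy model, not how any particular all-flash array works: the `DedupStore` class, the 4 KB block size and the SHA-256 fingerprint are all assumptions made for illustration.

```python
import hashlib


class DedupStore:
    """Toy block store (hypothetical): identical blocks are kept only once."""

    def __init__(self, block_size=4096):
        self.block_size = block_size  # assumed block size, not from the article
        self.blocks = {}  # fingerprint -> block bytes, stored once
        self.files = {}   # file name -> ordered list of fingerprints

    def write(self, name, data):
        """Split data into blocks and store only blocks not seen before."""
        fingerprints = []
        for offset in range(0, len(data), self.block_size):
            chunk = data[offset:offset + self.block_size]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            self.blocks.setdefault(fingerprint, chunk)
            fingerprints.append(fingerprint)
        self.files[name] = fingerprints

    def read(self, name):
        """Reassemble a file from its list of fingerprints."""
        return b"".join(self.blocks[f] for f in self.files[name])

    def logical_bytes(self):
        """Bytes the tenants believe they have stored."""
        return sum(len(self.blocks[f]) for fps in self.files.values() for f in fps)

    def physical_bytes(self):
        """Bytes actually held after deduplication."""
        return sum(len(block) for block in self.blocks.values())


# Hypothetical 16 KB "golden image" made of four distinct 4 KB blocks.
golden_image = b"".join(bytes([i]) * 4096 for i in range(4))

store = DedupStore()
for user in range(100):
    store.write(f"desktop-{user}", golden_image)

print("logical bytes: ", store.logical_bytes())
print("physical bytes:", store.physical_bytes())
```

In this sketch, 100 identical desktop images occupy the physical space of one, which is the effect the article describes for shared Windows code files. Real arrays add reference counting, collision handling and variable-length chunking that are left out here.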
The second class of bottleneck is a bit more difficult to deal with. Network I/O is slower than local storage, especially when compared with super-fast Peripheral Component Interconnect Express (PCIe) SSDs. This becomes an even bigger issue when I/O is shared across many instances on the same server, a problem that will become even more serious as we move to a container model with perhaps two to three times the instance count per host. It is a particular concern for I/O-intensive operations and databases.
Resolving this may require a local instance store, much like AWS and other major cloud service providers offer on selected services. Instance stores are essentially server drives divided up between instances. Tenants treat them as local drives, but a power or server failure leaves the data in limbo, which in many use cases limits them to serving as large forward caches or memory page files. Even so, the performance boost is tremendous if these are local server SSDs.
The 400,000 IOPS of a typical MLC SSD is enough to service hundreds of instances in such a situation, and with the price differential actually in favor of local server SSD compared to performance drives, this should be a no-brainer. There may be some who argue that a $35 1 TB SATA hard drive will do, but that will average out to just 0.5 IOPS per instance, making it a non-starter as a solution.
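The arithmetic behind that comparison can be sketched in a few lines of Python. The 400,000 IOPS figure comes from the text; the hard drive's 150 IOPS and the count of 300 instances are assumptions chosen here to be consistent with the article's "hundreds of instances" and its 0.5 IOPS result.

```python
def iops_per_instance(device_iops, instances):
    """Average IOPS available to each instance if the device is shared evenly."""
    return device_iops / instances


SSD_IOPS = 400_000  # typical MLC SSD, figure taken from the article
HDD_IOPS = 150      # assumption: rough random IOPS of a SATA hard drive
INSTANCES = 300     # assumption: stands in for "hundreds of instances"

ssd_share = iops_per_instance(SSD_IOPS, INSTANCES)
hdd_share = iops_per_instance(HDD_IOPS, INSTANCES)

print(f"SSD: {ssd_share:.1f} IOPS per instance")
print(f"HDD: {hdd_share:.1f} IOPS per instance")
```

Even sharing is a simplification, since real workloads are bursty, but the gap of several thousand times between the two results is what makes the hard drive a non-starter.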
The long-term benefits of SSD
Used properly, with changed data being mirrored out to a network drive, a local server SSD will boost virtual server performance to a comfortable level for most use cases. For beefier applications such as big data analytics, though, multiple PCIe drives are needed to keep up with what is typically a four-CPU server.
A caution when using SSDs: Deleting data on an SSD has a different meaning than deleting on hard drives. SSDs place dirty blocks, without changing their contents, on a list of blocks awaiting erasure. Then, periodically, a cleanup process erases those blocks and frees the space for reuse. Until that happens, if an instance can read outside of its own drive space, the old data can be read back and exposed. Various methods exist for preventing this, but overwriting data is not one of them, since in a local server SSD any overwrite goes to a new block and the old block is only marked for later erasure.
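The overwrite behavior described above can be illustrated with a toy model of a flash translation layer. This is a deliberately simplified sketch: the `ToyFlash` class and its method names are hypothetical, and real drives work with pages, erase blocks and wear leveling that are not modeled here.

```python
class ToyFlash:
    """Toy SSD model (hypothetical): overwrites land on a fresh physical block."""

    def __init__(self, physical_blocks):
        self.physical = [None] * physical_blocks  # raw contents of each block
        self.free = list(range(physical_blocks))  # erased blocks ready for use
        self.mapping = {}                         # logical address -> physical block
        self.stale = []                           # old blocks awaiting erasure

    def write(self, logical_address, data):
        """Write to a new block; the previous block is only marked stale."""
        if not self.free:
            self.garbage_collect()
        if not self.free:
            raise RuntimeError("no free blocks left")
        new_block = self.free.pop(0)
        self.physical[new_block] = data
        old_block = self.mapping.get(logical_address)
        if old_block is not None:
            self.stale.append(old_block)  # contents are left untouched
        self.mapping[logical_address] = new_block

    def read(self, logical_address):
        """Normal read path: returns only the current mapping."""
        return self.physical[self.mapping[logical_address]]

    def garbage_collect(self):
        """Periodic cleanup: erase stale blocks and return them to the free list."""
        for block in self.stale:
            self.physical[block] = None
            self.free.append(block)
        self.stale.clear()

    def raw_contents(self):
        """What a reader with access to the whole device would see."""
        return [data for data in self.physical if data is not None]


flash = ToyFlash(physical_blocks=4)
flash.write(0, b"tenant secret")
flash.write(0, b"\x00" * 13)  # attempt to "wipe" by overwriting

leaked_before = b"tenant secret" in flash.raw_contents()
flash.garbage_collect()
leaked_after = b"tenant secret" in flash.raw_contents()
print(leaked_before, leaked_after)
```

In this model the overwrite changes what the tenant reads back, but the original bytes stay on the device until the cleanup pass runs, which is why overwriting alone is not a reliable way to sanitize a local server SSD.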
With that caution in mind, using SSDs makes a great deal of sense if your virtual servers are I/O-bound. It's a low-cost alternative to buying more servers to do the job.