In the early days of x86 virtualization, IT shops were so thrilled with the technology’s consolidation and mobility benefits that advanced network management wasn’t much of a concern. Today, previously acceptable virtual network architectures are under a high degree of scrutiny.
The reason for the increased expectations is that a growing number of organizations are running production workloads on x86 virtualization platforms. The performance, availability and security of production applications often have an impact on an organization’s bottom line. That has made maintaining appropriate service levels for production applications a top priority for the IT staff.
Maintaining the virtual network infrastructure status quo is no longer acceptable in many cases. For some organizations, the status quo meant interconnecting virtual machines (VMs) using unmanaged virtual network switches. That’s the equivalent of interconnecting servers using off-the-shelf D-Link switches purchased at the local Wal-Mart.
To be fair, many organizations used the VLAN support offered by ESX-, Xen- and Hyper-V-based virtual switches to provide logical Layer 2 isolation. Some were even using advanced ESX switch port group features such as promiscuous mode to allow one-way traffic mirroring within a virtual switch for the purpose of inspecting network traffic in the virtual infrastructure.
Until recently, advanced networking features such as access control lists (ACLs), Switched Port Analyzer (SPAN), Remote SPAN (RSPAN) and 802.1x were not available.
If it ain’t broken, why fix it?
Of course, it’s easy to stick with the status quo, but there are plenty of reasons why existing virtual infrastructure network implementations don’t cut it:
- Virtual infrastructure network management is conducted by server administrators instead of network administrators.
- Limited network inspection and security enforcement capabilities require physical isolation for separation of security zones.
- Decentralized per-host virtual switch management requires rigid change control policies to ensure that changes made to one host’s virtual switch are duplicated on all hosts in a given cluster.
Flat IT budgets and continued pressure for IT infrastructure to grow have placed a greater emphasis on the need to reduce the amount of physical infrastructure required to support IT operations. You can reduce physical infrastructure more easily when VMs that reside in different zones of trust, such as separate internal departmental-level security zones, can safely reside within the same physical clusters.
Securely grouping VMs in different security subzones in the same physical cluster requires improved network intelligence, which means that basic virtual switches devoid of mainstream network management intelligence aren’t enough.
Also, if you’re going to up the complexity of the virtual switch infrastructure, you’ll need IT administrators with networking expertise managing it. This means that server admins will give up the role of administering virtual infrastructure networking components and return those duties to the networking group. Once those components are under its control, the networking group can configure enterprise-grade virtual switches, such as the Cisco Nexus 1000V, to securely segment network traffic within the virtual infrastructure.
Basic network architecture principles
Most organizations deploy multiple physical subnets in combination with virtual local area network (VLAN) Layer 2 isolation to meet the virtual infrastructure’s performance, availability and security requirements.
For example, you can deploy separate physical network segments for connecting VMs to the LAN and Ethernet-based network storage as well as for the management console, cluster heartbeat and live migration data transfer traffic. Live migration (VMotion or XenMotion) requires a VM’s memory state to be copied from one physical cluster node to another. Today’s major hypervisors cannot encrypt live migration data. As a result, many organizations channel live migration traffic through a secure network management subnet.
Another assumed network best practice is the use of network interface card (NIC) teaming. NIC teaming gives resiliency to network failure as well as capabilities to load balance or aggregate network traffic across multiple physical network ports.
Hypervisor support for network teaming varies by vendor. For example, VMware and Citrix offer native support for NIC teaming, which allows administrators to configure NIC teaming in the hypervisor. You can use hypervisor-managed NIC teaming to team network interfaces from different vendors as long as each interface is supported by the hypervisor.
Microsoft, on the other hand, relies on third-party network interface vendors to write teaming drivers, and teaming is managed outside of the hypervisor. This means that interfaces teamed for use with Hyper-V must come from the same vendor product family.
When network administrators hear a product advertised as a Layer 2 switch, basic unicast traffic isolation—when point-to-point network traffic is not broadcast to every port in the switch—is often an assumed capability. However, not all virtual switches isolate unicast network traffic.
As a general rule of thumb, mainstream bare-metal (Type 1) hypervisors isolate unicast traffic and thus prevent one VM from capturing the network traffic of other VMs connected to the same virtual switch. Type 1 hypervisors that offer this level of isolation include ESX Server, XenServer and Hyper-V.
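To make the distinction concrete, the following Python sketch contrasts a switch that floods every frame to all ports with one that delivers a unicast frame only to the destination port. It is a simplified model under stated assumptions: the class names, the MAC addresses and the VM port labels are all hypothetical, and it does not represent any hypervisor's actual switching code.

```python
# Illustrative model only: contrasts flooding with unicast isolation.
# Class names, MAC addresses and port labels are hypothetical.


class FloodingSwitch:
    """Delivers every frame to all ports except the sender (no isolation)."""

    def __init__(self, ports):
        self.ports = list(ports)

    def deliver(self, src_port, dst_mac):
        # The destination MAC is ignored: every other port sees the frame.
        return [p for p in self.ports if p != src_port]


class IsolatingSwitch:
    """Delivers a unicast frame only to the port that owns the destination MAC."""

    def __init__(self, mac_table):
        # mac_table maps MAC address -> port. This model assumes the
        # table is already populated.
        self.mac_table = dict(mac_table)

    def deliver(self, src_port, dst_mac):
        port = self.mac_table.get(dst_mac)
        if port is None or port == src_port:
            return []  # unknown destination: this simple model drops the frame
        return [port]


macs = {
    "00:50:56:00:00:01": "vm-a",
    "00:50:56:00:00:02": "vm-b",
    "00:50:56:00:00:03": "vm-c",
}
hub = FloodingSwitch(macs.values())
switch = IsolatingSwitch(macs)

# vm-a sends a frame to vm-b; vm-c is a potential eavesdropper.
flooded = hub.deliver("vm-a", "00:50:56:00:00:02")
isolated = switch.deliver("vm-a", "00:50:56:00:00:02")
print("flooding switch delivers to:", flooded)
print("isolating switch delivers to:", isolated)
```

In the flooding case vm-c receives a copy of traffic meant for vm-b, which is exactly the exposure that unicast isolation is meant to prevent.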
Many hosted, or Type 2, hypervisors do not offer unicast isolation in their virtual switches. The following companies offer hypervisors today with no unicast isolation capabilities:
- VMware (Server, Workstation and Player)
- Microsoft (Virtual Server 2005 and Virtual PC)
Unicast isolation capabilities are important because they form the foundation for security zone separation.
Security zoning restrictions, such as separation of a demilitarized zone (DMZ) and internal trusted zones, often dictate the configuration of both physical and virtual network devices. Many organizations dedicate separate physical clusters and network resources to hypervisors and VMs within different zones of trust, such as a DMZ.
If you’re considering internal zoning, the use of VLANs and 802.1Q VLAN trunking typically provides the necessary broadcast domain isolation to allow different internal subzones, such as accounting and human resources, to reside within the same physical infrastructure. Of course, the organization’s security policy, as well as the comfort level of the security auditor, will drive the network isolation requirements.
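For readers who want to see what trunking adds to a frame, here is a minimal Python sketch of the 4-byte 802.1Q tag: a 16-bit tag protocol identifier (0x8100) followed by 3 bits of priority, a 1-bit drop-eligible indicator and a 12-bit VLAN ID. The function names and the VLAN numbers assigned to accounting and human resources are assumptions made for illustration.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q-tagged frame


def build_vlan_tag(vlan_id, priority=0, dei=0):
    """Pack the 4-byte 802.1Q tag for the given VLAN ID and priority."""
    if not 1 <= vlan_id <= 4094:  # 0 and 4095 are reserved values
        raise ValueError("VLAN ID must be between 1 and 4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority must be between 0 and 7")
    if dei not in (0, 1):
        raise ValueError("DEI must be 0 or 1")
    tci = (priority << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)  # network byte order


def parse_vlan_id(tag):
    """Return the VLAN ID carried in a 4-byte 802.1Q tag."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return tci & 0x0FFF  # low 12 bits hold the VLAN ID


# Hypothetical VLAN assignments for two internal subzones.
ACCOUNTING_VLAN = 110
HR_VLAN = 120

accounting_tag = build_vlan_tag(ACCOUNTING_VLAN)
hr_tag = build_vlan_tag(HR_VLAN, priority=5)
print(accounting_tag.hex(), parse_vlan_id(accounting_tag))
print(hr_tag.hex(), parse_vlan_id(hr_tag))
```

Because both subzones' frames cross the same trunk and differ only in this tag, the isolation is logical rather than physical, which is why the auditor's comfort level matters.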
Finally, one other key consideration with network architecture involves traffic inspection and policy enforcement. In many organizations, network traffic is not examined within the virtual infrastructure, leaving traffic inspection and policy enforcement to be managed by physical security appliances placed within the physical network infrastructure.
Leaving the virtual network unchecked is an acceptable risk for some organizations, while others deploy tools from vendors such as Trend Micro, Catbird Networks, Reflex Systems and Altor Networks to enforce security within the virtual network infrastructure. Others deploy tools such as Blade Network Technologies’ VMready to move virtual network switching to a physical appliance, thus exposing virtual network traffic to all physical network management options.
Many organizations are feeling increasing pressure to do more with shared physical infrastructure. Deploying several small physical clusters to isolate security zones is a sound practice, but it often results in wasted free overhead and lower consolidation densities. Waste adds up to higher IT operation costs, and with no-growth IT budgets, that is a problem.
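A rough back-of-the-envelope calculation shows where the waste comes from. The Python sketch below assumes each cluster carries one spare host for failover and that every host can run the same number of VMs; the zone names, VM counts and density figure are invented for illustration and are not drawn from any real environment.

```python
import math


def hosts_needed(vm_count, vms_per_host, spare_hosts=1):
    """Hosts required to run vm_count VMs plus failover spares."""
    return math.ceil(vm_count / vms_per_host) + spare_hosts


# Hypothetical workload: VM counts per security zone.
zones = {"dmz": 12, "accounting": 25, "hr": 8}
VMS_PER_HOST = 20  # assumed consolidation density

# One small cluster per zone, each with its own spare host.
separate = sum(hosts_needed(n, VMS_PER_HOST) for n in zones.values())

# One shared cluster hosting all zones, with a single spare host.
shared = hosts_needed(sum(zones.values()), VMS_PER_HOST)

print("separate clusters:", separate, "hosts")
print("shared cluster:   ", shared, "hosts")
```

Under these assumptions, three isolated clusters need seven hosts while one shared cluster needs four; the difference is the spare and partially filled capacity that each small cluster carries on its own.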
New options for virtual infrastructure
Several emerging technologies are beginning to influence how organizations deploy and manage networks within the virtual infrastructure. They are:
- Distributed virtual switch
- Single root I/O virtualization (SR-IOV)
- Multi-root I/O virtualization (MR-IOV)
- Converged Ethernet
- VM network load balancers
Distributed virtual switch: VMware’s vSphere 4.0 introduced a new type of virtual switch—the distributed virtual switch—which is available in the Enterprise Plus product tier. The distributed virtual switch, which VMware calls a vNetwork Distributed Switch (vDS), is a single logical switch that is managed across an entire cluster of ESX hosts. A standard vSwitch, now called a vNetwork Standard Switch, is managed on an ESX-host-by-ESX-host basis.
Standard virtual switches are not without fault. For starters, the standard virtual switch found on ESX, Xen and Hyper-V hypervisors is typically managed through the hypervisor management console by server administrators, leaving network administrators blind to its configuration. Also, dynamic VM movement between physical hosts (via live migration, for example) requires identically named virtual switches on each physical host. Ideally, the configuration of each named virtual switch should be consistent on each physical host, which presents change and configuration management challenges.
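The kind of check that per-host management forces on administrators can be sketched in a few lines of Python. This example compares each host's virtual switch definitions against the first host in the cluster and reports switches that are missing, extra or configured differently. The host names, switch names and the shape of the configuration dictionaries are hypothetical; a real script would first have to pull this data from the hypervisor's management interface.

```python
def find_drift(host_configs):
    """Compare every host's virtual switches against the first host.

    host_configs maps host name -> {switch name -> settings dict}.
    Returns a list of (host, switch, problem) tuples.
    """
    hosts = sorted(host_configs)
    baseline = host_configs[hosts[0]]
    problems = []
    for host in hosts[1:]:
        config = host_configs[host]
        for name in sorted(set(baseline) | set(config)):
            if name not in config:
                problems.append((host, name, "missing"))
            elif name not in baseline:
                problems.append((host, name, "extra"))
            elif config[name] != baseline[name]:
                problems.append((host, name, "settings differ"))
    return problems


# Hypothetical three-host cluster with two kinds of drift.
cluster = {
    "esx01": {"vSwitch-Prod": {"vlans": [110, 120]}, "vSwitch-Mgmt": {"vlans": [10]}},
    "esx02": {"vSwitch-Prod": {"vlans": [110]}, "vSwitch-Mgmt": {"vlans": [10]}},
    "esx03": {"vSwitch-Prod": {"vlans": [110, 120]}},
}

drift = find_drift(cluster)
for host, switch, problem in drift:
    print(f"{host}: {switch}: {problem}")
```

A missing switch name on one host would block live migration to that host, while a settings mismatch could silently change a VM's connectivity after it moves.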
The distributed virtual switch alleviates common virtual switch management issues. The distributed virtual switch is a single logical switch that is configured cluster-wide. VMware was the first vendor to ship a distributed virtual switch, and Citrix will likely be the second. Citrix announced its distributed virtual switch plans in May and committed to delivering the first version by the end of the year.
VMware’s distributed virtual switch architecture allows third-party network vendors to plug in their own managed virtual switches to the VMware infrastructure.
Cisco was the first with its release of the Nexus 1000V. The Nexus 1000V allows administrators to extend advanced network management features—such as RSPAN, ERSPAN, ACLs and 802.1x—to a virtual infrastructure while also providing the framework to manage the 1000V alongside existing physical Cisco network devices. This allows network administrators to take back control of the network across both physical and virtual realms, which is how it should be.
Citrix is leading an open source distributed virtual switch initiative that will offer features similar to the Cisco Nexus 1000V. Extending Layer 2 network management and security features creates better opportunities for consolidating physical infrastructure and mixing security zones within the same virtual infrastructure.
Introspection: Hypervisor introspection interfaces emerged in 2009 with VMware’s VMsafe API as the first. Hypervisor introspection interfaces allow third-party products to inspect virtual infrastructure elements, such as VM network traffic and memory state, and centrally enforce security policy within the virtual infrastructure.
The Xen Introspection Project is working to deliver similar capabilities for Xen-based hypervisors. However, no timetable has been set for a product release.
The emergence of introspection creates more network monitoring and enforcement choices for admins, along with the risk of overlap. For instance, if an administrator configured network port mirroring (RSPAN, for example), ACLs and 802.1x on a Nexus 1000V distributed virtual switch and deployed security appliances that leveraged the VMsafe API to perform similar monitoring, then resources would be wasted by doing the same task twice.
Organizations will have to decide where they are most comfortable enforcing network security. Although they can use ACLs and 802.1x for additional security with the Nexus 1000V, organizations may prefer to allow VMsafe-based security appliances to conduct the network traffic inspection that could also be achieved via RSPAN in the virtual switch.
With introspection technologies, keep in mind how they integrate with other elements of virtual infrastructure management. For example, today’s third-party VMsafe-based appliances, or VMware’s vShield Zones, do not expose zoning restrictions to third-party (or VMware) capacity management tools. This lack of awareness could cause a capacity management tool to overestimate available capacity.
In addition, lack of integration with dynamic load balancing tools such as VMware DRS could cause DRS to place a VM on a particular host, only to have the security VM appliance initiate another migration job because the VM’s presence on a particular host violates a particular security policy.
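The following Python sketch illustrates the mismatch under simple assumptions: a load balancer that picks the least-loaded host, with and without knowledge of which security zones each host may run. The host records, load figures and zone names are hypothetical, and the selection logic is a stand-in for illustration rather than a description of how DRS or any security appliance actually decides.

```python
def pick_host(hosts, vm_zone=None):
    """Return the name of the least-loaded host.

    If vm_zone is given, only hosts allowed to run that zone are considered.
    """
    candidates = [
        h for h in hosts
        if vm_zone is None or vm_zone in h["allowed_zones"]
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda h: h["load"])["name"]


# Hypothetical cluster: load is a fraction of capacity in use.
hosts = [
    {"name": "host-a", "load": 0.30, "allowed_zones": {"internal"}},
    {"name": "host-b", "load": 0.55, "allowed_zones": {"dmz", "internal"}},
    {"name": "host-c", "load": 0.70, "allowed_zones": {"dmz"}},
]

# A zone-unaware balancer sends a DMZ VM to the emptiest host...
unaware_choice = pick_host(hosts)
# ...which violates policy, so a security tool would have to move it again.
aware_choice = pick_host(hosts, vm_zone="dmz")

print("zone-unaware placement:", unaware_choice)
print("zone-aware placement:  ", aware_choice)
```

When the two tools do not share zoning information, the first placement and the corrective migration both consume resources, and the capacity on host-a that looked available to the DMZ workload was never really usable.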
SR-IOV and MR-IOV: These technologies allow physical PCIe devices to be partitioned into multiple virtual devices. For example, an SR-IOV-enabled physical NIC could be partitioned into multiple virtual NICs, each with a unique media access control address. By moving virtualization technology to the physical NIC, hypervisors can give VMs pass-through access to a physical NIC while still preserving VM mobility.
One caveat, however, is that pass-through access to SR-IOV-enabled NICs may require a driver in the VM guest. Requiring a hardware driver in the VM guest means that the same physical NIC type has to be present on all physical hosts in the VM’s associated cluster and would also be required at the data center’s disaster recovery failover site. Pass-through access provides near-native performance, which is ideal for I/O-intensive workloads or for network security or load balancer appliances attached to the hypervisor.
When mobility is the greater concern, the hypervisor’s para-virtualized interfaces should deliver acceptable performance—often within 5% of native.
MR-IOV technology remains a work in progress. Unlike SR-IOV, VMs will be able to use MR-IOV to share memory and configuration space and multiple PCIe I/O devices. With MR-IOV, administrators or orchestration tools will be able to dynamically carve up shared physical resources within a blade chassis, for example.
Keep in mind that although added virtualization layers within the I/O path improve flexibility, they also add more layers of abstraction to the data path that can complicate administrative tasks, such as application troubleshooting and security and compliance audits. As SR-IOV and MR-IOV become more mainstream, tools will likely be developed to provide the depth administrators need for application troubleshooting and data path visualization, for example by linking an application within a VM to dependent parts of the virtual infrastructure.
Converged Ethernet: As 10GbE continues to drop in price and with 10GbE adapters on the server motherboard coming soon, an increasing number of enterprises are looking at leveraging 10GbE to connect VMs to both the LAN and the SAN. Many organizations are initially looking to use converged Ethernet at the access layer—for example, the primary connection between physical servers and the network.
Beyond the access layer, LAN and SAN traffic may go their separate ways—to the Fibre Channel SAN and the Ethernet LAN. Using converged Ethernet should lead to a decrease in the number of physical ports you’ll need on the virtual infrastructure, thus lowering costs over time.
Also, 10GbE will give administrators new options for prioritizing LAN and SAN traffic. Regardless of your current I/O requirements, converged networking should be an architectural concern in coming years. Shared fabrics lead to greater economy of scale but present new challenges in quality of service, management and security.
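One simple way to think about prioritization on a shared link is as weighted minimum shares of the available bandwidth. The Python sketch below divides a 10GbE link among traffic classes in proportion to their weights. The class names and weights are assumptions chosen for illustration, and the model ignores how real converged adapters and switches schedule traffic.

```python
LINK_MBPS = 10_000  # one 10GbE link, expressed in Mbps


def guaranteed_shares(weights, link_mbps=LINK_MBPS):
    """Split link bandwidth among traffic classes in proportion to weight."""
    total = sum(weights.values())
    return {name: link_mbps * w // total for name, w in weights.items()}


# Hypothetical weights for traffic sharing one converged link.
weights = {"san": 50, "lan": 40, "live_migration": 10}
shares = guaranteed_shares(weights)

for name, mbps in shares.items():
    print(f"{name}: at least {mbps} Mbps under contention")
```

The point of such a scheme is that storage traffic keeps a floor even when LAN or live migration traffic bursts, which is the quality-of-service question that shared fabrics raise.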
VM network load balancers: VM network load balancers have been around for quite a while; Zeus Technology has offered VM network load balancers for several years. Citrix recently entered the mix with a VM version of its popular NetScaler appliance. NetScaler is a popular front-end appliance for service providers that offer application hosting.
Service providers or cloud providers that offer multi-tenant virtual infrastructures will continue to rely on front-end load balancers for the application monitoring and resiliency needed for many environments.
The emergence of the VM network load balancer allows organizations to extend load-balancing functionality deeper into the virtual infrastructure. The expected lower prices for the VM-based load balancers should give organizations the financial flexibility to use load balancers where they’ve always made sense but could not be cost justified. There are already cases of organizations using load balancers to automate tasks in response to network conditions, and their role should continue to expand in coming years.
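As a minimal illustration of the monitoring and resiliency role described above, the Python sketch below rotates requests across back-end VMs and skips any that a health check has marked as down. The class, its method names and the back-end VM names are hypothetical; commercial load balancers offer far richer algorithms and health probes than this.

```python
class RoundRobinBalancer:
    """Rotates across back ends, skipping any that are marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._next = 0  # index of the next back end to try

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each back end at most once per call.
        for _ in range(len(self.backends)):
            backend = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if backend in self.healthy:
                return backend
        return None  # nothing healthy to serve the request


lb = RoundRobinBalancer(["web-vm-1", "web-vm-2", "web-vm-3"])
lb.mark_down("web-vm-2")  # e.g. a failed health check
picks = [lb.pick() for _ in range(4)]
print(picks)
```

Traffic continues to flow to the two healthy VMs while the third is out of rotation, and calling mark_up returns it to service without any change on the client side.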
Laying the foundation for tomorrow’s technology
As you can see, a lot of new possibilities exist for interconnecting both virtual and physical networks. Currently, no one technology is the silver bullet, even though there are a number of emerging network technologies. Instead, you’ll likely use a combination of most of these technologies in coming years.
That said, it’s a good time to consider using technologies such as distributed virtual switches, security VM appliances, SR-IOV and MR-IOV, and VM network load balancers and to begin piloting new products. Much of the traditional physical network access layer, including switches, security and eventually routing functions, is moving to the virtual infrastructure.
Virtualizing network access layer devices opens up a great deal of architectural and administrative flexibility. It also improves VM mobility even more. At the same time, it creates new management and IT procedural challenges that you will have to address.
Converged Ethernet and improvements in technology for isolation, monitoring and security enforcement are laying the foundation for a virtual network infrastructure that securely supports multiple tenants or multiple security subzones. Security audit and compliance standards and interpretations will catch up to the new technology. Once they do, we’ll be at a point where we can call traditional physical network infrastructure and supporting physical appliances legacy devices.
About the Author
Chris Wolf, an analyst at Midvale, Utah-based Burton Group’s Data Center Strategies service, has more than 15 years of experience in the IT trenches and nine years of experience with enterprise virtualization technologies. Wolf provides enterprise clients with practical research and advice about server virtualization, data center consolidation, business continuity and data protection. He authored Virtualization: From the Desktop to the Enterprise, the first book published on the topic, and has published dozens of articles on advanced virtualization topics, high availability and business continuity.