Talk with a virtualization or cloud visionary and you hear the same lament: “The data center network is in my way.” Set aside for a moment that the main problem with networking infrastructure is the arcane architecture of kludges that have been heaped upon one another for the last 20 years. Even looking to the future, there are still no clear answers for the perfect virtualization networking solution.
Ideally, a virtualization networking solution should involve a Layer 2/Layer 3 data center network fabric that provides any-to-any, non-blocking connectivity to support virtual networks and services. Juniper Networks' QFabric, for example, has promised exactly this, as have fabrics from a few other vendors, including Avaya, Force10 Networks and Brocade. But most vendors are still light on detail, and it remains to be seen how closely these fabrics will be integrated with virtualization infrastructure.
Even in the best scenario, data center network fabrics have issues
Assuming that you can build an ideal Layer 2/Layer 3 fabric in your data center, you’d still be facing three fundamental challenges:
1. Server (or virtual machine) IP addresses cannot be changed during operation because of limitations of the TCP/IP protocol stack (more specifically, the lack of a session layer) and a broken socket API that failed to make DNS an integral part of the protocol stack.
2. The only solution offered by virtualization vendors that allows us to move a live virtual machine around the data center with no session loss involves bridging (sometimes known as Layer 2 switching), which does not scale due to inherent limitations like broadcast/multicast/unknown-unicast flooding. We’re slowly getting rid of Spanning Tree Protocol and its delays with technologies like Transparent Interconnect of Lots of Links (TRILL), Shortest Path Bridging, FabricPath or VCS, but flooding still remains a problem even with these technologies.
3. Network-layer services (firewalling and load balancing) inserted in front of the virtual servers force the traffic to flow through fixed intermediate points (physical or virtualized networking appliances), resulting in traffic trombones.
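The flooding behavior behind the second challenge can be illustrated with a toy learning bridge. This is a minimal sketch, not any vendor's implementation: the `LearningBridge` class and the port and MAC names are made up for illustration, and multicast handling, VLANs and table aging are left out.

```python
from __future__ import annotations

from dataclasses import dataclass, field

BROADCAST = "ff:ff:ff:ff:ff:ff"


@dataclass
class LearningBridge:
    """Toy transparent bridge: learns source MACs, floods unknown destinations."""

    ports: list[str]
    mac_table: dict[str, str] = field(default_factory=dict)

    def forward(self, in_port: str, src_mac: str, dst_mac: str) -> list[str]:
        """Return the ports the frame is sent out of."""
        # Learn (or refresh) which port the sender is reachable through.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Broadcast or unknown unicast: flood to every port except ingress.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []  # destination sits on the ingress segment; drop the frame
        return [out_port]


bridge = LearningBridge(ports=["p1", "p2", "p3", "p4"])
first = bridge.forward("p1", "aa:aa", "bb:bb")   # bb:bb unknown, so flooded
reply = bridge.forward("p2", "bb:bb", "aa:aa")   # aa:aa already learned
second = bridge.forward("p1", "aa:aa", "bb:bb")  # bb:bb learned from the reply
```

Every frame to an unknown or broadcast destination reaches every port in the bridged domain, so the flooding load grows with the size of the domain regardless of how loops are prevented.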
The solutions to these fundamental challenges cannot be implemented purely in the networking fabric. The virtual switches embedded in modern hypervisors must clearly be involved. Unfortunately, so far only VMware and Cisco have delivered an architecture that directly integrates hypervisor-based switches with other networking devices (other vendors use vCenter hooks to modify their device configurations).
VMware's virtualization networking approach
VMware offers two virtualization networking solutions on top of its Layer 2 vSwitch: the security-focused DVfilter API, which allows security appliances to intercept traffic between individual virtual machines and the rest of the network; and vCloud Director Networking Infrastructure (vCDNI), which uses a proprietary MAC-in-MAC encapsulation scheme to isolate virtual networks. Neither of these solutions solves the scalability issue. Both use bridging as the underlying network transport technology, and vCDNI exacerbates the tromboning issues by forwarding inter-VLAN traffic through vShield Edge appliances.
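The general idea of isolating virtual networks with MAC-in-MAC encapsulation can be sketched as follows. The vCDNI frame format is proprietary, so the header layout here is purely hypothetical: the 4-byte network ID, the use of the IEEE local experimental EtherType 0x88B5, and the function names are all assumptions made for illustration.

```python
import struct
from typing import Optional

EXPERIMENTAL_ETHERTYPE = 0x88B5  # IEEE 802 local experimental EtherType
# Hypothetical outer header: dst MAC, src MAC, EtherType, 32-bit network ID.
OUTER_HEADER = struct.Struct("!6s6sHI")


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into its 6-byte form."""
    return bytes.fromhex(mac.replace(":", ""))


def encapsulate(inner_frame: bytes, host_src: str, host_dst: str,
                network_id: int) -> bytes:
    """Wrap a tenant frame in an outer frame addressed between hypervisor hosts."""
    header = OUTER_HEADER.pack(mac_to_bytes(host_dst), mac_to_bytes(host_src),
                               EXPERIMENTAL_ETHERTYPE, network_id)
    return header + inner_frame


def decapsulate(outer_frame: bytes, expected_network_id: int) -> Optional[bytes]:
    """Return the tenant frame, or None if it belongs to another virtual network."""
    _dst, _src, ethertype, network_id = OUTER_HEADER.unpack_from(outer_frame)
    if ethertype != EXPERIMENTAL_ETHERTYPE or network_id != expected_network_id:
        return None
    return outer_frame[OUTER_HEADER.size:]


inner = bytes(12) + b"\x08\x00" + b"tenant payload"
outer = encapsulate(inner, "02:00:00:00:00:01", "02:00:00:00:00:02", network_id=42)
same_network = decapsulate(outer, expected_network_id=42)
other_network = decapsulate(outer, expected_network_id=7)
```

In this sketch the physical network sees only the outer, host-to-host MAC addresses, but it still bridges those outer frames, which is why encapsulation of this kind isolates tenants without removing the underlying Layer 2 scaling limits.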
Cisco's virtualization networking approach
Cisco has a completely different architecture. Its implementation of the Virtual Ethernet Module (VEM) contains the vPath API, which allows engineers to intercept and redirect network traffic as needed. This functionality is used in the Virtual Security Gateway (VSG), a firewall appliance that can be inserted in the forwarding path on an as-needed basis. The initial packets of a session pass through VSG, and the rest of the traffic within the same session is forwarded by the VEM, significantly reducing the tromboning effects.
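The offload pattern described here, where a policy appliance sees only the start of a session and the switch handles the rest, can be sketched with a simple flow cache. This is an illustration of the general technique, not Cisco's vPath API: `OffloadingSwitch`, `FlowKey` and the sample policy are hypothetical names, and real implementations also expire cache entries and track connection state.

```python
from typing import Callable, NamedTuple


class FlowKey(NamedTuple):
    src_ip: str
    dst_ip: str
    protocol: str
    src_port: int
    dst_port: int


class OffloadingSwitch:
    """Toy hypervisor switch that caches per-flow verdicts from a policy engine."""

    def __init__(self, policy_engine: Callable[[FlowKey], bool]) -> None:
        self.policy_engine = policy_engine
        self.flow_cache: dict[FlowKey, bool] = {}
        self.redirected = 0  # packets that had to detour via the appliance

    def handle_packet(self, key: FlowKey) -> bool:
        """Return True if the packet is forwarded, False if it is dropped."""
        verdict = self.flow_cache.get(key)
        if verdict is None:
            # First packet of the flow: redirect to the firewall for a decision.
            self.redirected += 1
            verdict = self.policy_engine(key)
            self.flow_cache[key] = verdict
        # Later packets of the flow are decided locally from the cache.
        return verdict


def allow_web_only(key: FlowKey) -> bool:
    """Sample policy standing in for the firewall appliance."""
    return key.protocol == "tcp" and key.dst_port in (80, 443)


switch = OffloadingSwitch(allow_web_only)
web = FlowKey("10.0.0.5", "10.0.1.9", "tcp", 49152, 443)
results = [switch.handle_packet(web) for _ in range(1000)]
```

Of the 1,000 packets in the example flow, only the first detours through the policy engine; the other 999 take the direct path, which is the source of the reduced tromboning.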
Cisco is also focusing on Locator/Identifier Separation Protocol (LISP) in its Nexus 1000v. LISP enables virtual machine mobility across routed Layer 3 (IP) infrastructure, eliminating the need for large-scale bridged domains and finally delivering a scalable virtual machine mobility solution.
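The core idea of locator/identifier separation is that a virtual machine keeps its address (the endpoint identifier, or EID) while the address of the site or router it sits behind (the routing locator, or RLOC) changes when it moves. The sketch below shows only that idea under simplifying assumptions: `MappingSystem` and `lisp_encapsulate` are illustrative names, the mapping is a plain dictionary, and the map-request/map-reply exchange and caching of real LISP are omitted.

```python
class MappingSystem:
    """Toy EID-to-RLOC database standing in for the LISP mapping system."""

    def __init__(self) -> None:
        self._eid_to_rloc: dict[str, str] = {}

    def register(self, eid: str, rloc: str) -> None:
        """Record (or update) the locator behind which an EID is reachable."""
        self._eid_to_rloc[eid] = rloc

    def lookup(self, eid: str) -> str:
        return self._eid_to_rloc[eid]


def lisp_encapsulate(mapping: MappingSystem, src_rloc: str, dst_eid: str,
                     payload: bytes) -> dict:
    """Build a simplified tunneled packet: outer header uses RLOCs, inner uses the EID."""
    return {
        "outer_src": src_rloc,
        "outer_dst": mapping.lookup(dst_eid),
        "inner_dst": dst_eid,
        "payload": payload,
    }


mapping = MappingSystem()
vm_eid = "10.1.1.10"

mapping.register(vm_eid, "192.0.2.1")        # VM runs in the first data center
before = lisp_encapsulate(mapping, "203.0.113.1", vm_eid, b"hello")

mapping.register(vm_eid, "198.51.100.1")     # VM moves; only the locator changes
after = lisp_encapsulate(mapping, "203.0.113.1", vm_eid, b"hello")
```

The inner destination is identical before and after the move, so sessions bound to the VM's IP address survive, while the routed core only ever forwards on the outer locator addresses.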
But Cisco’s architecture is still sketchy. Scalability of LISP has yet to be proven in large-scale data centers, and there might be integration issues between VSG and LISP. Furthermore, Cisco’s competitors claim that their data center fabrics deliver significantly better performance for a much lower price.
The multivendor approach to virtualization networking
Fortunately for the data center architects who believe in best-of-breed, multivendor networks, it's possible to combine VMware’s hypervisor with Cisco’s virtualization networking strategy and its Unified Computing System (UCS) blade server solutions. All of this can be integrated with physical networking infrastructure from Cisco or most other vendors. This is not necessarily the best of options, but the door is wide open for exploration.
This was first published in June 2011