When it comes to cloud computing implementation, there's a lot of talk about sharing and managing compute resources and improving application experience, but rarely is there enough discussion about what makes an optimal cloud computing network. Yet when an enterprise decides to implement cloud computing on a large scale, it is committing to a significant shift in policy, planning practices and application management -- and each of these has network impact. That impact demands a new practice that could be dubbed cloud networking.
Network planners today recognize that "enterprise networking" is really a combination of "resource networking" and "access networking." The former connects the IT elements together to create data centers, and the latter allows users to access the applications running in those data centers. The transformation to cloud computing will change both resource and access networking, and it will add a new category: federation networking, or the networking of one cloud to another.
Cloud computing performance is the sum of the performance of the network connections and the performance of the IT resources. The task of the network manager in cloud networking is to fulfill two distinct missions: to create a resource pool with servers and storage that appear as much as possible like a single virtual resource with constant performance; and to connect that resource pool to users, regardless of their location, with minimal performance variation. It's easiest to accomplish these missions by addressing the issues in a specific order.
Cloud networking within the data center: Addressing loss and latency
In a cloud computing model, a resource pool is only efficient if all of the resources in it appear equal in performance and availability. That means that the network connections that build the resource pool are the most important of all.
Nearly all clouds are established by first building "data center clouds" using local network connections and then connecting these data centers. The two specific variables likely to determine success in data center networking are the "two L's" of loss and latency. All network protocols protect against data loss by retransmitting corrupted information, and packet loss is particularly critical with storage protocols because of the risk of corrupting a file or leaving a storage device in a bad operating state. The problem is that retransmitting lost packets takes time, and latency is a special problem in data center and storage networks because it accumulates quickly across the tens of millions of operations involved.
Cloud computing network: Flat networks mean fewer interfaces along the way
Network specialists know that latency accumulates in networks largely in proportion to the number of interfaces a packet transits from source to destination, and each switch that handles packets poses a risk of loss, in addition to contributing to the total delay. The best way to reduce the "two L's" is to reduce the number of interfaces that traffic transits from source to destination. As a practical matter, that means reducing the number of switches.
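The way the "two L's" compound per hop can be sketched with a toy model. The numbers below are purely illustrative (the article gives none): each switch interface adds a fixed delay, and each hop carries a small, independent loss probability.

```python
# Toy model of how delay and loss compound across switch hops.
# Per-hop figures are hypothetical, not measurements from the article.

def path_latency_us(hops, per_hop_delay_us=10.0):
    """One-way delay grows linearly with the number of interfaces transited."""
    return hops * per_hop_delay_us

def path_delivery_prob(hops, per_hop_loss=1e-5):
    """Probability a packet survives every hop, assuming independent loss."""
    return (1.0 - per_hop_loss) ** hops

for hops in (2, 4, 8):
    print(f"{hops} hops: {path_latency_us(hops):.0f} us, "
          f"loss probability {1.0 - path_delivery_prob(hops):.1e}")
```

Halving the number of hops halves both the added delay and (approximately) the loss exposure, which is the arithmetic behind flattening the network.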
Most data center network planners understand that the best network is one that is as "flat" as possible, meaning that the network should not include many layers of devices to create connectivity. A few very large switches will provide better performance than several layers of smaller ones, but concentrating switching in a few devices also concentrates failure risk. That makes it very important for the switches to have the highest possible mean time between failures (MTBF), and for their components to be redundant and support automatic failover in operation.
If you can't go flat: Managing trunk and port connections in layered cloud networks
When multiple switch layers are required, a general rule of traffic management is to ensure that the trunk connections between or within layers are 10 times the speed of the port connections. For gigabit Ethernet ports, you'll need 10G trunks. Obviously, this kind of ratio will be impossible to achieve with extremely fast switch port connections to servers or storage, and in those cases flat topologies created by the so-called "fabric" switches (InfiniBand is an example) will perform better.
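The 10x rule above can be made concrete with a few lines of arithmetic. The port speeds below are illustrative; the point is how quickly the required trunk speed becomes impractical as server and storage ports get faster.

```python
# Sketch of the 10x trunk-to-port rule of thumb described in the text.
TRUNK_RATIO = 10  # trunks should run ~10x the port speed

def required_trunk_gbps(port_gbps, ratio=TRUNK_RATIO):
    """Trunk speed implied by the rule for a given port speed."""
    return port_gbps * ratio

for port in (1, 10, 40):  # GigE, 10GbE, 40GbE ports (illustrative)
    print(f"{port}G ports -> {required_trunk_gbps(port)}G trunks")
```

At 1G ports the rule demands 10G trunks, which is routine; at 10G or 40G server ports it demands 100G or 400G trunks, which is why flat fabric topologies win at those speeds.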
Inter-networking data centers for cloud networking
Building a cloud normally means connecting the data centers to create a seamless resource pool, though that is not always the case. These connections must be as fast as possible to be effective, and it will be absolutely critical to manage packet loss.
Storage networking protocols and other protocols that provide packet error recovery may be necessary in any case, but no protocol will compensate for high utilization on inter-data-center trunks. When utilization exceeds about 50%, both loss and delay mount and cloud performance suffers. This must be considered when managing traffic routes between the data centers in your cloud.
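The effect of trunk utilization on delay can be illustrated with a simple single-server queueing model (an M/M/1 assumption of mine; the article does not name a model), where the mean delay factor grows as 1/(1 - utilization).

```python
# M/M/1 queueing sketch: how delay grows with trunk utilization.
# The model choice is an assumption used for illustration only.

def relative_queueing_delay(utilization):
    """Mean-delay multiplier vs. an idle link: 1 / (1 - rho)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return 1.0 / (1.0 - utilization)

for rho in (0.3, 0.5, 0.7, 0.9):
    print(f"{rho:.0%} utilized -> {relative_queueing_delay(rho):.1f}x base delay")
```

Under this model, delay doubles at 50% utilization and reaches roughly 10x at 90%, which is why keeping inter-data-center trunks below about half load matters so much for cloud performance.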
Cloud networking traffic management: Start with user connections
Connecting users to the cloud is a good place to start your consideration of cloud traffic management. When users are homed to a single data center, their traffic must transit your inter-data-center trunks to reach resources sited in other data centers, and that will quickly reduce performance.
It's best to ensure that the users (the facility and branch networks) are connected directly (homed) to multiple data centers and to control cloud resource allocation so that applications serving users are run in data centers to which the users are directly linked. That will save inter-data-center trunks for data exchange among application components.
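The placement policy described above can be sketched as a small lookup: prefer a data center the requesting site is directly homed to, and fall back to any candidate (accepting inter-data-center trunk traffic) only when no direct home is available. All names and the homing table here are hypothetical.

```python
# Hypothetical sketch of the placement policy from the text: run applications
# in a data center the user site is directly linked to when possible.

# Illustrative homing table: site -> data centers with direct connections.
HOMED = {
    "branch-east": {"dc1", "dc2"},
    "branch-west": {"dc2", "dc3"},
}

def place_app(site, candidate_dcs):
    """Prefer a candidate DC the site is directly homed to; else any candidate."""
    direct = HOMED.get(site, set()) & set(candidate_dcs)
    # Only traffic placed outside the direct set consumes inter-DC trunks.
    return sorted(direct)[0] if direct else sorted(candidate_dcs)[0]

print(place_app("branch-east", ["dc2", "dc3"]))  # -> dc2 (direct link, no trunk hop)
```

A real scheduler would also weigh load and capacity, but even this simple preference keeps inter-data-center trunks reserved for component-to-component data exchange.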
Address management for virtual components in a cloud network
The types of network connections needed to support cloud computing are largely the same as those needed for traditional client/server computing -- with one exception: where resource locations are flexible, there must be a mechanism for addressing applications or components once they're assigned to a resource.
It's best to query data center networking and IT vendors for their strategies in address management for virtual components in a cloud. Solutions deployed today tend to be based on managing the Domain Name System (DNS) resolution of logical application URLs into IP addresses, or on using a form of NAT (network address translation) similar to the "elastic IP addresses" used in Amazon's EC2. Amazon's elastic IP addresses are associated with a user's account rather than with a particular instance, and they can be remapped to any instance associated with that account.
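The elastic-IP idea reduces to a remappable table: a stable public address owned by an account points at whichever instance currently hosts the application. The sketch below is my own illustration of that mechanism, not Amazon's actual API; all addresses are documentation examples.

```python
# Minimal sketch of elastic-IP-style address remapping (not Amazon's API).
# A stable public address follows the application as it moves between instances.

class ElasticIPTable:
    def __init__(self):
        self._map = {}  # public IP -> private IP of the current instance

    def associate(self, public_ip, instance_ip):
        """Point the stable public address at a (possibly new) instance."""
        self._map[public_ip] = instance_ip

    def resolve(self, public_ip):
        """Translate the public address to the current instance, NAT-style."""
        return self._map.get(public_ip)

nat = ElasticIPTable()
nat.associate("203.0.113.10", "10.0.0.5")   # app starts on one instance
nat.associate("203.0.113.10", "10.0.1.9")   # app moves; clients see no change
print(nat.resolve("203.0.113.10"))  # -> 10.0.1.9
```

The DNS-based alternative mentioned above works the same way logically, except the remapping happens at name resolution rather than at address translation.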
Connecting public and private clouds
It is almost inevitable that companies will deploy both private computing clouds and public cloud facilities. This may require hybridization of the two, creating connections to make the public and private clouds appear to be a homogeneous resource pool. That can be done either by making both clouds a part of a common VPN or by employing a form of federation networking using a cloud management and interconnection standard. Unfortunately, there is no solid standard for federation at this point, so it will be necessary to check with cloud providers and with your internal network and IT vendors to ensure that you have a compatible option for interconnection.
About the author: Tom Nolle is president of CIMI Corporation, a strategic consulting firm specializing in telecommunications and data communications since 1982. He is the publisher of Netwatcher, a journal addressing advanced telecommunications strategy issues. Check out his SearchTelecom.com networking blog Uncommon Wisdom.