Converged Enhanced Ethernet: New protocols enhance data center Ethernet

10 Gigabit Ethernet performance in the data center is improving. Network traffic and packet loss issues are being solved with new, more efficient protocols.

Price, performance and flexibility have made 10 Gigabit Ethernet (10 GbE) an attractive choice for the data center. While 10 GbE has made inroads, the lack of features in existing Ethernet protocols limits its further penetration.

The critical issue with Ethernet is that it does not guarantee that packets will not be lost when a switch or end node is momentarily overwhelmed by incoming packets. The IEEE and Internet Engineering Task Force (IETF) are currently at work developing protocols that will improve network efficiency and eliminate situations in which packets are lost. Their work is considered critical to ensuring the performance of Fibre Channel over Ethernet (FCoE) and Internet SCSI (iSCSI).

Work is under way to address:

  • Traffic prioritization
  • Congestion control
  • Improved route selection

The set of protocols designed to address these issues has been named Converged Enhanced Ethernet, or Lossless Ethernet.

Traffic prioritization and control

A major advantage of 10 GbE over competing technologies is that separate networks for storage (the storage area network, or SAN), server-to-server communication and the LAN can be replaced with a single 10 GbE network. While 10 GbE links may have sufficient bandwidth to carry all three types of data, bursts of traffic can overwhelm a switch or endpoint.

SAN performance is extremely sensitive to delay. Slowing down access to storage has an impact on server and application performance. Server-to-server traffic also suffers from delays, while LAN traffic is less sensitive. There must be a mechanism to allocate priority to critical traffic while lower-priority data waits until the link is available.

Existing Ethernet protocols do not provide the controls needed. A receiving node can send an 802.3x PAUSE frame to stop the flow of packets, but PAUSE stops all traffic on the link regardless of priority.

802.1p was developed in the 1990s to provide a method to classify packets into one of eight priority levels. However, it did not include a mechanism to pause individual levels. The IEEE is now developing 802.1Qbb Priority-based Flow Control (PFC) to provide a way to stop the flow of low-priority packets while permitting high-priority data to flow.
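The difference between a link-wide PAUSE and a per-priority pause can be sketched in a few lines of Python. This is a toy model, not the 802.1Qbb frame format: the class name PriorityLink, its methods, and the strict highest-number-first scheduling are all assumptions made for illustration.

```python
from collections import deque

NUM_PRIORITIES = 8  # 802.1p classifies packets into eight priority levels


class PriorityLink:
    """Hypothetical sender with one queue and one pause flag per priority."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIORITIES)]
        self.paused = [False] * NUM_PRIORITIES

    def enqueue(self, priority, frame):
        self.queues[priority].append(frame)

    def pause(self, priority):
        # PFC-style: the receiver stops a single priority level.
        self.paused[priority] = True

    def resume(self, priority):
        self.paused[priority] = False

    def pause_all(self):
        # 802.3x-style PAUSE: every priority level stops at once.
        self.paused = [True] * NUM_PRIORITIES

    def transmit_next(self):
        # Assumption: higher number means higher priority, served strictly first.
        for priority in range(NUM_PRIORITIES - 1, -1, -1):
            if not self.paused[priority] and self.queues[priority]:
                return priority, self.queues[priority].popleft()
        return None  # nothing eligible to send


link = PriorityLink()
link.enqueue(1, "lan frame")
link.enqueue(5, "storage frame")
link.pause(1)                # only priority 1 is held back
print(link.transmit_next())  # the storage frame still flows
print(link.transmit_next())  # nothing else is eligible while priority 1 is paused
```

With pause_all() in place of pause(1), the storage frame would be held back as well, which is the behavior the per-priority mechanism is meant to avoid.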

A bandwidth allocation mechanism is also required. 802.1Qaz Enhanced Transmission Selection (ETS) provides a way to group one or more 802.1p priorities into a priority group. All of the priority levels within a group should require the same level of service. Each priority group is then assigned a percentage allocation of the link. One special priority group is never limited and can override all other allocations and consume the entire bandwidth of the link. During periods when high-priority groups are not using their allocated bandwidth, lower-priority groups are allowed to use the available bandwidth.
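The allocation arithmetic can be illustrated with a short Python sketch. The function name ets_share, the group names, and the percentages are hypothetical. The rule used here for handing out spare bandwidth (in proportion to each group's percentage) is an assumption of the sketch, and the special unlimited priority group is left out.

```python
def ets_share(link_gbps, percent_by_group, demand_gbps):
    """Return the bandwidth (Gbps) each priority group gets on one link."""
    guaranteed = {g: link_gbps * pct / 100.0 for g, pct in percent_by_group.items()}
    # Step 1: each group gets what it asks for, up to its guaranteed share.
    granted = {g: float(min(demand_gbps.get(g, 0.0), guaranteed[g])) for g in guaranteed}
    spare = link_gbps - sum(granted.values())
    # Step 2: bandwidth left unused is offered to groups that still want more.
    while spare > 1e-9:
        hungry = [g for g in granted if demand_gbps.get(g, 0.0) - granted[g] > 1e-9]
        weight = sum(percent_by_group[g] for g in hungry)
        if not hungry or weight == 0:
            break
        handed_out = 0.0
        for g in hungry:
            offer = spare * percent_by_group[g] / weight
            extra = min(offer, demand_gbps[g] - granted[g])
            granted[g] += extra
            handed_out += extra
        spare -= handed_out
    return granted


shares = ets_share(
    link_gbps=10,
    percent_by_group={"san": 50, "cluster": 30, "lan": 20},
    demand_gbps={"san": 2, "cluster": 3, "lan": 8},
)
print(shares)
```

In this example the SAN group uses only 2 Gbps of its 5 Gbps allocation, so the LAN group, which is guaranteed 2 Gbps, can borrow the idle 3 Gbps and send at 5 Gbps. If the SAN group's demand rises again, its share returns to the guaranteed level.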

Congestion control

802.1Qbb and 802.1Qaz by themselves don't solve the packet loss problem. They can pause low-priority traffic on a link, but they don't prevent congestion when a switch or end node is being overwhelmed by high-priority packets from two or more links. There must be a way for receiving nodes to notify sending nodes to slow their rate of transmission.

IEEE 802.1Qau (Congestion Notification) provides such a mechanism. When a receiving node detects that it is nearing the point where it will begin discarding incoming packets, it sends a message to all nodes currently sending to it. Sending nodes then slow their transmission rate. When the congestion has cleared, the receiving node sends a message informing senders to resume their full rate.
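The slow-down and resume cycle can be sketched as a small simulation. This is a simplified model, not the 802.1Qau algorithm: the Sender class, the simulate function, the halving of the rate, and the buffer watermarks are all assumptions chosen to keep the arithmetic easy to follow.

```python
class Sender:
    """Hypothetical sending node with a full rate and a current rate."""

    def __init__(self, full_rate):
        self.full_rate = full_rate
        self.rate = full_rate

    def slow_down(self):
        self.rate = self.full_rate / 2  # assumed reaction: halve the rate

    def resume(self):
        self.rate = self.full_rate


def simulate(ticks, notify, full_rate=6, drain=10, capacity=20,
             high_mark=8, low_mark=2):
    """Return how many packets the receiver drops over the given ticks."""
    senders = [Sender(full_rate), Sender(full_rate)]
    depth = 0        # packets waiting in the receiver's buffer
    dropped = 0
    congested = False
    for _ in range(ticks):
        depth += sum(s.rate for s in senders)
        if depth > capacity:           # buffer overflow: excess is discarded
            dropped += depth - capacity
            depth = capacity
        depth = max(0, depth - drain)  # receiver processes what it can
        if notify:
            if not congested and depth >= high_mark:
                congested = True       # nearing overflow: tell senders to slow
                for s in senders:
                    s.slow_down()
            elif congested and depth <= low_mark:
                congested = False      # congestion cleared: resume full rate
                for s in senders:
                    s.resume()
    return dropped


print("dropped without notification:", simulate(20, notify=False))
print("dropped with notification:", simulate(20, notify=True))
```

Two senders at 6 packets per tick slightly exceed the receiver's drain rate of 10, so without notification the buffer eventually overflows. With notification the buffer oscillates between the two watermarks and, with these particular numbers, never overflows.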

Improved route selection

The Spanning Tree Protocol (STP) was developed early in Ethernet history. It specifies a procedure to eliminate forwarding loops without requiring manual configuration. Switches communicate with one another to select a root node. Then each node determines its least costly route to the root. If a switch is added or removed or a link fails, the remaining switches communicate to determine a new root and new paths to it.
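The outcome of this procedure can be sketched centrally in Python, even though real switches reach it by exchanging messages. The function names, switch names, and link costs below are hypothetical; choosing the lowest switch identifier as root follows the usual spanning tree rule, and the least-cost paths are found with Dijkstra's algorithm as a stand-in for the distributed calculation.

```python
import heapq


def spanning_tree(links):
    """links maps (switch_a, switch_b) -> cost. Returns (root, parent, cost_to_root)."""
    neighbors = {}
    for (a, b), cost in links.items():
        neighbors.setdefault(a, []).append((b, cost))
        neighbors.setdefault(b, []).append((a, cost))
    root = min(neighbors)           # lowest identifier wins the election
    parent, cost_to_root = {}, {}
    heap = [(0, root, None)]        # (cost so far, switch, next hop toward root)
    while heap:
        cost, node, via = heapq.heappop(heap)
        if node in parent:
            continue                # already settled at a lower cost
        parent[node] = via
        cost_to_root[node] = cost
        for nxt, link_cost in neighbors[node]:
            if nxt not in parent:
                heapq.heappush(heap, (cost + link_cost, nxt, node))
    return root, parent, cost_to_root


def blocked_links(links, parent):
    """Links that are not on any switch's path to the root carry no traffic."""
    return sorted((a, b) for a, b in links
                  if parent.get(a) != b and parent.get(b) != a)


triangle = {("S1", "S2"): 1, ("S1", "S3"): 1, ("S2", "S3"): 1}
root, parent, cost_to_root = spanning_tree(triangle)
print("root:", root)
print("blocked:", blocked_links(triangle, parent))
```

In this three-switch triangle the direct S2 to S3 link is blocked, so traffic between those two switches has to travel through the root S1.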

Spanning tree has worked well, but it has limitations. Traffic between nodes can flow through the root even when there is a more direct node-to-node route. There is no way to spread traffic over multiple equal-cost routes. Finally, the process of determining a new root and paths to it can be slow. Network traffic stops while the process takes place.

The spanning tree standard has been enhanced to provide separate sets of routes per virtual LAN and within sections of the network, but it still does not necessarily select optimal routes or take advantage of multiple links.

The limited processing power and memory in early switches dictated that the calculations required to determine the root node and routes had to be relatively simple. Processors and memory in current switches enable more complex route selection protocols. The IEEE and IETF are developing, respectively, IEEE 802.1aq (Shortest Path Bridging) and Transparent Interconnection of Lots of Links (TRILL). The goal of these efforts is to use a link-state routing protocol to determine the most efficient routes through the network, react very quickly to network changes, and take advantage of multiple routes by spreading traffic across them.
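The idea of computing shortest routes from a full map of the network and spreading traffic over equal-cost alternatives can be sketched as follows. This does not model TRILL or 802.1aq themselves: the function names, the four-switch topology, and the hash-based choice of path per flow are assumptions for illustration.

```python
import heapq
import zlib


def equal_cost_paths(links, src, dst):
    """Return every least-cost path from src to dst. links maps (a, b) -> cost."""
    neighbors = {}
    for (a, b), cost in links.items():
        neighbors.setdefault(a, []).append((b, cost))
        neighbors.setdefault(b, []).append((a, cost))
    dist = {src: 0}
    preds = {src: []}               # all previous hops that tie for the best cost
    heap = [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue                # stale heap entry
        for nxt, cost in neighbors.get(node, []):
            nd = d + cost
            if nxt not in dist or nd < dist[nxt]:
                dist[nxt] = nd
                preds[nxt] = [node]
                heapq.heappush(heap, (nd, nxt))
            elif nd == dist[nxt]:
                preds[nxt].append(node)
    if dst not in dist:
        return []

    def walk(node):
        if node == src:
            return [[src]]
        return [path + [node] for pred in preds[node] for path in walk(pred)]

    return sorted(walk(dst))


def pick_path(paths, flow_id):
    """Assumed policy: hash the flow so each flow stays on one path."""
    return paths[zlib.crc32(flow_id.encode()) % len(paths)]


square = {("A", "B"): 1, ("A", "C"): 1, ("B", "D"): 1, ("C", "D"): 1}
paths = equal_cost_paths(square, "A", "D")
print(paths)
for flow in ("server1->array1", "server2->array1", "server3->array2"):
    print(flow, "uses", pick_path(paths, flow))
```

A spanning tree would block one side of this square. Here both A to D routes remain usable, and if a link fails the paths can be recomputed from the remaining links without electing a new root.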

About the author:
David B. Jacobs of The Jacobs Group has more than 20 years of networking industry experience. He has managed leading-edge software development projects and consulted to Fortune 500 companies as well as software startups.

This was last published in April 2009