Ethernet has been the LAN technology of choice for decades, so it's not surprising that the role of Ethernet in data center networks has been growing. Pressures to cut costs, consolidate data center resources, and support new software models like SOA have combined to create a radically new set of demands on data center Ethernet.
There is little doubt that Ethernet will ultimately meet these demands in data center networks, but there's also little doubt that both the evolution of Ethernet technology and the evolution from the older Ethernet LAN models to new models will challenge data center network plans for three years – or throughout the current refresh cycle. As this transition occurs, data center and networking teams must learn ways of optimizing Ethernet as it stands. That will mean addressing latency and packet loss through a series of strategies that range from isolating storage networks to choosing the right switches.
Data center Ethernet: Beyond basic connectivity
Data center networks have evolved from their original mission of providing connectivity between central computing systems and users. There are now three distinct missions, each with its own traffic and QoS requirements to be considered:
- Traditional client/server or "front end" traffic, which is relatively insensitive to both latency and packet loss.
- "Horizontal" or intra-application traffic, generated in large part by the trend toward componentized software and service-oriented architecture (SOA). This traffic requires low latency but is still somewhat tolerant of packet loss.
- Storage traffic, created by the migration of storage protocols onto IP and Ethernet (iSCSI and FCoE). Storage traffic is sensitive to latency, but it is even more sensitive to packet loss.
That the different traffic types have different data center network QoS requirements is a challenge, but that they are also highly interdependent is perhaps a greater one. Transactions generated by the user in client/server communications activate processes that must then be linked horizontally, and these processes often access storage arrays. The performance of the application is then dependent on all of these flows in combination and on whether traffic from one competes with the others.
Addressing QoS in data center networks: Isolating storage networks
If interdependency is an issue, then one obvious solution is to compartmentalize the traffic on independent LANs. This may seem to waste the economies of scale that Ethernet offers in the data center, but for some users it is completely practical to isolate storage networks, at least while migrating them to Ethernet connections. Large data centers may achieve most of the economy-of-scale benefits even if their LANs are physically partitioned. Since storage traffic is by far the most sensitive to network performance, isolating it can substantially reduce the difficulties of managing QoS for the three traffic classes.
It's important to note that VLAN partitioning will not normally achieve traffic isolation, because on most switching products access to the inter-switch trunks and queue management do not reflect VLAN partitions. A series of standards designed to improve enterprise control over traffic classes in Ethernet networks is currently under review by the IEEE and IETF, but the standards and their implementation continue to evolve. They fall under the Data Center Bridging (DCB) umbrella and include Priority Flow Control (PFC), which provides lossless service for selected traffic classes, and Enhanced Transmission Selection (ETS). Planners should review their vendors' support of each of these standards when laying out their data centers.
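To make the idea of per-class control concrete, here is a minimal sketch of guaranteed-share bandwidth allocation in the spirit of ETS: each traffic class gets a minimum percentage of a trunk, and bandwidth a class does not use is offered to the others. The function name, class names, percentages, and loads are all assumptions chosen for illustration. This is not the scheduler defined by the standard, nor any vendor's implementation.

```python
# Illustrative sketch only: the class names, percentages, and loads below are
# assumptions, and this is not the scheduler defined by the ETS standard.

def share_bandwidth(link_gbps, guaranteed_pct, offered_gbps, eps=1e-9):
    """Grant each class up to its guaranteed share, then hand any unused
    bandwidth to classes that still want more, weighted by their guarantees."""
    if sum(guaranteed_pct.values()) > 100:
        raise ValueError("guaranteed percentages must not exceed 100")
    granted = {
        name: min(offered_gbps[name], link_gbps * pct / 100.0)
        for name, pct in guaranteed_pct.items()
    }
    spare = link_gbps - sum(granted.values())
    while spare > eps:
        hungry = [n for n in granted if offered_gbps[n] - granted[n] > eps]
        if not hungry:
            break
        total_weight = sum(guaranteed_pct[n] for n in hungry)
        handed_out = 0.0
        for name in hungry:
            # Fall back to equal weights if every hungry class has a 0% guarantee.
            weight = (guaranteed_pct[name] / total_weight
                      if total_weight > 0 else 1.0 / len(hungry))
            extra = min(offered_gbps[name] - granted[name], spare * weight)
            granted[name] += extra
            handed_out += extra
        spare -= handed_out
    return granted


# Hypothetical 10 Gbps trunk shared by the three traffic types.
demo = share_bandwidth(
    10.0,
    {"storage": 50, "horizontal": 30, "front_end": 20},
    {"storage": 3.0, "horizontal": 6.0, "front_end": 4.0},
)
for name, gbps in demo.items():
    print(f"{name}: {gbps:.2f} Gbps")
```

In this made-up case storage needs only 3 Gbps of its 5 Gbps guarantee, so the other two classes divide the 2 Gbps it leaves unused in proportion to their own guarantees. A plain VLAN split offers no comparable guarantee, because all VLANs still compete in the same trunk queues.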
Steps to address latency and packet loss for data center networks
Even without standards enhancement, there are steps that can be taken to improve latency and packet loss. They include:
- Using the highest Ethernet speeds that can be economically justified. Faster links and inter-switch trunks shorten the time needed to transmit each frame, which lowers latency. The lower link utilization that results from faster connections also reduces the risk of packet loss. Where it's not practical to upgrade all the Ethernet connections, at least ensure that storage traffic is carried on the fastest connections available.
- Using large switches instead of layers of smaller ones. This process, often called "flattening" the network, will reduce the number of switches a given traffic flow transits between source and destination. That reduces latency and the risk of packet loss. It also reduces the jitter or variability in both parameters and so makes application performance more predictable.
- Looking for cut-through switches, which accelerate the switching process compared with the store-and-forward approach. Store-and-forward switches wait for the entire data frame to be received before forwarding it, whereas cut-through switches examine the frame header first and begin queuing the frame for output while it is still arriving on the input interface. This reduces switch latency.
- Taking special care with the interface cards used on servers and for storage systems. Some are designed to offload considerable processing from the host, and these will offer better and more deterministic performance.
- Routing the most critical traffic through a minimal number of switches. That means servers and storage arrays that are normally used together should be on the same switch or, at worst, on directly connected switches, not on a path that transits an intermediate device (some call this "routing traffic on the same switch layer" rather than up or down in the hierarchy of switches).
- Setting switch parameters on storage ports to optimize for lowest packet loss even at the expense of delay. Packet loss can be a disaster to storage-over-Ethernet protocols.
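The effect of the first three steps can be estimated with simple arithmetic. The sketch below is a rough model, not a measurement. It assumes every link on the path runs at the same speed, ignores queuing, propagation, and switch processing time, and assumes a cut-through switch must receive 64 bytes before it starts forwarding. The function names and that 64-byte figure are hypothetical values chosen for illustration.

```python
# Rough latency model; all figures are assumptions, not measurements.

def serialization_delay_us(frame_bytes: int, link_gbps: float) -> float:
    """Microseconds needed to clock frame_bytes onto a link of link_gbps."""
    return frame_bytes * 8 / (link_gbps * 1000.0)


def path_latency_us(frame_bytes: int, link_gbps: float, switches: int,
                    cut_through: bool, header_bytes: int = 64) -> float:
    """Serialization-only latency from sender to receiver.

    The sender always serializes the whole frame once. A store-and-forward
    switch adds another full-frame delay; a cut-through switch adds only the
    time to receive the (assumed) header_bytes it inspects before forwarding.
    """
    per_switch_bytes = min(header_bytes, frame_bytes) if cut_through else frame_bytes
    first_link = serialization_delay_us(frame_bytes, link_gbps)
    return first_link + switches * serialization_delay_us(per_switch_bytes, link_gbps)


# Compare hypothetical designs for a 1,500-byte frame.
for gbps in (1.0, 10.0):
    for hops in (3, 1):
        for cut in (False, True):
            mode = "cut-through" if cut else "store-and-forward"
            latency = path_latency_us(1500, gbps, hops, cut)
            print(f"{gbps:>4} Gbps, {hops} switch(es), {mode}: {latency:.2f} us")
```

Under these assumptions, a 1,500-byte frame crossing three store-and-forward switches at 1 Gbps accumulates 48 microseconds of serialization delay. Flattening the path to a single switch halves that to 24 microseconds, and moving to 10 Gbps links cuts every figure tenfold. Real paths add queuing delay on top, which is where utilization and packet loss come into play.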
Separating user traffic for data center network QoS
A truly optimum data center network is not something that can evolve out of a legacy structure of interconnected headquarters Ethernet switches. In fact, the integration of the "front end" or application access portion of the data center network with the back-end storage/horizontal portion is not likely to provide optimum performance.
User traffic could be carried on facilities completely separate from those that carry storage or horizontal traffic. If you plan to integrate these traffic types, you will be forcing your network to carry traffic that is delay- and loss-insensitive along with traffic that is sensitive. That means you either have to overbuild or wait for standards that allow the traffic types to be handled differently.
At a minimum, it is important to ensure that traffic from interior applications like storage and SOA networking is not mingled with user access traffic. This will also help secure your data center assets. You must also be sure that control protocols like spanning tree used in the front-end LAN are not extended to the data center, where they can interfere with traffic management and affect the recovery of the network from faults.
Both products and standards for the data center network are evolving rapidly as applications like virtualization and cloud computing emerge. Planners should review their vendors' directions in data center switching and plan near-term network changes with the future architectures in mind.
About the author: Tom Nolle is president of Cimi Corporation, a strategic consulting firm specializing in telecommunications and data communications since 1982. He is the publisher of Netwatcher, a journal addressing advanced telecommunications strategy issues. Check out his SearchTelecom.com networking blog Uncommon Wisdom.