To support its high-performance computing (HPC) requirements, a research institute built a new 100 Gigabit Ethernet (GbE) backbone with Brocade.
Researchers at the Howard Hughes Medical Institute
“We have a lot of data moving from acquisition to storage, from storage to our high-performance computing environment or from storage to visualization, rendering and evaluation,” he said. “Right now everyone is producing a large amount of data, and moving that data from the point of acquisition to storage and to other places is a challenge because the network can be a bottleneck.”
Cicerchia worried that his legacy 10 GbE network based on Force10 Networks’ infrastructure was a potential bottleneck given the ever-increasing amount of data that HHMI’s researchers were pushing across the wire. Last winter, as HHMI was approaching its five-year network refresh cycle, Cicerchia decided to upgrade to the fastest fabric available.
“We decided to undertake a network design that would give us very, very high throughput and low latency. We were coming from cores that had multiple 10 gigabit links bundled together -- anywhere from 20 to 40 gigabits -- so we decided to not even look at 40 Gigabit [Ethernet] technology and go straight to 100 Gigabit [Ethernet].”
A pair of 100 Gigabit Ethernet links from the core to every wiring closet
In the core of HHMI’s data center, Cicerchia has installed two Brocade MLXe-32 chassis paired with Brocade’s Multi-Chassis Trunking (MCT) feature. MCT is a network virtualization technology that allows customers to operate two switches as one virtual device. Each chassis has 32 slots for Ethernet blades. HHMI has populated half of the slots on the MLXe-32 switches with 100 GbE ports for a total of 32 100 GbE ports across the two chassis. The rest of the slots are populated with Gigabit and 10 GbE ports.
These two MLXe-32 chassis aggregate traffic from 16 MLXe-16 chassis that Cicerchia has installed in his wiring closets. Each MLXe-16 has a pair of 100 GbE uplinks that connect into HHMI’s core switches. Cicerchia said both uplinks in each pair are active because Brocade’s MCT feature allows him to remove Spanning Tree Protocol from his network.
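The port counts quoted above can be cross-checked with a little arithmetic. The Python sketch below is illustrative only: it assumes each closet switch sends one of its two 100 GbE uplinks to each core chassis, which the article does not state outright, and the function name core_port_plan is hypothetical.

```python
# Illustrative port-count check for the 100 GbE backbone described above.
# Figures marked "article" come from the text; the even split across cores
# is an assumption, not a confirmed detail of HHMI's cabling.

CORE_CHASSIS = 2         # MLXe-32 core switches (article)
CLOSET_CHASSIS = 16      # MLXe-16 wiring-closet switches (article)
UPLINKS_PER_CLOSET = 2   # pair of 100 GbE uplinks per closet switch (article)
UPLINK_GBPS = 100        # speed of each uplink (article)


def core_port_plan(closets=CLOSET_CHASSIS,
                   uplinks=UPLINKS_PER_CLOSET,
                   cores=CORE_CHASSIS,
                   gbps=UPLINK_GBPS):
    """Return the number of 100 GbE core ports the closet uplinks consume."""
    total_ports = closets * uplinks
    # Assumption: uplinks are spread evenly over the core chassis.
    ports_per_core = total_ports // cores
    return {
        "total_core_ports": total_ports,
        "ports_per_core": ports_per_core,
        "aggregate_gbps": total_ports * gbps,
    }


if __name__ == "__main__":
    print(core_port_plan())
```

Under these assumptions the closet uplinks account for 32 core-facing 100 GbE ports, 16 per chassis, which is consistent with the 32 ports the article reports across the two MLXe-32 switches.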
To ensure high performance on his 100 GbE network, Cicerchia has deployed a separate campus network solely to support Voice over IP (VoIP) and video conferencing and to provide the wired backbone for his wireless LAN. The second network, which also aggregates to HHMI’s core MLXe-32 switches, includes multiple stackable Brocade FCX switches with Power over Ethernet (PoE) for laptops and wireless access points. Each stack of FCX switches uplinks to the core via a pair of 10 GbE links.
In addition to aggregating both campus networks, the MLXe-32 chassis also serve as HHMI’s data center core, aggregating server and HPC traffic.
“In the data center we use top-of-rack switches from Arista Networks. Also, we have some Brocade FCX switches for lower priority cabinets, and we have some legacy Force10 top-of-rack switches, which we are in the process of [replacing].”
Early 100 Gigabit Ethernet network results; no more spanning tree
The cutover to the new network, which occurred in September, allowed HHMI to collapse its network core from four legacy Force10 chassis to the two MLXe-32 switches.
This architectural consolidation ensured lower latency while boosting port density, Cicerchia said.
“At the same time we went from the classic design using spanning tree to a full active-active design with MCT,” he said. “So the result of that is not only did I cut down latency by almost 50% by going from four switches to two, I also improved efficiency by 100% because I no longer have links used in just passive mode in spanning tree. The move to MCT has allowed us to fully leverage the entire infrastructure.”
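The 100% efficiency figure follows from simple link arithmetic. The Python sketch below is a simplified model, not a measurement: it assumes a redundant pair of uplinks in which spanning tree leaves one link blocked while MCT forwards on both, and the function name usable_uplink_gbps is hypothetical.

```python
# Simplified model of usable uplink bandwidth with and without spanning tree.
# Assumption: in a redundant set of links, spanning tree forwards on only one
# link, while an active-active design such as MCT forwards on all of them.

def usable_uplink_gbps(links, link_gbps, active_active):
    """Return the bandwidth available for forwarding on a set of uplinks."""
    forwarding_links = links if active_active else 1
    return forwarding_links * link_gbps


stp_gbps = usable_uplink_gbps(links=2, link_gbps=100, active_active=False)
mct_gbps = usable_uplink_gbps(links=2, link_gbps=100, active_active=True)

# Relative gain from making the previously blocked link usable.
gain_pct = (mct_gbps - stp_gbps) / stp_gbps * 100

if __name__ == "__main__":
    print(stp_gbps, mct_gbps, gain_pct)
```

In this model a pair of 100 GbE uplinks goes from 100 Gbps usable to 200 Gbps usable, a doubling that matches the improvement Cicerchia describes.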
The transition from a classic spanning tree design to the MCT design with the Brocade MLXe-32 core did run into a hiccup when Cicerchia was testing out the network.
“Due to our pipeline and resources, we couldn’t move everything out of spanning tree at the same time, so the MLXe-32s, in addition to running MCT, are also configured to run 802.1w [Rapid Spanning Tree Protocol],” he said.
Unfortunately, some 10 GbE interfaces on the core switches were resetting and Cicerchia’s staff couldn’t determine the cause. He allowed Brocade to do a packet capture analysis to find the problem.
“We were able to isolate a software bug between MCT and 802.1w, especially with the legacy Force10 top-of-rack switches. We knew Brocade couldn’t provide a quick patch so we modified our spanning tree configuration from 802.1w to 802.1s, and that stopped the problem. Since then, Brocade has issued a patch, but we wanted to give our network time to settle, and we didn’t have a management window to deploy the patch. We expect to do that during the Christmas holidays.”
Let us know what you think about the story; email: Shamus McGillicuddy, News Director.