White-box switching: Three paths to network programmability

White-box switching is often pitched as a way to cut costs, but the greater value lies in its ability to make networks more programmable and automated.

There are two magic words that can grab the attention of even the most skeptical IT professional: cost savings. And when the market was introduced to white-box switching -- commodity switch hardware that comes preloaded with a third-party network operating system -- the potential cost savings were often the first thing network engineers heard or read about this new approach.

But as interest in the concept of network programmability grows, it's become clear that cost savings are one of the less interesting aspects of white-box switches. In addition to saving money, they make the network more automated, programmable and flexible, which turns out to be where their real value lies.

The white-box switching market, which includes the software and the hardware it runs on, is expected to grow to approximately $500 million by 2018, according to Lee Doyle, principal analyst of Doyle Research in Boston. But it's still early days for this corner of software-defined networking (SDN), as the white-box switching and network operating system (OS) market is currently quite small. As a point of comparison, white-box switching vendors like Cumulus Networks and Big Switch Networks "have, maybe, $2 million in revenue" between them today, Doyle says.

There are primarily three types of companies, he adds, gravitating toward white-box switches: Web-scale companies that have the resources to deploy and maintain them, data center operators apt to take on more risks in greenfield deployments, and some companies that are cloud-based though not Web-scale.

Most enterprises and service providers, however, will not be happy with just any network OS. For most, their choice of OSes will ultimately be determined by the professional and technical backgrounds of the people who make the purchasing decisions.

"Server guys will pick Cumulus Networks or Pica8, while the networking guys will look at Big Switch or Pica8," says Joe Skorupa, a distinguished analyst at Gartner. "Cumulus Networks runs on white-box switches for use in data centers. People often use it to support SDN overlay networks, though it is not SDN itself."

In this feature, we explore three use cases for network programmability using white-box switches.

Managing switches like servers

DreamHost LLC, a Los Angeles-based Web hosting company and cloud provider, uses Cumulus Networks' Cumulus Linux network OS on white-box switches to efficiently scale and manage its open source, multi-tenant public cloud services: DreamCompute.

"Running Cumulus Linux, we treat our switches as just another type of Linux server," says Jonathan LaCour, vice president of cloud at DreamHost. "We use the same team, tools and processes to manage Cumulus Linux that we use for our Linux servers."

This was a significant departure from DreamHost's legacy switches, which ran on pre-installed proprietary software and tools that looked nothing like the Linux tools the IT team used to manage and monitor its compute and storage gear.

"Cumulus Linux helped us turn the network, which was a special case, into something that was not a special case," LaCour says.

Cumulus Linux achieves this by replacing the proprietary switch interfaces of large, legacy switch vendors with common Linux interfaces that any Linux server administrator can understand.

Running Cumulus Linux on white-box switches gives DreamHost increased performance and network visibility in a Linux platform, while enabling its engineers to use their existing Linux server administration tools for network automation. In the same way that Linux servers automatically configure upon installation, requiring no further attention except for small updates and changes, the Cumulus Linux controller automatically configures its white-box switches upon installation to serve the DreamHost network. The network comprises customer pods -- self-contained units of hardware that represent a single availability zone for DreamHost's cloud -- and command-and-control pods for management, running Cumulus Linux in a leaf-spine architecture over 40 Gigabit Ethernet.

"While network engineers program many switches today via automation, CLI [command-line interface] and APIs, Cumulus Linux is unique in that the CLI and API turn out to be the same Linux tools that every systems and cloud engineer has been using for decades, such as the route command and ifconfig," LaCour says. "There is nothing new to learn, and everything is hardware-accelerated."

There aren't many differences between a Linux server and a switch running Cumulus Linux -- the main one being the number of ports. A Linux server has two to four Ethernet ports for Layer 2 and Layer 3 connectivity, whereas the white-box switches that run Cumulus Linux come with 48 10-Gigabit ports.

DreamHost uses the DevOps tool Opscode Chef for server and network orchestration. Because each switch runs versions 2 and 3 of the Open Shortest Path First (OSPF) protocol, network operations, from configuration to troubleshooting, are straightforward. DreamHost engineers program Cumulus Linux switches using Chef cookbooks and recipes -- sets of reusable configuration instructions -- for inventory management and configuration. Its IT team uses the Python-based tool Graphite for monitoring.
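To make the "reusable configuration instructions" idea concrete, here is a minimal sketch in Python rather than Chef. It is not DreamHost's cookbook; the function name, the `swp` and `eth` port prefixes and the Debian-style stanza layout are assumptions made for illustration. The point is that one template can describe a two-port server or a 48-port switch.

```python
from typing import List


def render_interfaces(port_count: int, prefix: str = "swp") -> str:
    """Return Debian-style interface stanzas for ports 1..port_count.

    Hypothetical helper: the prefix and stanza format are assumptions,
    not taken from any DreamHost or Cumulus configuration.
    """
    stanzas: List[str] = []
    for port in range(1, port_count + 1):
        name = f"{prefix}{port}"
        # One "auto"/"iface" pair per port, as in /etc/network/interfaces.
        stanzas.append(f"auto {name}\niface {name}\n")
    return "\n".join(stanzas)


if __name__ == "__main__":
    # The same function serves a server-sized and a switch-sized device.
    print(render_interfaces(2, prefix="eth"))
    print(render_interfaces(48).count("auto "), "switch ports rendered")
```

A configuration management tool would typically write text like this to the device and reload networking only when the rendered output changes.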

Switches will fail, and when they do, Cumulus Linux enables DreamHost's network to keep routing data and running smoothly until the switch is replaced. The failed switch is removed from the network, which automatically re-forms using the remaining switches. The controller then configures the replacement switch and adds it to the network automatically. This reduces mean time to recovery from hours to minutes.
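The reason a leaf-spine network can keep routing after a switch fails is that every leaf connects to every spine, so an alternate path always exists while at least one spine survives. The sketch below illustrates that property only. It uses a plain breadth-first search as a stand-in for what a routing protocol computes, and the switch names and helper functions are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Set

Graph = Dict[str, Set[str]]


def build_leaf_spine(leaves: List[str], spines: List[str]) -> Graph:
    """Connect every leaf to every spine, as in a leaf-spine fabric."""
    graph: Graph = {node: set() for node in leaves + spines}
    for leaf in leaves:
        for spine in spines:
            graph[leaf].add(spine)
            graph[spine].add(leaf)
    return graph


def remove_switch(graph: Graph, failed: str) -> Graph:
    """Return a copy of the fabric without the failed switch or its links."""
    return {n: nbrs - {failed} for n, nbrs in graph.items() if n != failed}


def shortest_path(graph: Graph, src: str, dst: str) -> Optional[List[str]]:
    """Breadth-first search; returns None when dst is unreachable."""
    if src not in graph or dst not in graph:
        return None
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for nbr in sorted(graph[node]):  # sorted for a deterministic result
            if nbr not in seen:
                seen.add(nbr)
                queue.append(path + [nbr])
    return None


if __name__ == "__main__":
    fabric = build_leaf_spine(["leaf1", "leaf2"], ["spine1", "spine2"])
    print(shortest_path(fabric, "leaf1", "leaf2"))
    degraded = remove_switch(fabric, "spine1")
    print(shortest_path(degraded, "leaf1", "leaf2"))
```

With two spines, losing one still leaves a two-hop path between any pair of leaves; only the available bandwidth drops until the replacement switch is configured.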

But like any network platform, Cumulus Linux is not perfect.

"Network OSes are still playing catch up with traditional switching infrastructure when it comes to esoteric features," says LaCour. "This isn't relevant in our case because we prefer our underlying network to remain simple while we push any additional features [the white-box switches are missing] into the SDN overlay."

White-box switching simplifies network taps

To monitor traffic or troubleshoot network issues, enterprise network engineers often rely on network taps, such as sniffers or packet brokers. Service providers use the same tools to confirm the quality and delivery of traffic, or to proactively measure actual network performance against their service-level agreements.

Additionally, network engineers can use taps to pinpoint the origins of network congestion.

"If people are streaming YouTube and choking the core network, that could degrade the performance of SaaS applications or VoIP systems," says Steve Garrison, vice president of marketing at white-box switch vendor Pica8 Inc. Network engineers need to diagnose the cause before they can institute a cure, such as policy enforcement.

But the traditional approach to network taps can be problematic, especially in large-scale environments, where full visibility requires a physical tap at every device the traffic in question traverses.

The programmability in white-box switching offers an alternative. Pica8's software, for example, enables network engineers to program the kind of tap they want to use into the OS via an API and controller -- whether the tap is for mirroring traffic from a port, subnet or VLAN, or for aggregating all HTTP traffic passing through a switch to a single collector port. 
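A programmable tap of this kind boils down to match rules that copy selected packets to a collector port. The following Python sketch models that logic only; it is not Pica8's API, and the `Packet`, `TapRule` and `mirror_ports` names, the port numbers and the VLAN ID are invented for illustration. Subnet matching is left out to keep the example short. The one protocol fact it relies on is that HTTP conventionally uses TCP (IP protocol 6) destination port 80.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Packet:
    in_port: int    # switch port the packet arrived on
    vlan: int       # VLAN ID
    ip_proto: int   # IP protocol number (6 = TCP, 17 = UDP)
    dst_port: int   # Layer 4 destination port


@dataclass(frozen=True)
class TapRule:
    """A match rule; fields left as None act as wildcards."""
    collector_port: int
    in_port: Optional[int] = None
    vlan: Optional[int] = None
    ip_proto: Optional[int] = None
    dst_port: Optional[int] = None

    def matches(self, pkt: Packet) -> bool:
        fields = ("in_port", "vlan", "ip_proto", "dst_port")
        return all(
            getattr(self, name) is None or getattr(self, name) == getattr(pkt, name)
            for name in fields
        )


def mirror_ports(rules: List[TapRule], pkt: Packet) -> List[int]:
    """Return the collector ports that should receive a copy of pkt."""
    ports: List[int] = []
    for rule in rules:
        if rule.matches(pkt) and rule.collector_port not in ports:
            ports.append(rule.collector_port)
    return ports


if __name__ == "__main__":
    rules = [
        TapRule(collector_port=48, ip_proto=6, dst_port=80),  # all HTTP
        TapRule(collector_port=47, vlan=100),                 # one VLAN
    ]
    print(mirror_ports(rules, Packet(in_port=1, vlan=100, ip_proto=6, dst_port=80)))
```

Changing what gets tapped then means editing the rule list, not moving hardware.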

The benefit of this approach is that engineers no longer need physical taps on every switch. They can attach to one switch and then decide where in the logical network to probe and what traffic to look at.

"The ability to have a dynamic and programmable tap enables you to centrally manage network monitoring. You can attach the test tool to one port, and using programmability, you can sense any virtual or physical port throughout the network. You can move to different ports and look at different traffic flows from any given port," Garrison says.

"You could also filter out certain traffic and see what is left as a means to help isolate a root cause of a flooding event," he adds.
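The filter-and-inspect approach Garrison describes can be sketched as a small residual analysis: discard the flows already known to be legitimate, then rank whatever is left by volume. This Python example is an illustration under assumed inputs; the flow tuple layout, the addresses and the `residual_top_sources` helper are hypothetical.

```python
from collections import Counter
from typing import Callable, Iterable, List, Tuple

Flow = Tuple[str, str, int]  # (source IP, destination IP, byte count)


def residual_top_sources(
    flows: Iterable[Flow],
    is_known: Callable[[Flow], bool],
    top_n: int = 3,
) -> List[Tuple[str, int]]:
    """Drop flows matched by is_known, then rank remaining sources by bytes."""
    totals: Counter = Counter()
    for flow in flows:
        if is_known(flow):
            continue  # filtered out: already-explained traffic
        src, _dst, nbytes = flow
        totals[src] += nbytes
    return totals.most_common(top_n)


if __name__ == "__main__":
    captured = [
        ("10.0.1.5", "10.0.0.10", 500),
        ("10.0.1.7", "10.0.2.255", 9000),
        ("10.0.1.7", "10.0.2.255", 4000),
        ("10.0.1.9", "10.0.3.1", 100),
    ]
    # Assumption for the demo: traffic to 10.0.0.10 is a known application.
    print(residual_top_sources(captured, lambda f: f[1] == "10.0.0.10"))
```

Whichever source dominates the residual list is the first place to look for the cause of a flooding event.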

Building the network you want

The University of Texas at San Antonio (UTSA) is home to the Open Compute Project (OCP) Certification and Solution Laboratory, which is the first cloud and big data research laboratory of its kind in North America. The lab certifies Open Compute technologies and key workloads for large enterprises while educating its students about open source technology. Its network operates on Cumulus Linux on top of the Open Network Install Environment (ONIE), which is currently the best boot loader for open networking on white-box switches, says Carlos Cardenas, associate director of applied research in cloud computing at UTSA.

To support the lab's ongoing research and work on open source technologies, its data center must be able to expand and adapt rapidly. The lab chose to run Cumulus Linux on white-box switches to automate various parts of the network using a standard Linux OS and Linux tools.

Most of the network infrastructure in the UTSA lab is based on OCP-certified networking technology, with switches from Edge-Core Networks and Quanta Computer running Cumulus Linux. Its goal is to expand this architecture to the rest of the network, building it out with the same OCP-certified networking technology and switches that run Cumulus Linux. This kind of network will adapt to the increasing number of new, open source research projects.

With Cumulus Linux as the network OS, the lab's network supports a broad variety of easily accessible, open source server packages. This means the same software that is normally installed on Linux servers -- such as OpenSSH, OpenNTPD, isc-dhcp-server, DNSMasq and Quagga -- is natively available on Cumulus Linux. These are not variants of the packages, as are typically found on traditional network devices, but the exact ones found in Linux distributions like Debian and Ubuntu. Switch interfaces appear to the engineers as if they were Linux servers -- only with 48 ports.

"We use the same configuration management interface on the switches that we use on all our servers," Cardenas says. "The administrator only needs to understand Linux to program the switch."

The mechanisms the lab uses for large-scale software updates and reconfigurations in network infrastructure are identical to the ones it uses for Linux servers. Cumulus Linux enables engineers working at scale to work with familiar scripts and APIs to program the network. Because the network OS is Linux-based, network engineers use their favorite automation tool -- Ansible -- to manage network configurations for their white-box switches. And with ONIE, the lab loads the network OS it wants on the hardware it wants.

"ONIE discovers the OS and installs it on the hardware," Cardenas says.

White-box switching not only gives the lab its choice of hardware and network OS, but it also allows the lab's IT team to choose from among the many Linux and Linux-compatible software applications to run on a programmable network. It's an approach that exposes an automated network to much less risk and human error than traditional switch architectures.

This was last published in December 2014