Last week, I postulated that using big data for SDN is inevitable. Analytics -- more specifically big data -- will be essential in delivering network programmability and the larger promise of the cloud.
While most of the talk related to SDN (and really OpenFlow) has been about decoupling the control and forwarding planes, there is a subtle subtext that should pop out a bit more in the coming quarters. It is great to be able to program things centrally, but something needs to inform those decisions. If the network is treated as a resource rather than a collection of point devices, then there has to be information about that resource so the SDN controller can make the right decisions. That’s where big data comes in.
But because both SDN and big data are still nascent, exactly how they will come together remains largely unknown.
I had the honor of moderating an industry panel in Palo Alto last week featuring Mazda Marvasti of VMware Inc., Klaus Oestermann of Citrix Systems Inc., Akram AbouEmara of Savvis Inc. and Prabakar Sundarrajan, who is with a networking incubator called The Fabric LLC. I came away from the event thinking two things:
- My original thesis -- that big data and SDN will intersect -- is correct. There was broad consensus on the need to use information that has been until now locked inside the network. Automation must be supported by much deeper intelligence about the state of the network, the devices connecting to the network and the applications running on the collective infrastructure.
- Bringing the power of big data to SDN in particular (and networking more broadly) is a nontrivial technical challenge. And the challenge is made even more difficult when you consider a third major industry trend: virtualization.
Using big data for SDN: The tools don't yet exist
The network is no longer defined by the physical devices that provide connectivity. The periphery of the network has been extended into the servers with virtual switching and into the hands of end users through mobile devices. With the edges of the network being pushed further out and, in some cases, blurring with adjacent infrastructure areas like compute and storage, the total number of devices under management has exploded.
Rapid expansion has done two things: increased the amount of information (infrastructure state, application and end-user information) and dramatically grown the number of sources for that information. There is no doubt that enough data exists for it to qualify as big data, but getting to that information and using it to make real-time infrastructure improvements in an automated environment requires real innovation above and beyond what we know today. For example, if there are thousands of devices that are all generating data useful in managing the data center, how is that data collected?
What has typically been done through batch harvesting will need to evolve to a more real-time collection and grooming of information. This marks a shift in the granularity of both collection and processing. And doing this with the sheer volume of data we are talking about in a massively distributed environment might prove harder than you imagine. The Fabric's Sundarrajan was candid in his assessment that the technologies to do this simply do not exist today.
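To make the shift in granularity concrete, here is a minimal sketch of incremental collection: each sample updates a per-device rolling average the moment it arrives, rather than waiting for a batch job. The `Sample` fields, the `SlidingWindowAverager` class and the device names are all hypothetical, and the sketch assumes samples for a given device arrive in timestamp order. It says nothing about the hard part the panel raised, which is doing this across thousands of distributed sources.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    device: str         # hypothetical device identifier
    timestamp: float    # seconds since some epoch
    utilization: float  # link utilization, 0.0 to 1.0


class SlidingWindowAverager:
    """Per-device average over the last `window` seconds, updated per sample."""

    def __init__(self, window: float) -> None:
        self.window = window
        self._samples = defaultdict(deque)  # device -> samples still in window
        self._totals = defaultdict(float)   # device -> sum of their utilization

    def ingest(self, sample: Sample) -> float:
        """Add one sample and return the device's current windowed average."""
        queue = self._samples[sample.device]
        queue.append(sample)
        self._totals[sample.device] += sample.utilization
        # Evict samples that have aged out of the window.
        cutoff = sample.timestamp - self.window
        while queue[0].timestamp <= cutoff:
            self._totals[sample.device] -= queue.popleft().utilization
        return self._totals[sample.device] / len(queue)


averager = SlidingWindowAverager(window=10.0)
first = averager.ingest(Sample("sw1", 0.0, 0.2))
second = averager.ingest(Sample("sw1", 5.0, 0.4))
third = averager.ingest(Sample("sw1", 12.0, 0.9))  # the t=0.0 sample ages out
```

The point of the design is that the cost of each update is proportional to the number of samples that expire, not to the size of the history, which is what separates this style of processing from periodic batch harvesting.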
Using big data for SDN: Do we need standards?
VMware and Citrix were of like mind about what would have to happen to make this work in a production environment. When the data center is heterogeneous (essentially every large deployment in the world), that data must be exposed in a way that is platform-, technology-, and vendor-agnostic. There is a real need to have a set of common interfaces through which data can be mined. The unstated but logical corollary is that once that data is crunched to drive behavior, there must be a similarly common means of feeding the results back into the data center.
Standards have been a lively part of the SDN debate, but that discussion has been focused more on how forwarding is programmed into individual network devices. What Mazda and Klaus were suggesting, though, goes beyond forwarding state. The need is for common data APIs and, though it was not explicitly talked about, presumably a common data model to store and act on this data.
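As a rough illustration of what a common data model could look like, the sketch below normalizes two made-up vendor payloads into a single record type with one agreed-upon unit. Nothing here reflects an existing standard or any vendor's actual format: the `Metric` type, the field names (`host`, `ifInBps`, `node`, `rx_mbps`) and the adapter functions are assumptions invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """A vendor-neutral measurement record (hypothetical schema)."""
    source: str   # device or virtual switch that produced the value
    name: str     # normalized metric name
    value: float  # always expressed in `unit`
    unit: str


def from_vendor_a(raw: dict) -> Metric:
    # Hypothetical payload: {"host": "tor-1", "ifInBps": 2500000}
    return Metric(raw["host"], "rx_rate", float(raw["ifInBps"]), "bit/s")


def from_vendor_b(raw: dict) -> Metric:
    # Hypothetical payload: {"node": "vsw-7", "rx_mbps": 2.5}
    return Metric(raw["node"], "rx_rate", float(raw["rx_mbps"]) * 1_000_000, "bit/s")


physical = from_vendor_a({"host": "tor-1", "ifInBps": 2500000})
virtual = from_vendor_b({"node": "vsw-7", "rx_mbps": 2.5})
```

Once both sources land in the same shape, whatever consumes the data (a controller, an analytics job) no longer needs to know which platform produced it. Agreeing on that shape across the industry is the standards question.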
Once you've mined big data, then what?
Even if a wealth of data is available to make intelligent infrastructure decisions, the question of how to use that data is still unanswered. In a massively distributed environment (spanning both private and public cloud, in some cases), how do you optimize resource use? How do you make intelligent networking decisions that support application requirements?
Traditional routing techniques, regardless of the protocol, ultimately reduce to shortest-path computation (shortest path first, or SPF). With more inputs into routing decisions, more potential paths through the network and more real-time feedback about application performance, we will need more sophisticated algorithmic control. Essentially, the data center focus shifts from finding a path through the network to finding the right path for each application, and doing so in a perpetually fluid environment.
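One simple way to picture the difference is a shortest-path search that also honors an application requirement. The sketch below is plain Dijkstra over link latency, with links below a requested bandwidth pruned before they are considered. It is an illustration only, not how any particular controller works; the `best_path` function, the graph layout and the link figures are hypothetical.

```python
import heapq


def best_path(graph, src, dst, min_bandwidth=0.0):
    """Lowest-latency path that uses only links with enough bandwidth.

    graph maps node -> list of (neighbor, latency_ms, bandwidth_gbps).
    Returns (total_latency_ms, [nodes]) or None if no path qualifies.
    """
    queue = [(0.0, src, [src])]
    settled = set()
    while queue:
        latency, node, path = heapq.heappop(queue)
        if node == dst:
            return latency, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, link_latency, bandwidth in graph.get(node, []):
            # Prune links that cannot meet the application's requirement.
            if bandwidth >= min_bandwidth and neighbor not in settled:
                heapq.heappush(
                    queue, (latency + link_latency, neighbor, path + [neighbor])
                )
    return None


# Hypothetical topology: a fast but thin path via "b", a fatter one via "c".
graph = {
    "a": [("b", 1.0, 1.0), ("c", 2.0, 10.0)],
    "b": [("d", 1.0, 1.0)],
    "c": [("d", 2.0, 10.0)],
    "d": [],
}
default_route = best_path(graph, "a", "d")
bulk_route = best_path(graph, "a", "d", min_bandwidth=10.0)
```

With no requirement, the search picks the low-latency route through "b"; an application that needs 10 Gbps is steered through "c" instead. In a real environment the link figures would themselves be changing continuously, which is where the data collection problem and the routing problem meet.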
Despite the technical discussions of the night, perhaps the most salient point of the evening was offered up by Savvis's Akram AbouEmara in his closing thoughts. While everyone is focused on new technologies, new gear and new architectures, we would all do well to remember that this is all in support of the application. The nature of networking is shifting from connecting elements to supporting applications. While it might seem like an obvious point in print, most of the industry dialogue to date has been disproportionately focused on the how rather than the why. Akram's closing remarks, if nothing else, are a great reminder for all of us to keep our eyes on the prize.
About the author:
Michael Bushong is currently the vice president of marketing at Plexxi Inc., where he focuses on using silicon photonics to deliver SDN-based data center options. Prior to joining Plexxi, Mike spent 12 years at Juniper Networks. Leading Juniper's flagship OS product management and strategy teams, Mike drove Juniper's SDN strategy, including product plans around OpenFlow, path computation element, application-layer traffic optimization and BGP traffic engineering. Additionally, Mike led teams responsible for evaluating new SDN technologies and emerging SDN companies, with particular focus on partnerships and M&A analysis.