Cisco earlier this year released its Network Assurance Engine, a new network analytics tool designed to provide customers with the intelligence they need to find and fix network performance problems. The software creates a map of an Application Centric Infrastructure fabric and, among other benefits, lets operators verify network behavior and determine whether it follows established policies.
In a Q&A, Sundar Iyer discusses Cisco's plans for the next generation of Network Assurance Engine. Iyer is a distinguished engineer with Cisco's data center switching business unit, as well as a co-founder of Candid Systems, whose technology was used in the development of the Assurance Engine.
Cisco has been working on the network verification technology that eventually became the Network Assurance Engine for quite some time. What was the early rationale for developing such a tool?
Sundar Iyer: We've been on the journey to build the Network Assurance Engine for more than 2 1/2 years. The main issue we were looking at in the advent of SDN was literally the fact that almost everything we do in networking, for the last 20 years, in fact, is reactive.
What does that mean for enterprise networking managers?
Iyer: We make changes to the network, and then we scramble to undo changes if we break something. If there's a security breach, we have to scramble to fix it and find the root cause of that error. If there is an audit issue because we're not compliant, we don't necessarily find that out for a year, after the audit is completed.
We wanted to see if there's a way to fix that, to basically make network operations proactive. And by proactive, we mean all forms of finding things before they occur, helping [network] operators plan their network better and debugging issues before they actually bite you.
What approach did Cisco take?
Iyer: We needed both real-time performance and predictive analytics, but at their broadest. To do that, we built a precise mathematical model of the entire network. Instead of reading back data and monitoring base applications, we read all the metadata generated by the network.
By metadata, we mean every piece of configuration that's entered into a network, every piece of dynamic state that enters the network. So, for example, you may have a security configuration; you may have a virtual LAN configuration; you may have a configuration for how your machines migrate.
Similarly, you have dynamic states -- let's say the Border Gateway Protocol that's imported from a LAN interface, or the virtual machine that migrated from one leaf to another. We read every piece of metadata we could get our hands on and build the model. Models aren't new; they've been used everywhere, from the space industry to the hardware industry, where we've used models to fabricate chips to make sure there are no issues or errors.
We applied the same approach to networking.
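To make the idea concrete, here is a minimal Python sketch of model-based verification. It is an illustration only, not how the Network Assurance Engine works internally, and every name in it (`Rule`, `build_model`, `verify`, the endpoint groups and leaves) is a hypothetical assumption. It combines configuration with dynamic state into a simple model, then checks that model against what the operator intended.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A hypothetical configured policy: src group may reach dst group on a port."""
    src: str
    dst: str
    port: int


def build_model(config_rules, dynamic_state):
    """Combine static configuration with dynamic state.

    dynamic_state maps each endpoint group to the set of leaves where its
    members are currently attached (an assumed, simplified representation).
    Returns the set of flows the modeled network would actually permit.
    """
    permitted = set()
    for rule in config_rules:
        # A rule only takes effect if both groups are attached somewhere.
        if dynamic_state.get(rule.src) and dynamic_state.get(rule.dst):
            permitted.add((rule.src, rule.dst, rule.port))
    return permitted


def verify(model, must_allow, must_deny):
    """Compare the model against intent and list every violation found."""
    violations = []
    for flow in must_allow:
        if flow not in model:
            violations.append(("missing", flow))
    for flow in must_deny:
        if flow in model:
            violations.append(("forbidden", flow))
    return violations


config = [Rule("web", "app", 8080), Rule("web", "db", 3306)]
state = {"web": {"leaf1"}, "app": {"leaf2"}, "db": {"leaf3"}}
model = build_model(config, state)

# Intent: web may reach app, but web must never reach db directly.
problems = verify(
    model,
    must_allow=[("web", "app", 8080)],
    must_deny=[("web", "db", 3306)],
)
print(problems)
```

Because the check runs against the model rather than live traffic, a misconfiguration such as the web-to-db rule above can be flagged before it causes a problem, which is the proactive behavior Iyer describes.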
Fast-forward to today and what's needed in network verification. How does the engine complement what Cisco has done with Application Centric Infrastructure (ACI), as well as its intent-based networking initiatives?
Iyer: With ACI, you had the beginnings of intent-based networking. You'd capture your network intent in a high-level controller language. What we found when we built mathematical models of the network is that knowing the intent of what the operator is trying to do makes the models much more effective.
To give you an analogy, if you look at a self-driving car, [the sensors and lasers and other equipment installed on the vehicle] allow you to build a model of how a car should weave around traffic.
But having the intent makes [the self-driving car] more meaningful if the operator says, 'I would like to go from San Francisco to Denver on this day and this time and reach Denver in so many hours.' That intent, along with the models [supporting that intent], makes it all the more valuable.
The Network Assurance Engine supports the Nexus 9000 switches in the data center. Is Cisco working to expand the number of network verification components the engine can support? And can it be used in non-ACI networks?
Iyer: We began in the data center, and that has been our focus. We support all of the Nexus 9k platform with all versions of ACI, and we've also added support for third-party devices, so that includes load balancers and firewalls.
We're testing other products in our lab, and support for them will be available in the next version of the engine. Our vision for the engine is much broader than where we are today, and we will continue to add these third-party components to Assurance [Engine] so that it can support some very interesting use cases.
At this point, the engine is only compatible with ACI networks, because we had to start somewhere, and ACI was the only controller in the data center from a Cisco perspective. But we definitely will be looking at additional controller support. OpenStack is a possibility.
What other network verification enhancements do you plan to add to the engine?
Iyer: Our immediate focus is to increase our footprint in the data center, so we will be adding more devices, more controllers and more orchestration. Firewalls and load balancers are key, because everybody has a firewall and a load balancer in the data center.
We're also close to supporting vCenter. That's actually already built-in; it's tested in the lab. We will be announcing that very soon, and, again, almost 80% to 90% of our customers use some form of VMs [virtual machines] and vCenter.
We're also looking at two other kinds of support when we look at an 18-month time frame. The first, which is actually already imminent, is third-party software. But to give you an idea, the kinds of software we're trying to integrate are from other vendors whose products monitor different aspects of the network. One example is Turbonomic [and its monitoring software].
Turbonomic looks at VM data, and they build their own sort of monitoring tool. If you marry that with the Network Assurance Engine, customers will receive even more information about their data centers.
Let's say you have a VM that's sitting behind a leaf that is highly loaded, and it has policies that are going to overflow the TCAM [ternary content-addressable memory]. The Assurance Engine would then tell Turbonomic to switch that VM to a different leaf.
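The placement decision in that scenario can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the actual integration between the two products: the function names, the per-leaf `(used, capacity)` tuples and the entry counts are all hypothetical.

```python
def pick_leaf(leaves, vm_entries, current):
    """Return the other leaf with the most free entries after placing the VM.

    leaves maps a leaf name to (used_entries, capacity), where used_entries
    excludes the VM being placed. Returns None if no other leaf can fit it.
    """
    candidates = [
        (capacity - used - vm_entries, name)
        for name, (used, capacity) in leaves.items()
        if name != current and used + vm_entries <= capacity
    ]
    return max(candidates)[1] if candidates else None


def recommend_move(leaves, vm_entries, current):
    """Recommend a new leaf only if the VM's policies would overflow its current one."""
    used, capacity = leaves[current]
    if used + vm_entries <= capacity:
        return None  # The policies fit; no move is needed.
    return pick_leaf(leaves, vm_entries, current)


# Hypothetical fabric: leaf1 is nearly full, leaf2 and leaf3 have headroom.
leaves = {"leaf1": (950, 1000), "leaf2": (400, 1000), "leaf3": (700, 1000)}
target = recommend_move(leaves, vm_entries=100, current="leaf1")
print(target)
```

In this sketch the verification side supplies the capacity data and the overflow prediction, while the workload-management side would act on the recommendation by moving the VM.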
With all of the attention now being paid to automation and verification, are you hearing concerns from your customers about what all of this will mean to their employees?
Iyer: Almost every generation of automation and technology only ends up making engineers even more efficient and allows them to focus on the next level of problems. So, we're not hearing any pushback saying, 'Oh, you're going to automate this, and this is a bad thing, and it's going to take jobs away.' It's more the opposite.
Customers really want this. We've determined that more than 60% of errors in data centers come from human configuration errors alone. Customers are familiar with this problem, and they really want solutions that will fix these kinds of problems. We've analyzed more than 40 customer fabrics, and we found more than 35 big-ticket outages we were able to prevent [as a result of the engine] before they even occurred.