There is an old saying: If you can't measure it, you can't manage it. The problem badgering network managers today is different. It's the case of too much data. There is so much data that it's impossible to analyze it all. Even more treacherous is mapping all that data. With too much data, all you get is an impressive-looking plot that yields no real information.
IT has been teetering on the edge of this problem for a long time. We collect and store massive amounts of data. We quote all kinds of statistics on utilization, speeds and response times that describe what the infrastructure is doing but fail to convey the information our business colleagues need in order to act. The business guys don't understand the technology. The network folks just want to keep their jobs and the network functioning as it should.
What's a networking team to do? How can they move from the massive avalanche of available network data to meaningful information for business management? There is a process to help identify and convert performance data into business information. Here are four suggestions to help you decide what data is important and how to present it.
The burden is on you, but that's to your benefit.
First, realize that it is highly unlikely that anyone on the business side is going to make the transition to a network engineer. It will be far easier for you, or someone else on the networking team, to make the translation from technical performance to business impact than vice versa. Being the person to do that is a sure way to expand your horizons and find the most interesting things to do. It can also, incidentally, save your job.
Identify what's important.
Second, the object of the exercise is not to represent or recreate the state of the network but to provide a benchmark of performance. A benchmark is a way to keep score of how well or poorly the infrastructure is performing as it relates to the business. This is so you know when to take action. So you need to know the important performance metrics that are significant to the business managers. What are the key performance indicators (KPIs) that determine their bonuses and reviews?
Identify network performance parameters that map to KPIs.
Third, you've probably heard of the trap of "analysis paralysis." This happens when analysis prevents decision making. Instead of reaching a decision, you end up in a never-ending process of identifying, understanding and analyzing data. "Parameter paralysis" is the analogous occurrence, when the process becomes bogged down in identifying and tracking down parameters but ignores their relevance.
Figure out which network parameters or event data map to or influence KPIs. This can be operational data of the kind that appears in an SLA (percent guaranteed delivery of packets, duration of outages, etc.), but can just as well be nonfunctional (such as the ability to manage network elements on demand). The key is to recognize which parameters appear to move with KPIs. Then, report to and alert business managers in terms of what will happen to their KPIs. Any business manager will react when told a change to the ordering process will cause customer satisfaction to plummet because orders will take 50% longer to process, while notification of a 20% increase in network traffic may elicit a somewhat different response.
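One way to get a first look at which parameters "move with" a KPI is to rank them by how strongly each one correlates with it. The sketch below is only an illustration: the metric names, the KPI and every sample value are invented, and a plain Pearson coefficient is just one of several measures that could be used.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series says nothing about movement
    return cov / sqrt(var_x * var_y)

def rank_parameters(parameters, kpi):
    """Return (name, r) pairs, strongest correlation with the KPI first."""
    scored = [(name, pearson(series, kpi)) for name, series in parameters.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)

# Hypothetical daily samples, invented purely for illustration.
order_processing_minutes = [10, 11, 10, 14, 15, 12, 18]  # the business KPI
network_parameters = {
    "wan_latency_ms":      [40, 44, 41, 60, 66, 50, 80],
    "packet_delivery_pct": [99.9, 99.8, 99.9, 99.1, 98.9, 99.5, 98.0],
    "backup_traffic_gb":   [5, 9, 4, 6, 5, 8, 5],
}

ranking = rank_parameters(network_parameters, order_processing_minutes)
for name, r in ranking:
    print(f"{name:22s} r = {r:+.2f}")
```

Parameters near the top of the ranking are candidates for reporting in KPI terms; those near the bottom are probably noise for this particular KPI, however impressive they look on a plot.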
Is it correlation or causation? Do you care?
Fourth, technical staff all too often get trapped in their own analytical elegance. They want to link cause and effect. This is the best strategy for diagnosis and repair, but it isn't necessarily the best approach when all you want is an alert to a potential violation of an SLA or an alert to a business manager of a negative impact on a KPI.
The key is to identify and track indicators that anticipate a KPI move, for better or worse. Correlated things move together, but not necessarily in the same direction. Therefore, they serve as an alert to KPI movement. For alerting purposes, it doesn't matter whether the alerting event causes the problem or just moves in a fixed relationship with the problem. What matters is that it happens with enough time for you to notice and take appropriate action to avoid or minimize a negative impact.
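To see whether an indicator actually gives you lead time, one simple check is to correlate it with the KPI at several time offsets and keep the offset where the relationship is strongest. The following is a rough sketch under stated assumptions: the function names, the queue-depth indicator, the satisfaction score and all the numbers are hypothetical, and the sample data is deliberately built so that the KPI falls two intervals after the indicator rises.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0

def best_lead_time(indicator, kpi, max_lag):
    """Return (lag, r) where indicator[t] tracks kpi[t + lag] most strongly."""
    if len(indicator) != len(kpi):
        raise ValueError("series must have equal length")
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        xs = indicator[:len(indicator) - lag]  # earlier indicator samples
        ys = kpi[lag:]                         # KPI samples `lag` steps later
        if len(xs) < 3:
            break  # too few overlapping points to say anything
        r = pearson(xs, ys)
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best

# Hypothetical hourly samples, invented for illustration only.
queue_depth = [1, 2, 1, 5, 6, 2, 1, 7, 8, 2, 1, 1]
satisfaction_score = [95, 90, 95, 90, 95, 75, 70, 90, 95, 65, 60, 90]

lag, r = best_lead_time(queue_depth, satisfaction_score, max_lag=4)
print(f"strongest relationship at lag {lag} (r = {r:+.2f})")
```

A negative coefficient here means the two series move in opposite directions, which is still perfectly usable for alerting. The lag is the warning time you have to work with, whether or not the indicator is the cause.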
Therefore, be sure to consider nontraditional indicators, such as traffic arrival times. View the process end to end in order to identify upstream (or downstream) performance bottlenecks. These won't always yield an answer, but the idea is to understand that the network represents one link in an overall process. A look outside your immediate boundaries of concern could be the key to making your overall job a lot easier.
Founder and Partner, Ptak, Noel & Associates
Richard Ptak has over 30 years of experience in systems product management. He was VP at Hurwitz Group and D.H. Brown Associates and worked at Western Electric's Electronic Switch Manufacturing Division and Digital Equipment Corporation. He is frequently quoted in the trade press and is the author of Manager's Guide to Distributed Environments.