CHICAGO -- IT has become a service, according to analyst Jean-Pierre Garbani. Therefore, it only makes sense to take a service approach toward managing an enterprise network.
Garbani, a director with Cambridge, Mass.-based Giga Information Group Inc., made a case at the recent Networking Decisions conference for dumping the "obsolete" model of network management that relies on systems software to ensure availability.
In what Garbani called the "1995 model," companies bought management platforms such as IBM's Tivoli, Hewlett-Packard's OpenView or Computer Associates' Unicenter and called the purchase a blueprint for network management. "The problem is, this is not a blueprint, it's a product suite," Garbani said.
With this approach, it's difficult for administrators to get to the source of a problem when the server is still up and the network switch is still working, he said. The problem could lie in the management software or in the application software, and a silo-based architecture offers no easy way to tell which.
A focus on service
A "service-assurance" model for network management takes a different approach. Garbani said there are two parts to it.
The first part is a real-time component that allows users to alert administrators that there is a system problem. Administrators can then identify the problem and take immediate corrective action, such as restarting scripts, to bring the service back online. "Don't try to resolve everything at once," Garbani said.
Establishing a trouble-ticket system helps facilitate such real-time action, and the help desk that fields the trouble calls serves as the "hub of communication" in an organization, he said.
The second part of the service model is a deferred-time component in which administrators can take a more in-depth look at why a failure took place and then figure out how to implement a permanent solution, such as changing the load-balancing system.
Availability formula flawed
As far as system downtime goes, Garbani said most network management products are based on a flawed formula. To calculate system availability, Garbani said, these products take the mean time between failures (MTBF) for a network device and divide it by the sum of MTBF and mean time to repair (MTTR). "I contest that," he said. "Availability is a probability to accomplish a certain task."
And downtime is a failure that users have little patience for. Giga research indicates that 15 seconds of network downtime is the limit after which user productivity is severely impacted.
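For readers who want to see the device-level formula Garbani contests, the short Python sketch below works it through. The function name and the MTBF and MTTR figures are hypothetical, chosen only for illustration; they do not come from Giga or from any vendor's product.

```python
HOURS_PER_YEAR = 24 * 365


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Device availability computed as MTBF / (MTBF + MTTR)."""
    if mtbf_hours < 0 or mttr_hours < 0:
        raise ValueError("MTBF and MTTR must be non-negative")
    total = mtbf_hours + mttr_hours
    if total == 0:
        raise ValueError("MTBF and MTTR cannot both be zero")
    return mtbf_hours / total


# Hypothetical device: fails every 1,000 hours on average, takes 2 hours to repair.
ratio = availability(mtbf_hours=1000.0, mttr_hours=2.0)
downtime_hours = (1.0 - ratio) * HOURS_PER_YEAR

print(f"Availability: {ratio:.4%}")
print(f"Expected downtime per year: {downtime_hours:.1f} hours")
```

The result describes a single device. It says nothing about whether a user could actually complete a task at a given moment, which is the gap Garbani points to.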
As with any IT project, the biggest obstacle to developing a service-oriented system of network management is people, Garbani said. He recommended that companies develop such a system over a period of time, rather than taking a "big bang" approach.
Garbani said IT managers also need to identify the skills their workers will need to establish the new architecture, and to assure people in the organization that such a change is not a threat to them.
"That's why training is essential to make an implementation successful," he said. "You've got to make people proud of what they've achieved."
Stepping away from silos
John A. Strege, director of distributed operating systems software for the Chicago Board Options Exchange, called Garbani's service-assurance model a "noble goal." Anything that moves away from application and system silos is a good thing, he said.
"I've seen it happen where you verify your piece [of the system] is OK and then [you have to] go look somewhere else to resolve the problem," Strege said. "A lot of the time it doesn't get you to the resolution that you're looking for."
But the success of the service-assurance model still comes down to getting the cooperation of others in the organization and helping them to "get the bigger picture," Strege said.
They apparently get the bigger picture at Progressive Casualty Insurance Co., the fourth-largest auto insurance company in the United States.
Paul McHugh, an IT professional at Mayfield Village, Ohio-based Progressive, said the insurer has gone to a service-based management model over the last four years. "And every year we redefine the process to try to get it closer to a model where we're really guaranteeing the users have access to their systems."
A financial incentive for the IT organization doesn't hurt either. McHugh said year-end bonuses for the IT and telecommunications units at Progressive are based on system availability.