The wizard gap is the distance between average network performance and what you really should be able to attain. It is hours of tweaking and fiddling by your best network engineers on each and every high-performance workstation and server that you care about. And it just keeps getting wider – fast! Learn to cross it or be left behind.
Back in 1999, Matt Mathis of the Pittsburgh Supercomputing Center first described the existence of a "wizard gap" and predicted that it would grow rapidly. By his estimation, the difference in performance between a high-end host configured by an average user and one tweaked out by a network expert is somewhere around a factor of 1000!
What is the wizard gap?
Consider connecting a typical workstation to a typical LAN in a typical network. A typical user can do this with a little help from DHCP. They will have connectivity, be able to run their Web browser, and access network applications. It is very likely that they will be quite pleased with themselves and be done. However, it is also likely that they will not see the full benefit of their network connection -- or anything even close, according to Mathis!
Without realizing it, users are often muddling through serious performance degradation issues that impact their productivity and keep them from using the resources they are paying for. These same issues can usually be identified and resolved by a determined expert (someone with network training and experience on current technologies) working over hours or days -- or even weeks. The difference between the performance accessible to the average power user and to a network expert is called the "wizard gap" -- it is the gap between what most networks do and what they ought to do.
What causes the wizard gap?
Degraded performance is caused by a myriad of things -- down-level drivers, duplex mismatch, "conservative" capacity negotiation, bad cables or lossy media, insufficient send/receive buffers, constrictive TCP window sizes, route flapping, VPNs, firewalls, MTU black holes, etc. Some of these can be traced to the mid-path, where proper switch and router configuration and maintenance can resolve them -- of course, most users won't be able to manage the mid-path themselves, although it would help them to know whom to call.
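Mathis and his colleagues later quantified part of this: their well-known macroscopic model bounds steady-state TCP throughput by segment size, round-trip time, and loss rate -- roughly rate <= (MSS/RTT) * (C/sqrt(p)), with C near 1. A minimal sketch of that formula (the MSS, RTT, and loss numbers below are purely illustrative):

```python
from math import sqrt

def mathis_throughput(mss_bytes, rtt_s, loss_rate, c=sqrt(3.0 / 2.0)):
    """Upper bound on steady-state TCP throughput in bytes/sec, per the
    Mathis et al. macroscopic model: rate <= (MSS / RTT) * (C / sqrt(p))."""
    return (mss_bytes / rtt_s) * (c / sqrt(loss_rate))

# Illustrative path: 1460-byte MSS, 70 ms RTT, 0.1% packet loss
rate = mathis_throughput(1460, 0.070, 0.001)
print(f"~{rate * 8 / 1e6:.1f} Mbit/s ceiling")
```

Even a "small" 0.1% loss rate -- a bad cable, a lossy link -- caps this path well below what the hardware could carry, which is why lossy media ranks so high on the list above.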
The bulk of the performance issues can be attributed to the last 100 meters and the end hosts. NICs, drivers, TCP settings, application and OS configurations, port negotiation, MTU discovery -- all of these, in their default form, can easily cause serious performance degradation. And most of them can be resolved relatively easily too -- if only they could be identified and diagnosed. Networks are simply too manually driven to optimize themselves, and it takes an expert to clean up most machines to the point where they work as advertised.
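The TCP window constriction mentioned above is easy to quantify: TCP can never carry more than one window of data per round trip, so a default 64 KB window caps throughput regardless of link speed, while the bandwidth-delay product tells you the window actually needed to fill the pipe. A quick sketch with illustrative numbers:

```python
def max_tcp_rate(window_bytes, rtt_s):
    """A TCP connection cannot move data faster than one window per RTT."""
    return window_bytes / rtt_s

def required_window(link_bps, rtt_s):
    """Bandwidth-delay product: the window needed to keep the pipe full."""
    return (link_bps / 8) * rtt_s

rtt = 0.070                  # 70 ms cross-country RTT (illustrative)
default_window = 64 * 1024   # classic 64 KB window, no window scaling
print(f"64 KB window caps at {max_tcp_rate(default_window, rtt) * 8 / 1e6:.1f} Mbit/s")
print(f"Filling a 100 Mbit/s path needs a {required_window(100e6, rtt) / 1024:.0f} KB window")
```

On these numbers the default window leaves more than 90% of a fast link idle -- exactly the kind of gap a wizard closes in minutes and an average user never notices.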
How to close the wizard gap?
Let's assume several things:
- You have limited time, resources, and expertise,
- You want maximum performance from your network, particularly the critical high-performance aspects,
- You want to not only cross the gap but also keep it narrow into the future.
There are several keys to help you do that -- but not all of them are simple or easy. Let's take a look:
Your network engineers need to exercise increasing levels of network hygiene. In particular, whatever disciplines are in place (e.g. "manually set all port speeds and duplex - no auto negotiation") need to be followed more rigorously. And as new types of issues arise, new disciplines need to be developed.
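As an example of turning such a discipline into something checkable: a duplex mismatch classically shows up as late collisions on the half-duplex side and FCS/CRC errors on the full-duplex side, concentrated under load. A rough heuristic sketch -- the counter names here are hypothetical, so map them to whatever your switch or NIC actually exposes:

```python
def looks_like_duplex_mismatch(counters):
    """Rough heuristic only: late collisions, or an elevated FCS error
    rate, are classic duplex-mismatch symptoms.  Counter names are
    illustrative, not any particular vendor's stats."""
    late = counters.get("late_collisions", 0)
    fcs = counters.get("fcs_errors", 0)
    frames = max(counters.get("rx_frames", 0), 1)
    return late > 0 or fcs / frames > 0.001

# A busy port logging late collisions is a strong hint:
print(looks_like_duplex_mismatch(
    {"late_collisions": 12, "fcs_errors": 4, "rx_frames": 100_000}))  # True
```

Codifying symptoms this way is what lets junior engineers apply a senior's diagnostic instincts consistently.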
Provide engineers with access to critical technology advantages that shorten time to resolve issues (TTR) and increase productivity and efficiency -- allowing them to do more with less. Connect them with effective sources of troubleshooting information, processes, and support. Junior-grade engineers should be provided with tool sets that give them the capabilities of seniors -- and the seniors should receive cybernetic implants!
Provide users with a means to reliably diagnose their own problems (without allowing them to create more). Properly trained, users are very sophisticated performance monitors (and there is already at least one per machine!). Implement a front-line support group that helps users resolve their own problems and keeps them from engaging the 'costly' engineers.
Intelligently invest in new technologies
Today, networks are barely past the manual stage of IT evolution, with few, if any, reliable automated behaviors, so people have to do most of the work. That's where the inefficiencies lie. However, there is a range of new technologies emerging that offers some key advantages at the network level (the new network science). By identifying which ones will directly substitute for some of your existing administrative practices (as opposed to changing your practices radically), you can make significant advances in the network with limited ripple elsewhere in your organization.
Being a wizard is a good thing – but needing one is bad. Close the gap. Peak your performance.
Chief Scientist for Apparent Networks, Loki Jorgenson, PhD, has been active in computation, physics and mathematics, scientific visualization, and simulation for over 18 years. Trained in computational physics at Queen's and McGill universities, he has published in areas as diverse as philosophy, graphics, educational technologies, statistical mechanics, logic, and number theory. He is also an Adjunct Professor of Mathematics at Simon Fraser University, where he co-founded the Center for Experimental and Constructive Mathematics (CECM). He has headed research in numerous academic projects from high-performance computing to digital publishing, working closely with private sector partners and government. At Apparent Networks Inc., Jorgenson leads network research in high performance, wireless, VoIP, and other application performance, typically through practical collaboration with academic organizations and other thought leaders such as BCnet, Texas A&M, CANARIE, and Internet2. www.apparentnetworks.com