This morning I pulled off the slow-grinding freeway for coffee, as is my habit. Sure, a grandiose double-half free-trade caramel sustainable mocha drizzle non-fat with a shot is swell and all, but a small coffee with a little half-and-half for a buck works just fine for me. Pulling into the drive-through line, I was shocked by how fast it was moving, and rounding the corner to the order window I could see that about half the cars were pausing for just a moment -- then launching away like F-18 fighter jets off the USS Nimitz. Then, when it was my turn, came the disembodied voice: "Welcome to McDonald's. We're only taking cash today; our credit card machine is broken. May I help you?"
What followed was a great real-world reminder of the criticality of our WANs and the real bottom-line effects of WAN failure.
Losing way more than pennies
For once, I actually had more than $5 in my wallet and pulled around to get my coffee and my change -- old-school mode; there were pennies involved and everything. In my regular coffee ceremony, I pull to the side of the parking lot, scan my email and appointments, collect my thoughts for the day and let the rush hour traffic die down a little.
This morning, however, my monitoring tendencies got the better of me and I started measuring the ratio of disappointed-customer-pedal-mashing launch-aways to the happily-order-drive-off-contentedly crowd. About three cars in five pulled away. That meant 60% of McDonald's key morning business was evaporating from this particular store. What's more, the cashier had told me the network "was down" for all the stores in this particular franchise. So, extrapolated to the franchise or perhaps even regional level, I was watching a big-name brand lose real money for every second of network downtime.
Forget Target's half-billion-dollar special excursion into firewall hell; this was more painful to watch! It was like sitting in the airport when an airline's reservation system goes offline and $100 million per minute flies off into the sunset. The difference is that airline passenger service systems (PSS) go down so rarely that it makes the news when they do. Sabre once went six years without an outage of more than a couple of minutes.
Unfortunately, WAN failures like the one I was watching happen all the time, and we just accept them as part of business. Unlike a PSS, a WAN has millions of points of failure, and monitoring them all can be a real challenge or, we fear, expensive, so there's a tendency to just cross our fingers and hope. But with modern interconnected services, hope just doesn't cut it anymore.
Every link can affect the bottom line
It's easy to focus our availability monitoring and alerting on the irreplaceable umbilicals upon which the business depends -- like the VPN links between the cloud and on-premises racks in a hybrid cloud, or the software-as-a-service (SaaS) options that connect to remote backup and the like.
But many businesses are increasingly distributed and there are dozens or even hundreds of other Internet links you need to know are up. Yeah, Salesforce is up from the core, but what about from your regional offices? Does every sales rep in your enterprise have what they need to hit their numbers? How about the healthcare collections desk, the upstream supplier just-in-time inventory tracking system and the point of sale PC? The list could go on.
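To make the "up from the core, but what about from your regional offices?" question concrete, here is a minimal sketch of a reachability check you could run from each remote site. It only tests whether a TCP connection opens, which is far less than a real IP SLA poller measures, and every name in it (`is_reachable`, `check_links`, `down_links`, the `TARGETS` table and its example.com hosts) is a hypothetical placeholder rather than anything from a particular monitoring product.

```python
import socket
from typing import Callable, Dict, List, Tuple

# Hypothetical services a remote office depends on: label -> (host, port).
# The example.com names are placeholders, not real endpoints.
TARGETS: Dict[str, Tuple[str, int]] = {
    "crm": ("crm.example.com", 443),
    "card-processor": ("payments.example.com", 443),
    "inventory": ("inventory.example.com", 443),
}


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts alike.
        return False


def check_links(
    targets: Dict[str, Tuple[str, int]],
    probe: Callable[[str, int, float], bool] = is_reachable,
    timeout: float = 3.0,
) -> Dict[str, bool]:
    """Probe every target once and map its label to up (True) or down (False)."""
    return {
        name: probe(host, port, timeout)
        for name, (host, port) in targets.items()
    }


def down_links(results: Dict[str, bool]) -> List[str]:
    """Return the labels of all targets that failed the probe, sorted."""
    return sorted(name for name, up in results.items() if not up)
```

Calling `down_links(check_links(TARGETS))` on a schedule from the point-of-sale side of the link, and alerting when the list is non-empty, is the assumed usage. The point of running it at the edge is that a probe from the core can pass while the store's own link is dead.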
Every day, critical endpoint tasks are migrating to our networks at the same time the services they depend on are moving out onto the Internet as SaaS or into the cloud. We have a tendency to forget that WANs are decades younger than LANs, and it can be easy to think of monitoring them only after the LAN is well instrumented.
The takeaways this morning were twofold. First, the WAN isn't just critical for enterprise inter-campus or core-to-cloud links; it's also critical for the final delivery of even the most mundane commodity product, like fast food. Its failure had an immediate impact on revenue, with on-hand perishable supply losses as a special bonus. Second, the WAN can give a network admin a bad day in a hurry. Somewhere in that long line of cars creeping along the freeway was probably an admin rolling into what he expected would be a normal day. Then his phone blew up, with the angry franchise owner on the other end already well into DEFCON 1 and getting worse with every passing minute his registers were down.
I sighed and toasted the beleaguered, anonymous troubleshooter with my last sip of plain joe. I also added a to-do for today to double-check my remote IP service-level-agreement pollers. I recommend you do the same.
About the author:
Patrick Hubbard is a head geek and senior technical product marketing manager at SolarWinds. With 20 years of technical expertise and IT customer perspective, his networking management experience includes work with campus, data center, storage networks, VoIP and virtualization, with a focus on application and service delivery in both Fortune 500 companies and startups in high tech, transportation, financial services and telecom industries. He can be reached at Patrick.Hubbard@solarwinds.com.