Last week, despite my precarious perch on a ladder, I was smiling while silhouetted against another electric-blue, central Texas sky. A small flock of kids giggled and splashed in the crystal-clear pool below while I cantilevered, with boom-trimmer in hand, to shape a final palm frond. Then came the faint slip, followed by a slow-motion 10-foot fall to the deck, an improbable football-esque bounce and a sickening plop.
Fortunately, it wasn't me who fell. Instead, it was my trusty iPhone, which was now sunk like the Titanic into the deep end of the pool. My smartphone -- beloved, expensive and packed with critical electronics -- gone.
Or was it? Amazingly, I -- and it -- was spared. After a little bit of drying and some cleaning, the gadget rebooted as if nothing ever happened. Indeed, my iPhone was saved by convergence -- the same infrastructure convergence remedy that keeps thousands of data center headaches from erupting every day.
Data center thinness is where it's at
Not to ignite a flame war, but the iPhone 6 might be the single greatest gizmo I've ever owned. Yes, I love my Fluke meters; the drone is pretty fun; and you'll have to pry my home 802.11ac Aironets from my cold, dead hands. But the phone is special. Maybe it's that I jumped from the 4S, skipping the 5/5S; maybe I'm just happy to finally have a screen as big as an Android phone, or maybe it's the design. More likely it's that it never, ever causes trouble. Yes, super geek me carries iOS specifically because I don't have to think about it. (Android fans would say I don't get to think about it, and they'd have a valid point.)
As senior network engineers, we're also pretty much over the fun of emergency outage repair and constant fiddling that was once the hallmark of too many careers. In part, we've eliminated most of the avoidable issues with improved planning, new vendor features, and proactive network monitoring and management. But increasingly, one specific change in our data center infrastructures is actually dramatically reducing hardware outages even if it creates new complexity. That change is the accelerating convergence of infrastructure.
I got 99 problems but a switch ain't one
For an example of the availability benefit of infrastructure convergence, look no further than the lowly top-of-rack (ToR) switch. Once upon a time, piles of on-metal servers struggled to interconnect. ToRs were the standard approach, interwoven by a few expensive high-speed links to aggregation switches. In modern data centers, however, 10 GbE and 40 GbE ports are common, and ToR switches have largely been replaced: first by end-of-row switches and then directly by large, multi-module aggregation chassis. That may mean more interconnect cables, but the cabling and ports are more reliable than the network boxes the cable replaced.
With the exception of highly modular racks seen in containerized data center topologies, we've converged perhaps dozens of ToR devices into single elements. As a result, service failure rates have decreased. That may seem counterintuitive at first glance; after all, convergence decreases parallelism and, thus, would appear to create single points of failure that could generate larger potential impacts. The reality, however, is that most aggregation switches were already single points of failure. Raising their profiles as highly converged infrastructure has driven vendors to improve reliability. Moreover, fewer boxes mean fewer failures in general, however the gear is distributed. Even better, it means less admin twiddling across siloed configs.
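For the arithmetic-minded, the "fewer boxes, fewer failures" argument can be sketched in a few lines of Python. This is a back-of-the-envelope illustration, not a measurement: the device counts and annual failure rates below are made-up assumptions, the function names are hypothetical, and the model assumes every box fails independently at the same rate.

```python
# Rough sketch: how device count drives the number of hardware failures.
# All figures are illustrative assumptions, not vendor or field data.

def expected_failures_per_year(device_count: int, annual_failure_rate: float) -> float:
    """Expected number of box failures per year, assuming independent devices."""
    return device_count * annual_failure_rate


def prob_at_least_one_failure(device_count: int, annual_failure_rate: float) -> float:
    """Probability that at least one device fails during the year."""
    return 1.0 - (1.0 - annual_failure_rate) ** device_count


if __name__ == "__main__":
    # Assumed: 40 ToR switches at a 5% annual failure rate each,
    # versus 2 aggregation chassis at a 2% annual failure rate each.
    scenarios = {
        "40 ToR switches": (40, 0.05),
        "2 aggregation chassis": (2, 0.02),
    }
    for name, (count, rate) in scenarios.items():
        expected = expected_failures_per_year(count, rate)
        p_any = prob_at_least_one_failure(count, rate)
        print(f"{name}: expected failures/yr = {expected:.2f}, "
              f"P(at least one) = {p_any:.1%}")
```

The sketch deliberately ignores blast radius: a failed chassis takes down far more than a failed ToR switch, which is why the redundancy and vendor reliability improvements mentioned above matter so much.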
For system admins, the infrastructure convergence-reliability benefit is even more pronounced. With the move to virtualization, we've hugely decreased the number of chassis, power supplies, memory sticks and other janky bits that downed applications mercilessly. Again, 100 virtual machines on a handful of hosts seem like a recipe for single-point disaster. But an individual chassis is ever more fault-tolerant to such discrete component failures as fans and memory. Storage area networks, meanwhile, made storage more reliable. Finally, dramatic reductions in the number of physical servers now let us fund true resiliency in the form of active-active standby and disaster recovery.
Convergence microcosm in a handheld device
Looking at converged infrastructure's next evolution, history suggests that increasing convergence equals increased availability, assuming it's done correctly. That's exactly what happened with consumer devices like my iPhone. In 2015, fewer than 20% of iPhone deaths will be by drowning -- in part because the number of interconnections between components in an iPhone or Samsung Galaxy is a fraction of what it was even two generations ago. There just aren't many places for the water to get in any longer. Even the glass and screen are laminated into airtight sandwiches for thinness. Correspondingly, in our data centers, there's just less and less to break or fat-finger.
Geeks are finding that in technology emergencies, real service disasters are less likely with converged gear. In the case of my wet iPhone, I attacked every port with the Shop-Vac and then threw it in a bag of rice. Some 24 hours later, even after a full minute submerged at eight feet under water, the iPhone booted and worked again … perfectly -- unbelievable resilience, stemming largely from fewer parts.
Of course, shortly thereafter the pragmatic admin in me took over. Once I considered the long-term possibility of corrosion, I backed up the device, drove straight to the nearest Apple store and exchanged that iPhone for a new one.