So, you've decided to move ahead with building, or transitioning to, a software-defined data center, or SDDC. At this point, walking down the hot aisle of your existing data center may seem like an exercise in frustration. What should you do with your existing equipment -- and the applications running on it? The answer -- as with most things in information technology -- is, "It depends."
There are two basic models to consider when moving data centers: running the old and new in parallel during some form of transition phase, or integrating your existing equipment with the newer SDDC. The second, integrating existing equipment within a single data center fabric, is not one, but two answers: integrating at the pod level, or running the SDDC over the top of existing equipment.
The first answer, running the old and new data centers in parallel, may seem like the simpler -- even ideal -- case. But even in the ideal case, there are issues to sort out. Will workloads be transitioned between the data centers? And if so, how will this take place? Much of the answer to this question is going to depend on the applications themselves, of course. There are several important questions to ask in this area.
How well will the new data center fabric meet the requirements for each specific application? It's important to weigh the usual concerns, such as bandwidth utilization and delay and jitter requirements. But it's also important to consider the services applications may depend on, such as the Domain Name System (DNS), dynamic management of elephant flows, the creation of security zones, overlay networks and other factors.
While a lot of problems related to services offered by the fabric can -- and should -- be avoided in the design phase, there will always be some that are missed. No inventory will ever be complete, if only because few application owners know all the services their applications rely on, and some will make invalid assumptions while performing the inventory. For these situations, a clear action plan needs to be in place from Day 1 of the move.
It's easy to assume every service can be supplied on the new fabric, but using this as a planning baseline will often lead to very bad results. It's better to assume application owners will need to modify or update applications to work around some of these problems, rather than throwing the entire weight of the problem on the network engineering team.
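The inventory comparison described above can be sketched in a few lines of Python. This is only an illustration: the service names, application names and the `find_service_gaps` helper are assumptions made for the example, not part of any SDDC product.

```python
# Hypothetical inventory data; every name here is an illustrative assumption.
FABRIC_SERVICES = {"dns", "overlay", "security-zones"}

APP_REQUIREMENTS = {
    "billing": {"dns", "security-zones"},
    "analytics": {"dns", "elephant-flow-mgmt"},
    "intranet": {"dns"},
}


def find_service_gaps(app_requirements, fabric_services):
    """Return {app: sorted list of required services the new fabric lacks}."""
    gaps = {}
    for app, needed in app_requirements.items():
        missing = needed - fabric_services  # set difference
        if missing:
            gaps[app] = sorted(missing)
    return gaps


if __name__ == "__main__":
    gaps = find_service_gaps(APP_REQUIREMENTS, FABRIC_SERVICES)
    for app, missing in gaps.items():
        print(f"{app}: missing {', '.join(missing)}")
```

A report like this is only as good as the inventory behind it, which is why each listed gap still needs an agreed owner: either the fabric gains the service or the application changes.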
Applications will determine how data centers are linked
While the two fabrics are running in parallel, there will need to be some form of connectivity between them -- a data center interconnect (DCI). Application requirements are going to determine some of what this DCI looks like, such as whether it needs to be an Ethernet (Layer 2) connection or, alternatively, a simpler-to-support IP, or routed, connection. The challenges here are similar to the DCI challenges facing any other pair of data centers, with the added restriction of what the SDDC system will support and expect.
The second solution, integrating the SDDC and existing equipment at the pod level, presents a different set of challenges. The idea is illustrated below.
If there is no need to connect data center fabrics for resilience -- not likely in most modern networks -- this type of solution can remove one challenge from the list above: DCI. Another advantage is it allows you to use canaries -- that is, small-scale trial migrations -- to test your SDDC design approach for individual applications over time. In this situation, a canary would involve running the two infrastructures in parallel, moving applications from the legacy foundation to the SDDC to evaluate them, and leaving them there if they appear to run correctly in the new environment. This is actually how most hyperscale and web-scale operators transition to new infrastructures.
However, it adds a new element of complexity to consider: How will the SDDC control plane interact with the existing control plane? Somehow, traffic must be drawn from the newer SDDC pods into the legacy hardware and back again. If there are few traffic engineering, security and other policy requirements, this might be as simple as just redistributing routing information between the two control planes. If moving data centers requires the inclusion of security zones that cross the two domains, or some form of dynamic traffic shaping, the problems here can be very complex. The most likely situation is some form of redistribution combined with manual or automated tuning along the edges between the two operational zones.
Such arrangements tend to start simply, but they also tend to end up complex, consuming more resources than anticipated. It's best, if this is the chosen migration path, to push applications from one environment to the other as a set. This approach reduces the depth and breadth of the interaction surface between the two environments.
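One way to read "as a set" is to group applications that talk to each other, so each group crosses the boundary together. The sketch below does that by finding connected groups in a dependency list. The application names and the `migration_sets` helper are hypothetical, and real dependency data would come from your own flow records or inventory.

```python
from collections import defaultdict


def migration_sets(apps, dependencies):
    """Group applications that should move together.

    apps: iterable of application names.
    dependencies: iterable of (a, b) pairs meaning a and b communicate.
    Returns a list of sorted lists, one per connected group.
    """
    neighbors = defaultdict(set)
    for a, b in dependencies:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    groups = []
    for app in sorted(apps):
        if app in seen:
            continue
        # Walk everything reachable from this application.
        stack, group = [app], []
        seen.add(app)
        while stack:
            current = stack.pop()
            group.append(current)
            for peer in neighbors[current]:
                if peer not in seen:
                    seen.add(peer)
                    stack.append(peer)
        groups.append(sorted(group))
    return groups


if __name__ == "__main__":
    # Illustrative data only.
    apps = ["web", "api", "db", "batch", "reports"]
    deps = [("web", "api"), ("api", "db"), ("batch", "reports")]
    for group in migration_sets(apps, deps):
        print(group)
```

Moving each group in one step keeps chatty application pairs from straddling the two environments, which is where the redistribution and policy tuning described above tends to pile up.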
Running the SDDC as overlay network
The final option, mentioned at the outset, is to run the SDDC as an overlay on top of existing equipment. This is probably the most common tactic sold by SDDC vendors, as it allows the SDDC to consume the existing equipment into its control and management planes. This, too, can appear to be a simple answer, but complexity can creep in very quickly.
The general idea is to use the power of the SDDC to replace legacy equipment with new gear over time, using the capabilities of existing equipment as a physical layer for the SDDC. This situation should be no different than the normal lifecycling of equipment over time in an SDDC environment. To that end, the same tools and processes should be applicable from Day 1 until the day the legacy equipment the SDDC is replacing is removed from service. But the initial equipment mix cannot be as good a match for the requirements of the SDDC as any future purchases, potentially leading to several problems.
At the physical layer, will the equipment support the southbound interfaces required by the SDDC? For instance, if the SDDC requires OpenFlow support at a certain level, such as 1.3, to operate properly, does all the existing legacy equipment support this level of operation? If the vendor claims support, has it been tested? To know for certain, all the existing equipment must be revalidated for operation in the new environment.
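A revalidation pass over the device inventory might start with something like the following sketch. The device records, the `tested` flag and the three status labels are assumptions for illustration. The sketch also treats any version at or above the required one as acceptable, which is a simplification you would need to confirm against your SDDC vendor's actual requirements.

```python
REQUIRED_OPENFLOW = (1, 3)  # example requirement taken from the text above

# Hypothetical inventory; "tested" means validated in your own lab,
# not merely claimed on a data sheet.
DEVICES = [
    {"name": "leaf-01", "openflow": "1.3", "tested": True},
    {"name": "leaf-02", "openflow": "1.0", "tested": True},
    {"name": "spine-01", "openflow": "1.4", "tested": False},
    {"name": "fw-01", "openflow": None, "tested": False},
]


def parse_version(text):
    """Turn a version string such as '1.3' into a comparable tuple (1, 3)."""
    return tuple(int(part) for part in text.split("."))


def classify(device, required=REQUIRED_OPENFLOW):
    """Label a device as unsupported, claimed-untested or validated."""
    version = device.get("openflow")
    if version is None or parse_version(version) < required:
        return "unsupported"
    if not device.get("tested"):
        return "claimed-untested"
    return "validated"


if __name__ == "__main__":
    for device in DEVICES:
        print(f"{device['name']}: {classify(device)}")
```

Anything that lands in the claimed-untested bucket is exactly the equipment the paragraph above warns about: support on paper that has not yet been proven in the new environment.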
At the control plane, how will the SDDC overlay interact with existing control planes that tie the equipment together and draw traffic from one part of the fabric to another? Can all the features of the existing control plane -- features that tools and capabilities have been built around -- be integrated into the SDDC overlay? This is a more difficult issue to resolve if the existing control plane is some sort of fabric overlay designed to provide an API into the network, rather than a collection of devices running a more traditional distributed protocol -- such as IS-IS or Border Gateway Protocol.
Management approaches add to complexity
The problems multiply when moving from the control plane to the management plane. Each device in the existing network is designed to be managed in a specific way. Some may only have management information base (MIB) interfaces; others may only have command-line interfaces; others may have RESTful interfaces using a set of YANG models; and still others might be best managed through a gRPC interface.
Can the SDDC draw information from, and push configuration to, this wide array of interfaces across all devices? What pieces of telemetry might you gain, and what will you lose? This is another area that calls for extensive testing and validation, especially against future requirements. Never count on "the hardware will be replaced before we need that function" as an out. Think long and hard about where your applications may bump up against the walls of limited functionality in the future, and what that means for your business.
A parallel concern is the ability to troubleshoot and resolve problems quickly -- the mean time to repair a network is directly related to overall availability, a crucial measure of the network's effectiveness at supporting the business. Telemetry, in this context, allows you to see the condition of the network, in order to resolve problems before they affect operations, and to quickly find problems that are affecting operations. It is important to examine current processes used to quickly restore services against the capabilities of the SDDC overlay to determine where there might be any gaps.
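The link between repair time and availability can be made concrete with the standard steady-state formula, availability = MTBF / (MTBF + MTTR). The figures below are invented for illustration and are not measurements from any real network.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability from mean time between failures and mean time to repair."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # Illustrative numbers: same failure rate, different repair times.
    slow_repair = availability(mtbf_hours=1000, mttr_hours=4)
    fast_repair = availability(mtbf_hours=1000, mttr_hours=1)
    print(f"4-hour repairs: {slow_repair:.2%}")  # 99.60%
    print(f"1-hour repairs: {fast_repair:.2%}")  # 99.90%
```

With the failure rate held constant, cutting repair time is the only lever in this formula, which is why any telemetry or troubleshooting capability lost in the overlay shows up directly in availability.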
Perhaps the one piece of legacy gear that will be the most difficult to manage through an SDDC transition is the appliance-based firewall. These firewalls are widely deployed to create security zones within a fabric, and to separate zones inside the fabric from zones outside it. Overlaying an SDDC on top of existing equipment will challenge them with tunneling encapsulations, dynamic policies and other issues that will be difficult to solve. In the overlay model, security will need to be rethought entirely, including how security zones will be migrated from existing appliance-based firewalls to other techniques provided by the SDDC system itself.
Moving data centers to an SDDC can result in a cleaner network over time, with many new options for building and managing a network at scale that meets business needs. The intermediate steps required to transition existing equipment to the SDDC environment, however, can be complex. Network operators need to consider these challenges when moving data centers and plan around them carefully.