This tip is the second in a four-part series on things network engineers need to know to be effective. Part one covered what network engineers need to know about technology basics.
One of the most critical things a network engineer needs to know is how to make changes, because, well, changes are what network engineers do, and most of the networks you'll work on are not greenfield sites where you can play at your leisure. You'll have lots of hardware already in place and you'll have to work around it. Sometimes this is easy, but certain things can make it very complex, such as very old critical hardware you can't replace, or having to make your changes to a live network with no disruption to the users. In this article, we'll survey some of the decisions you need to make and how they can help or hurt your chances of success.
The first thing to do when planning a change is to identify all the tasks that need to be performed in order to complete the change and then divide them into two groups: things that can be performed before your "change window" and things that must be performed during the actual change. The term "staging" generally means the act of doing all the activities in the first group.
Suppose your change is installing a new Ethernet switch to replace an existing one, and your organization has given you one hour at midnight on Saturday to shut down the network and complete the change. Things you could do ahead of time might include unpacking the new switch, mounting it in a rack, labeling all the ports and wires, plugging in the electrical cable, and configuring it. During the window, you'd do all the things that are disruptive, such as moving the cables, testing, and cleaning up.
Obviously, the more activities you can complete ahead of time, the less risk you'll run during the change. And if you do encounter a problem, effective staging gives you more time to solve it.
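As a rough illustration, the split between staging and change-window tasks can be sketched in a few lines of Python. The task list mirrors the switch replacement example above; the Task class, the "disruptive" flag and the split_tasks helper are hypothetical names made up for this sketch, not part of any tool.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    disruptive: bool  # True if the task interrupts service and must wait for the window


def split_tasks(tasks):
    """Return (staging, window) lists of task names, preserving the original order."""
    staging = [t.name for t in tasks if not t.disruptive]
    window = [t.name for t in tasks if t.disruptive]
    return staging, window


# Task list taken from the switch replacement example; the flags are a judgment call.
SWITCH_REPLACEMENT = [
    Task("unpack the new switch", False),
    Task("mount it in the rack", False),
    Task("label ports and wires", False),
    Task("plug in the electrical cable", False),
    Task("configure the new switch", False),
    Task("move the cables", True),
    Task("test", True),
    Task("clean up", True),
]

staging, window = split_tasks(SWITCH_REPLACEMENT)
print("Before the window:", staging)
print("During the window:", window)
```

The shorter the second list, the less you have to get right while the clock is running.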
Testing and backout plan
One of the cleanup activities you'll need to complete after making a change might be unplugging the old switch and removing it from the racks. It's good to keep your data centers and wiring closets as clean and simple as possible, but in many cases, it's better to leave the old box in for a while as part of your backout plan.
For every change you make, you should have a test plan and a backout plan. The test plan in the example above might be as simple as plugging in your laptop and pinging a server to verify connectivity. Most changes are much more complex, however, and usually involve many steps. Your chances of success are much greater if you perform several simple tests along the way, rather than waiting until you think you're done and discovering that something doesn't work.
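One way to picture "several simple tests along the way" is a change plan in which every step carries its own check. The sketch below is only an illustration: run_change, the step names and the simulated state dictionary are assumptions invented here, and a real check would be something like a ping from your laptop rather than a dictionary lookup.

```python
def run_change(steps):
    """Run (name, action, check) steps in order, testing after each one.

    Returns (completed, failed): the names of the steps whose checks passed,
    and the name of the first step whose check failed (None if all passed).
    """
    completed = []
    for name, action, check in steps:
        action()
        if not check():
            return completed, name  # stop at the first failed test
        completed.append(name)
    return completed, None


# Simulated network state standing in for real connectivity tests (assumption).
state = {"uplink": False, "server_cable": False}


def move_uplink():
    state["uplink"] = True


def move_server_cable():
    pass  # simulates a cable that did not seat properly


steps = [
    ("move uplink", move_uplink, lambda: state["uplink"]),
    ("move server cable", move_server_cable, lambda: state["server_cable"]),
    ("move user cables", lambda: None, lambda: True),
]

completed, failed = run_change(steps)
print("Completed:", completed)
print("Failed at:", failed)
```

Because the run stops at the first failed check, you find the problem one step after it was introduced, with only that step and the ones before it to undo.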
This is where the backout plan comes in. When the change goes sideways for whatever reason, you need to be able to put the network back as it was, restoring service to the users, until you can figure out what went wrong and what to do about it. In theory, the backout plan can be as simple as your change steps listed in reverse, but the important thing to understand is when to back out. This is why your testing and backout plans go together. If you have an hour, and it takes you 40 minutes to make your change and 10 minutes to figure out that it didn't work and you can't fix it, you're left with only 10 minutes to undo something that took 40 minutes to do in the first place. Math like that can be career-limiting.
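The arithmetic above is worth working out before the window opens. The following sketch uses the numbers from the example and assumes, as a simplification, that backing out takes as long as the change itself (40 minutes). The function name and the optional safety margin are hypothetical additions for illustration.

```python
def latest_decision_minute(window_min, backout_min, margin_min=0):
    """Latest minute into the window at which a backout can still finish on time."""
    return window_min - backout_min - margin_min


WINDOW = 60    # length of the change window, in minutes
CHANGE = 40    # time taken to make the change
DIAGNOSE = 10  # time spent discovering that it didn't work
BACKOUT = 40   # assumption: undoing takes as long as doing

decision_minute = latest_decision_minute(WINDOW, BACKOUT)
time_left = WINDOW - (CHANGE + DIAGNOSE)
shortfall = BACKOUT - time_left

print("Decide to back out no later than minute", decision_minute)  # 20
print("Minutes left after change and diagnosis:", time_left)       # 10
print("Minutes the backout overruns the window:", shortfall)       # 30
```

Under these assumptions the decision point falls at minute 20, well before the change is even finished. That is the argument for testing along the way: a problem caught early leaves less to undo and more time to undo it.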
Local or remote changes
Another challenge for network administrators is making changes in remote locations. This is inevitable, because you can't be on both ends of a WAN circuit at the same time, and in the current business climate, having extra people to help you is a rare luxury. So, you need to think about how you're going to make changes remotely and do testing remotely. More important, you need to analyze the steps in your change plan thoroughly to make sure that none of them results in a lack of connectivity to your remote components. For instance, if you're Telnetting to a remote router and make a change to the interface you're connected to, you may disable the interface and prevent yourself from reconnecting to fix it. For issues like this, it's best to have an out-of-band method of access, such as a modem connected to the console port, or the name of someone on-site who can help you quickly.
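A simple desk check for this kind of problem is to list which interfaces each step touches and compare them with the interface your remote session arrives on. The sketch below is illustrative only: the plan, the interface names and the steps_touching helper are all hypothetical, and a real plan would need the same review for routing, access lists and anything else in your management path.

```python
def steps_touching(steps, mgmt_interface):
    """Return descriptions of steps that modify the interface carrying your session."""
    return [desc for desc, interfaces in steps if mgmt_interface in interfaces]


# Hypothetical change plan: (description, set of interfaces the step modifies).
PLAN = [
    ("update the description on the LAN port", {"FastEthernet0/1"}),
    ("change the IP address on the WAN link", {"Serial0/0"}),
    ("add a static route", set()),
]

# Assumption: the remote session reaches the router over this interface.
MGMT_INTERFACE = "Serial0/0"

risky = steps_touching(PLAN, MGMT_INTERFACE)
for desc in risky:
    print("Needs out-of-band access or on-site help:", desc)
```

Any step the check flags is one to schedule only when the modem on the console port has been tested or your on-site contact is standing by.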
Remember: Do as much as you can ahead of time, identify your "go/no-go" times so you can watch the clock during the change, and make sure you have the tools and assistance you need to do remote configuration.
About the author:
Tom Lancaster, CCIE# 8829 CNX# 1105, is a consultant with 15 years of experience in the networking industry. He is co-author of several books on networking, most recently, CCSP: Secure PIX and Secure VPN Study Guide, published by Sybex.