In my last tip on redundant WAN routers, I discussed the pros and cons of using one or two routers to connect a pair of WAN circuits. The two alternatives -- both circuits plugged into the same router or one router for each circuit -- are shown below in figures 1 and 2, respectively. In this tip, I go into some more detail about the actual routing in figure 2 and discuss a common failure scenario that is somewhat difficult to defend against: a segmented network.
Figure 1 shows both circuits plugged into the same router.
Figure 2 shows one router for each circuit.
Let's consider figure 2 in more detail. This is a logical view of the network at a site that shows a simplified example with a single Ethernet segment and IP subnet on the LAN. Your network is probably much more complex, with multiple virtual LANs (VLANs) and probably even Layer 3 switches in the LAN, but this is sufficient to illustrate the concept. Because it is a logical view, however, it doesn't show how many switches are used or how they are connected.
Figure 3 shows two routers used with a single switch.
This raises the question, "If I have redundant routers, should I also have redundant switches?" In other words, I could build my network like figure 3, where I have only a single switch, even though it may be a very large and expensive switch like a Cisco 6509 that may be doing routing. Or I might build my LAN like figure 4, where I have two switches that are both in the same VLAN and IP subnet.
Figure 4 shows two switches in the same VLAN and IP subnet.
There are many other alternatives, such as connecting both routers to both switches or placing the switches in separate subnets, but the scenario I want to explore in greater detail is the one shown in figure 4. For this discussion, let's assume that the IP subnet is 10.1.1.0/24 and that this prefix is being advertised from both routers toward the rest of our network. Let's further assume that we have users connected to both switches 1 and 2, and that routers 1 and 2 are using HSRP or VRRP to share the default gateway for the users.
So the potential failures are:
- either WAN circuit
- either WAN router (R1 or R2)
- either switch (Sw1 or Sw2), or
- any of the three Ethernet cables connecting the routers and switches.
In the case of a WAN circuit or router failure, the other circuit or router will take over without difficulty. In the case of a switch failure, the users connected to that switch would experience an outage, but the other users would be fine. And if either of the two router-to-switch cables fails, the other router would take over.
In the case of the cable failing between the two switches (as shown in figure 5), however, we can get into trouble. When this happens, the Ethernet ports on both routers are still alive and communicating, just not with each other. This means after a few missed HSRP "hellos," both routers will become active (as opposed to standby). You might think that isn't too bad, since it looks as if both switches have a live connection to the rest of the network, but the problem is actually worse than the ones above because the IP subnet is now segmented.
Figure 5 shows a cable failure between the two switches.
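The dual-active behavior can be sketched with a toy model. The following is a minimal simulation, not a real HSRP implementation: it assumes the common HSRP default timers (3-second hellos, 10-second hold time), and the function name `simulate` and its parameters are hypothetical. Router 1 never stops being active because its own interface stays up; router 2 promotes itself once the hold time passes without a hello.

```python
# Simplified model of an HSRP-like standby router: it promotes itself to
# active once no hello has arrived from the active peer for HOLD_TIME seconds.
HELLO_INTERVAL = 3   # seconds between hellos (assumed default)
HOLD_TIME = 10       # seconds without a hello before taking over (assumed default)


def simulate(link_fails_at, duration):
    """Return a list of (time, r1_state, r2_state), sampled once per second."""
    r2_state = "standby"
    last_hello_seen = 0
    timeline = []
    for t in range(duration + 1):
        link_up = t < link_fails_at
        # R1 sends a hello every HELLO_INTERVAL seconds; R2 hears it only
        # while the inter-switch cable is intact.
        if link_up and t % HELLO_INTERVAL == 0:
            last_hello_seen = t
        if r2_state == "standby" and t - last_hello_seen >= HOLD_TIME:
            r2_state = "active"
        # R1's LAN port never goes down, so it has no reason to step down.
        timeline.append((t, "active", r2_state))
    return timeline


timeline = simulate(link_fails_at=10, duration=30)
takeover = next(t for t, _, r2 in timeline if r2 == "active")
print(f"R2 becomes active at t={takeover}s; R1 is still active too")
```

Neither router sees an interface go down, so nothing in this model (or in the scenario above) tells either one that it should stop answering for the shared gateway address.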
For instance, a PC on one switch cannot talk to a server or PC on the other switch. The only remaining physical path between them runs over the WAN, and a router will not forward traffic destined for a directly connected subnet out a WAN interface. Further, if router 2 is the backup router and a PC on switch 2 sends a packet over the WAN through router 2, the reply will probably follow the preferred route back to the 10.1.1.0/24 subnet, which is advertised by the primary router 1. That is, return traffic goes to the wrong router, which will transmit it onto switch 1, where the PC cannot be reached, until its ARP cache entry times out.
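To see why the router never tries the WAN for a local destination, here is a rough sketch of a longest-prefix-match lookup using Python's standard `ipaddress` module. The routing table, the default route and the two destination addresses are assumptions made up for illustration; only the 10.1.1.0/24 subnet comes from the scenario.

```python
import ipaddress

# Hypothetical routing table for router 2 after the inter-switch cable fails.
# The connected LAN route is still present because the LAN port is still up.
ROUTES = [
    (ipaddress.ip_network("10.1.1.0/24"), "connected, LAN interface"),
    (ipaddress.ip_network("0.0.0.0/0"), "via WAN circuit"),
]


def lookup(destination):
    """Longest-prefix match: the most specific matching route wins."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    return max(matches, key=lambda m: m[0].prefixlen)[1]


print(lookup("10.1.1.50"))   # a server on the other switch (hypothetical address)
print(lookup("192.0.2.10"))  # a host elsewhere in the network (hypothetical address)
```

Because the /24 is more specific than the default route, anything addressed to the local subnet is sent out the LAN interface, even when the host it belongs to now sits on the far side of the break.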
At this point, you're in about the same boat as you would be if you'd lost an entire switch. But if, per our last discussion, you configured routers 1 and 2 to load-balance traffic across both circuits, then you're really going to have a rough time. What will most likely happen now is that traffic from PCs on both switches leaves the site just fine via the only route it has, but the return traffic is load-balanced across both circuits, so roughly half of it arrives at the router on the wrong side of the break and is lost.
The result will be calls from random users at the site complaining that they can get to some servers over the WAN but not others, while other users are just the opposite. And you won't be able to isolate the problem to a given switch or router. If you're not careful, these symptoms can take you on some wild goose chases.
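The pattern of complaints can be reproduced with a small model. This sketch assumes the rest of the network balances return traffic per flow; the parity rule in `return_router` is only a stand-in for whatever hash the real devices use, and the host and server names are hypothetical.

```python
# Which switch each PC is attached to, and which switch each router can
# still reach after the inter-switch cable is cut.
HOSTS = {"pc-a": "Sw1", "pc-b": "Sw1", "pc-c": "Sw2", "pc-d": "Sw2"}
SERVERS = ["srv-1", "srv-2", "srv-3", "srv-4"]
ROUTER_SEGMENT = {"R1": "Sw1", "R2": "Sw2"}
ROUTERS = list(ROUTER_SEGMENT)


def return_router(server_idx, host_idx):
    """Stand-in for a per-flow hash that picks the circuit for return traffic."""
    return ROUTERS[(server_idx + host_idx) % 2]


def reachable_servers():
    """Map each PC to the servers whose replies land on its own segment."""
    result = {}
    for h_idx, (host, switch) in enumerate(HOSTS.items()):
        ok = []
        for s_idx, server in enumerate(SERVERS):
            router = return_router(s_idx, h_idx)
            if ROUTER_SEGMENT[router] == switch:
                ok.append(server)
        result[host] = ok
    return result


for host, servers in reachable_servers().items():
    print(host, "on", HOSTS[host], "can reach", servers)
```

In this model every PC reaches exactly half of the servers, and two PCs on the same switch reach different halves, which is why the symptoms do not point at any one switch or router.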
Of course, in more realistic networks, there are lots of ways to prevent segmented subnets or networks. In this simplistic network, though, the easiest fix is probably to connect switches 1 and 2 with two cables, either bundled with EtherChannel or with Spanning Tree blocking one link to prevent a loop, so that if one cable is cut, the other keeps the subnet in one piece.
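The benefit of the second cable can be checked with a simple connectivity test. The sketch below treats the LAN in figure 4 as a graph of cables and asks whether every device can still reach every other after any one inter-switch cable is removed. The function names are hypothetical, and loop prevention is ignored here because only physical reachability is being tested.

```python
from collections import deque

NODES = {"R1", "R2", "Sw1", "Sw2"}
SINGLE = [("R1", "Sw1"), ("R2", "Sw2"), ("Sw1", "Sw2")]
DUAL = SINGLE + [("Sw1", "Sw2")]  # second, parallel inter-switch cable


def lan_is_connected(nodes, links):
    """Breadth-first search over the cables; True if every node is reachable."""
    start = sorted(nodes)[0]
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for a, b in links:
            if node in (a, b):
                other = b if node == a else a
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
    return seen == nodes


def survives_interswitch_cut(nodes, links):
    """True if the LAN stays in one piece after any single Sw1-Sw2 cable fails."""
    for i, link in enumerate(links):
        if link == ("Sw1", "Sw2"):
            remaining = links[:i] + links[i + 1:]
            if not lan_is_connected(nodes, remaining):
                return False
    return True


print("one cable: ", survives_interswitch_cut(NODES, SINGLE))
print("two cables:", survives_interswitch_cut(NODES, DUAL))
```

With one cable the check fails, because the cut splits the subnet into two halves; with two cables it passes, since either cable alone keeps both switches in the same segment.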
About the author:
Tom Lancaster, CCIE# 8829 CNX# 1105, is a consultant with 15 years of experience in the networking industry. He is co-author of several books on networking, most recently, CCSP: Secure PIX and Secure VPN Study Guide, published by Sybex.