Troubleshooting switching: Overcome function overlap confusion

In a previous tip, I discussed formal and informal troubleshooting methodologies using routing as an example. In this tip, I want to look at another dimension that is critical to timely troubleshooting -- a solid understanding of systems and their components, using switching as an example. An understanding of systems is critical to troubleshooting simply because you have to know what to peek and poke at, and though it's perhaps a statement of the obvious, it's becoming more important and more difficult by the day.

More specifically, many of the ways we attempt to envision today's complex networks (or explain them to non-technical parties) simply don't lend themselves to troubleshooting. Case in point: switching. In reality, your switches probably perform many very distinct functions, most of which have nothing to do with the academic definition of "switching." In order to troubleshoot your environment with a minimum of effort, you need to understand these functions -- how they work in isolation and how they interact with the other functions in the switch -- so you can eliminate the unlikely causes of failure and quickly get to the root cause.

The following list of functions that are typically built on top of basic switching is far from comprehensive, but it illustrates the point:

Addressing and routing

To know which port to forward a frame out of, the switch needs a database that maps MAC addresses to ports. This is usually called the forwarding database, or FDB (Cisco calls it the CAM table). In addition, most vendors have proprietary ways of caching this information at the port to minimize demands on the CPU. Plenty of technologies, such as Microsoft's Network Load Balancing, result in interesting behavior from your switch even when "properly" configured, like flooding frames out of all the ports (in NLB's unicast mode, for instance, the switch never learns which port the cluster's MAC address lives on).
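To make the learn-then-forward behavior concrete, here is a minimal Python sketch of a forwarding database. The class and method names (`LearningSwitch`, `receive`) are hypothetical, and the model deliberately ignores VLANs, aging timers and the vendor-specific caching mentioned above; it only shows why a frame for an unlearned address ends up on every port.

```python
from typing import Dict, List

BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy model of a switch's forwarding database (hypothetical names)."""

    def __init__(self, ports: List[int]) -> None:
        self.ports = list(ports)
        self.fdb: Dict[str, int] = {}  # MAC address -> port it was last seen on

    def receive(self, in_port: int, src_mac: str, dst_mac: str) -> List[int]:
        """Return the ports a frame arriving on in_port is sent out of."""
        # Learn (or refresh) the port behind the source address.
        self.fdb[src_mac] = in_port

        out_port = self.fdb.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Unknown or broadcast destination: flood everywhere but the ingress port.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination is on the segment the frame came from: nothing to forward.
            return []
        return [out_port]


sw = LearningSwitch([1, 2, 3, 4])
first = sw.receive(1, "aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:02")  # destination not learned yet
reply = sw.receive(2, "aa:aa:aa:aa:aa:02", "aa:aa:aa:aa:aa:01")  # destination already learned
print(first, reply)
```

The first frame is flooded because nothing has been learned about its destination; the reply goes out of a single port. Any device that never appears as a source address keeps the switch in the first case indefinitely.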

Another example is an inter-switch link made up of multiple physical channels: for every frame, the switch has to decide which link in the bundle to use. You typically have a choice of several algorithms for making that decision and, as with any choice, a suboptimal one could be something you troubleshoot later.
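As a rough illustration of why the choice of algorithm matters, the Python sketch below picks a link by hashing MAC addresses. This is an assumption-laden toy, not any vendor's actual hash: the function names (`mac_to_int`, `pick_link`) and the three modes are made up for the example.

```python
def mac_to_int(mac: str) -> int:
    """Convert 'aa:bb:cc:dd:ee:ff' into an integer."""
    return int(mac.replace(":", ""), 16)


def pick_link(src_mac: str, dst_mac: str, link_count: int, mode: str = "src-dst") -> int:
    """Return the index of the bundle member a frame would use (illustrative only)."""
    if link_count < 1:
        raise ValueError("link_count must be at least 1")
    src, dst = mac_to_int(src_mac), mac_to_int(dst_mac)
    if mode == "src":
        key = src
    elif mode == "dst":
        key = dst
    elif mode == "src-dst":
        key = src ^ dst  # mix both addresses so either one can change the result
    else:
        raise ValueError("unknown mode: " + mode)
    return key % link_count


# Eight clients talking to one server across a four-link bundle.
server = "00:00:00:00:00:10"
clients = ["00:00:00:00:00:%02x" % i for i in range(1, 9)]

by_dst = [pick_link(c, server, 4, mode="dst") for c in clients]
by_both = [pick_link(c, server, 4, mode="src-dst") for c in clients]
print(sorted(set(by_dst)), sorted(set(by_both)))
```

With destination-only hashing, every frame headed for the server lands on the same link while the other three sit idle; mixing in the source address spreads the same traffic across the bundle. A mismatch like this between traffic pattern and algorithm is the kind of suboptimal decision that surfaces later as unexplained congestion.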

Access and security

Features like IEEE 802.1X port authentication and Cisco's Port Security and Layer 2 ACLs can be extremely useful in some circumstances, but they also present dozens of new ways to misconfigure devices or introduce unexpected downtime.
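To see how a security feature can turn into downtime, consider this small Python sketch of a per-port MAC address limit. It is a simplified model under stated assumptions: the `SecurePort` class is hypothetical, and disabling the port on the first violation is only one possible policy (real switches usually offer several).

```python
from typing import Set


class SecurePort:
    """Toy model of a port that allows a limited number of source MACs."""

    def __init__(self, max_macs: int = 1) -> None:
        self.max_macs = max_macs
        self.seen: Set[str] = set()
        self.disabled = False  # assumption: a violation shuts the port down

    def admit(self, src_mac: str) -> bool:
        """Return True if a frame from src_mac is accepted on this port."""
        if self.disabled:
            return False
        if src_mac in self.seen:
            return True
        if len(self.seen) >= self.max_macs:
            # One address too many: shut the port for everyone behind it.
            self.disabled = True
            return False
        self.seen.add(src_mac)
        return True


port = SecurePort(max_macs=2)
results = [port.admit(m) for m in ["pc-a", "pc-b", "pc-a", "pc-c", "pc-a"]]
print(results)
```

In this model the third device does not just fail to connect; it takes the two previously working devices down with it, which is why the symptom reported to you may have little to do with the machine that triggered it.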

Port and switch configuration

A decade ago, switches that could automatically detect link speed and negotiate duplex were pretty spiffy. TechTarget readers are no doubt well aware of the legendary problems this caused. Today's switches have many more protocols to detect and configure links, such as UniDirectional Link Detection (UDLD), Port Aggregation Protocol (PAgP), and other negotiation methods built into specific technologies -- for example, the method built into the Power over Ethernet spec to keep it from sending power to a device that isn't expecting it. Beyond ports, components such as VLAN Trunking Protocol (VTP) can even configure other switches (their VLANs, in this case).

Loop avoidance

Of course, we can't avoid mentioning Spanning Tree Protocol (STP) and its various more recent incarnations, like Per-VLAN Spanning Tree (PVST).

The point of all this is that if you have a simple problem -- for instance, "I can't ping your PC from my PC and they're both on the same subnet, separated by several switches" -- then almost any of the components above could be the culprit:

  • There could be an Ethernet loop due to an STP issue or a misbehaving MLT/EtherChannel link.
  • Your PC may not be allowed onto the network because you didn't authenticate, or because you put a hub at your desk and connected more PCs than Port Security permits.
  • Perhaps there are far more MACs on the network than can fit into a switch's forwarding database (FDB), and the resulting flooding is causing congestion.
  • Perhaps QoS is configured to put ICMP into a low-priority queue, so regular traffic is working but I just can't ping you.

Hopefully, you've disabled all of the protocols you don't need; even then, there still could be dozens of possible causes of this problem. If you organize your thoughts about components by function, however, it will help you construct the questions and tests that ferret out the problem.
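One way to organize suspects by function is a simple checklist keyed on the categories used in this tip. The Python sketch below is only an illustration: the `CHECKLIST` contents are paraphrased from the examples above, and the `remaining_suspects` helper is a hypothetical name.

```python
from typing import Dict, List, Set

# Questions grouped by switch function, taken from the examples in this tip.
CHECKLIST: Dict[str, List[str]] = {
    "Addressing and routing": [
        "Are there more MACs on the network than the FDB can hold?",
        "Is the inter-switch link bundle forwarding as expected?",
    ],
    "Access and security": [
        "Did the PC authenticate?",
        "Has the port seen more MACs than Port Security permits?",
    ],
    "Loop avoidance": [
        "Is there an Ethernet loop caused by an STP issue?",
    ],
    "QoS": [
        "Is ICMP being placed in a low-priority queue?",
    ],
}


def remaining_suspects(ruled_out: Set[str]) -> Dict[str, List[str]]:
    """Return the functions (and their questions) not yet eliminated."""
    unknown = ruled_out - set(CHECKLIST)
    if unknown:
        raise KeyError("not in checklist: " + ", ".join(sorted(unknown)))
    return {func: qs for func, qs in CHECKLIST.items() if func not in ruled_out}


left = remaining_suspects({"Loop avoidance", "QoS"})
for func, questions in left.items():
    print(func)
    for q in questions:
        print("  -", q)
```

Ruling out a whole function at a time, rather than one protocol at a time, is what keeps the list of remaining tests short.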

Tom Lancaster, CCIE# 8829 CNX# 1105, is a consultant with 15 years of experience in the networking industry. He is co-author of several books on networking, most recently CCSP: Secure PIX and Secure VPN Study Guide, published by Sybex.

This was last published in May 2006
