This article from WindowsNetworking.com describes a structured approach for troubleshooting problems with TCP/IP networks. This is the first of a series of articles on TCP/IP troubleshooting, and future articles will focus on key issues highlighted in this article.
What do you think of when you hear the phrase "TCP/IP troubleshooting"? People who are visually imaginative may see a flowchart. More linear-minded types may see a series of numbered steps. Others (far too many) may feel a sense of inadequacy and frustration.
TCP/IP troubleshooting should be simple, right? After all, it's just a protocol -- a series of steps to transfer bits over the network. But what a protocol: four layers, with multiple protocols at each layer.
The traditional troubleshooting approach
Some years ago when I first learned about TCP/IP networking, I was taught a simple follow-these-steps approach to troubleshooting problems. The method went something like this:
I call this the "brain-dead approach" because it's so methodical you can basically turn off your brain and just follow the steps. It's also somewhat inefficient, for it automatically assumes that your problem most likely starts with your own computer and that the problem is more likely to be closer to you (your network card, your computer's IP address configuration, your local subnet) than further away (other subnets). And it's a method that was probably developed before the Internet really took off -- that is, before DNS became ubiquitous for name resolution and before firewalls and VPNs became a fact of life for most corporate networks.
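The original list of steps is not reproduced in this copy of the article. Purely as an illustration of the nearest-first ordering described above (and not the author's actual list), such a method might be sketched in Python as follows. The function names, the example addresses, and the injected `ping` callable are all hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Check = Tuple[str, str]  # (human-readable label, address to test)


def nearest_first_checks(local_ip: str, gateway: str, remote_host: str) -> List[Check]:
    """Order the targets from the local machine outward (assumed ordering)."""
    return [
        ("loopback (local TCP/IP stack)", "127.0.0.1"),
        ("own IP address (adapter configuration)", local_ip),
        ("default gateway (local subnet)", gateway),
        ("remote host (other subnets)", remote_host),
    ]


def first_failure(checks: List[Check], ping: Callable[[str], bool]) -> Optional[str]:
    """Run the checks in order and return the label of the first one that fails."""
    for label, target in checks:
        if not ping(target):
            return label
    return None  # every check passed


# Example with a stand-in ping that pretends the gateway is down.
checks = nearest_first_checks("192.168.1.20", "192.168.1.1", "10.0.5.9")
print(first_failure(checks, lambda target: target != "192.168.1.1"))
```

Passing `ping` in as a parameter keeps the sketch independent of any particular ping tool. In practice it would wrap whatever reachability test is available on the machine.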
What I mean is this: one of your users says "I can't connect to the server right now." What could be the problem? It helps to dissect this simple sentence to understand the issues that may be involved. For example:
"I can't…"
Is this the only user who has called in reporting network problems? If there are others, do they have similar issues? If so, then right away it's clear you don't need to take a brain-dead approach and begin your troubleshooting at the user's computer. Instead, the issue is most likely "out there" somewhere: maybe your DNS server is offline or your DNS provider's services are experiencing difficulty. Or maybe a router on your internal network is going crazy and dropping packets. Or maybe the server your users are trying to connect to has crashed.
You should also stop and think about any commonalities these users may have. For example, are their machines all on the same subnet? If so, then maybe the default gateway for that subnet is misconfigured or the router crashed. Or maybe a contractor working in the plenum crawlspace has accidentally cut a network cable connecting the subnet's workgroup switch to the department's main Ethernet backbone switch. Or maybe someone malicious has installed a rogue DHCP server on that subnet and it's stealing machines as their leases come up for renewal and assigning them unroutable addresses to create a denial of service condition.
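One quick way to look for the subnet commonality mentioned above is to group the affected machines' addresses by network. The following is a minimal sketch using Python's standard `ipaddress` module. The /24 prefix length and the sample addresses are assumptions chosen for illustration, so substitute the mask your subnets actually use.

```python
import ipaddress
from collections import Counter


def affected_subnets(addresses, prefix_len=24):
    """Count how many affected machines fall into each subnet."""
    counts = Counter()
    for addr in addresses:
        # strict=False lets us pass a host address and get its network back.
        network = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        counts[str(network)] += 1
    return counts


# Hypothetical addresses of users who called the Help Desk.
reports = ["10.1.4.17", "10.1.4.52", "10.1.4.203", "10.1.9.8"]
print(affected_subnets(reports).most_common())
# → [('10.1.4.0/24', 3), ('10.1.9.0/24', 1)]
```

If most reports cluster in one subnet, that subnet's gateway, switch uplink, or DHCP service is a reasonable place to look first.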
If it's only that one user who has the problem, though, then it's probably time to play brain-dead and start asking questions like "OK, is your computer turned on? Is the network cable securely attached at the back of your machine?" and so on.
"…connect to…"
A good question to ask this user is "What do you mean by connect?" That's because "connect" is a technical-sounding word that users often use to impress the Help Desk and show they know what they're talking about. Well, they usually don't. Why? Because there are different kinds of connectivity, including MAC-level communications, TCP sessions, password authentication, access rights and privileges, NAT-traversal connectivity, firewall pass-through, application-level sessions, and so on. What kind of connectivity problem are they actually having? What are they trying to do when they say they want to "connect to" the server? Are they trying to access a share on that server? Do they get an "Access denied" message when they do this? Are they getting a login box prompting them for credentials? Is it rejecting their credentials? Are they having trouble finding the share in Active Directory? Is it a mapped drive they are having problems with? Are they trying to browse to find the server in My Network Places? And so on.
And is it just that server they're having trouble connecting to, or are they having problems connecting to anything on the network? Determining the scope of the problem here is important: Is connectivity failing in just one way or many ways?
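To separate some of these kinds of connectivity, it can help to test the lower layers one at a time. The sketch below, using only Python's standard `socket` module, checks name resolution first and then a TCP session. The function name and the result labels are hypothetical. Note that a successful TCP session says nothing about authentication or access rights, which sit above this level.

```python
import socket


def classify_connectivity(host: str, port: int, timeout: float = 3.0) -> str:
    """Report the first layer at which a connection attempt breaks down."""
    # Step 1: can the name be resolved to an address at all?
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "name resolution failed"

    # Step 2: can a TCP session be opened to the first resolved address?
    family, socktype, proto, _canonname, sockaddr = infos[0]
    try:
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
    except ConnectionRefusedError:
        return "connection refused"  # something answered, but nothing is listening
    except (socket.timeout, TimeoutError):
        return "no response (timeout)"  # packets may be dropped or filtered
    except OSError:
        return "network error"  # for example, no route to the host
    return "TCP session established"
```

For a file share, one might call `classify_connectivity("fileserver.example.com", 445)`, where the host name is a placeholder and 445 is the TCP port used by SMB. A result of "TCP session established" alongside an "Access denied" message points toward credentials or permissions rather than the network.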
"…the server…"
You've got this user over here, and this server over there, and the network between. They can't connect. Why? Well, where exactly is that server anyway? Is it on the user's subnet? On an adjacent subnet? In a different department? On a different floor? In a different building? On a different continent? What kind of network connects the user with that particular server? A wired Ethernet LAN? A wireless LAN (WLAN)? A fractional T1 line? Frame Relay? A VPN tunnel over the Internet? A dial-up modem connection? Cable modem or DSL?
First determine the type of connection (possibly several types) between the user and the server, and then ponder where things might break down. Maybe the CSU/DSU has gone wonky; try power-cycling it or contact your service provider, who should be monitoring it. Maybe the janitor was cleaning the server room, bumped a power bar, and an Ethernet switch has gone offline. Check for an alert message from your network management software, assuming you're using managed switches. Maybe there's been a power blackout at the remote branch office where that server is located. Call them on the phone and see what's happening.
And is it server or servers? Is the user having trouble connecting to only that server or to other servers as well? Are others having problems connecting to other servers also? What are the commonalities (if any) between all the servers being affected? (Or apparently being affected -- remember, the problem may be with the users' computers or more likely with the network infrastructure itself.)
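The scoping questions above can be reduced to a small decision over the failure reports you have collected. The following Python sketch is only one possible rule of thumb. The function name, the labels, and the sample user and server names are all hypothetical.

```python
def failure_scope(reports):
    """Summarize scope from (user, server) pairs describing failed connections."""
    users = {user for user, _server in reports}
    servers = {server for _user, server in reports}
    if not users:
        return "no failures reported"
    if len(users) == 1:
        # Only one person is affected: start at that user's machine.
        return "single user"
    if len(servers) == 1:
        # Many users, one target: suspect the server or the path to it.
        return "single server"
    # Many users and many targets: suspect shared infrastructure.
    return "widespread"


print(failure_scope([("alice", "FS1"), ("bob", "FS1"), ("carol", "FS1")]))
# → single server
```

A "widespread" result does not identify the cause, but it is a strong hint to look at DNS, routers, or switches before visiting any one desk.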
"…right now."
The time element is crucial in troubleshooting. Did the problem just start happening? When was the last time you successfully connected to the server? How long has it been going on? Is it continuous or intermittent? Intermittent network problems involving unreliable WAN links and other issues can be difficult to troubleshoot, especially if they're transient, that is, brief and occasional.
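Whether a problem is continuous or intermittent is easier to judge from a series of repeated probes than from a single test. As a minimal sketch, assume you have already recorded probe outcomes as a list of booleans (a hypothetical format, oldest result first). The classification could then look like this in Python:

```python
def classify_outage(probe_results):
    """Classify a series of probe outcomes (True = probe succeeded)."""
    if not probe_results:
        raise ValueError("need at least one probe result")
    failures = probe_results.count(False)
    loss = failures / len(probe_results)  # fraction of probes that failed
    if failures == 0:
        kind = "healthy"
    elif failures == len(probe_results):
        kind = "continuous"
    else:
        kind = "intermittent"
    return kind, loss


print(classify_outage([True, False, True, False]))
# → ('intermittent', 0.5)
```

For transient faults the probes need to run long enough, and often enough, to catch at least one failure. A short sample that looks "healthy" proves little.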
Time can also help you relate the problem to other circumstances that might be impacting your network. Did the problem start this morning at 10 am? What else happened on your network around then? Were patches applied by a WSUS server? Did scheduled maintenance on a domain controller occur? Was a construction crew in the building compound using a backhoe to repair a water main break?
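If you keep a log of changes made to the network, relating a problem's start time to recent events can be partly automated. The sketch below assumes a hypothetical change log held as (timestamp, description) pairs and a 30-minute look-back window. Both are illustrative choices, not a recommendation.

```python
from datetime import datetime, timedelta


def events_before(problem_start, change_log, lookback=timedelta(minutes=30)):
    """Return descriptions of changes made shortly before the problem began."""
    earliest = problem_start - lookback
    return [
        description
        for when, description in sorted(change_log)  # oldest first
        if earliest <= when <= problem_start
    ]


# Hypothetical change log for the day the problem was reported.
log = [
    (datetime(2024, 3, 5, 2, 0), "scheduled maintenance on domain controller"),
    (datetime(2024, 3, 5, 9, 45), "patches applied by WSUS server"),
]
print(events_before(datetime(2024, 3, 5, 10, 0), log))
# → ['patches applied by WSUS server']
```

Users' recollections of when a problem started are often imprecise, so it may be worth widening the window before ruling an event out.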
A structured approach
My own approach to TCP/IP troubleshooting is structured around three critical areas:
Conclusion
Troubleshooting TCP/IP networks can be frustrating, but it can also be fun. In future articles we'll zoom in on the troubleshooting steps and tools you need in order to solve the issues that might arise on your network. Until then, stay connected!
About the author:
Mitch Tulloch is a writer, trainer and consultant specializing in Windows server operating systems, IIS administration, network troubleshooting, and security. He is the author of 15 books including the Microsoft Encyclopedia of Networking (Microsoft Press), the Microsoft Encyclopedia of Security (Microsoft Press), Windows Server Hacks (O'Reilly), Windows Server 2003 in a Nutshell (O'Reilly), Windows 2000 Administration in a Nutshell (O'Reilly), and IIS 6 Administration (Osborne/McGraw-Hill). Mitch is based in Winnipeg, Canada, and you can find more information about his books at his Web site: www.mtit.com.