- John Burke, Nemertes Research
Troubleshooting network errors is a fact of life. Learning about the most common network errors and their potential causes will help you identify and resolve problems more quickly, improving your ability to meet service-level agreements (SLAs) for your network.
Switches and routers register network errors, as do servers and PCs connected to the network. Although you can log in to a device and check its network logs, more often you will use a network management tool that relies on Simple Network Management Protocol (SNMP) or other protocols to collect logs from network devices and find problems. Sometimes, especially if you are working with unmanaged switches, you will use a network sniffer or protocol analyzer to dig into a problem in greater detail. Here are some of the most common network errors:
Frame check sequence (FCS) errors
Nodes that transmit Ethernet frames append an FCS value, which lets the receiving device determine whether the frame is complete and correct upon arrival. The sending node calculates the FCS value using an algorithm called a cyclic redundancy check (CRC). The receiving node runs the same CRC calculation over the frame it received to produce its own FCS value; if that number matches the one received, the frame is good. Where the values do not match, there is an FCS error.
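The append-and-compare cycle can be sketched in a few lines of Python. This is a simplified illustration, not how a network interface implements it: the function names are made up for the example, `zlib.crc32` stands in for the CRC-32 calculation done in hardware, and storing the value in little-endian byte order is an assumption of the sketch.

```python
import zlib


def append_fcs(frame: bytes) -> bytes:
    """Sender side: compute a CRC-32 over the frame and append it as 4 bytes."""
    fcs = zlib.crc32(frame) & 0xFFFFFFFF
    # Byte order is an assumption of this sketch.
    return frame + fcs.to_bytes(4, "little")


def fcs_ok(frame_with_fcs: bytes) -> bool:
    """Receiver side: recompute the CRC and compare it with the received FCS."""
    if len(frame_with_fcs) < 4:
        return False
    payload, received = frame_with_fcs[:-4], frame_with_fcs[-4:]
    expected = (zlib.crc32(payload) & 0xFFFFFFFF).to_bytes(4, "little")
    return received == expected


sent = append_fcs(b"example payload")
# Flip a single bit to simulate noise on the cable.
damaged = bytes([sent[0] ^ 0x01]) + sent[1:]

print(fcs_ok(sent))     # True: the values match, so the frame is good
print(fcs_ok(damaged))  # False: the values differ, so this is an FCS error
```

A single flipped bit is enough to change the recomputed value, which is why noisy cabling shows up so readily in FCS error counters.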
FCS errors are most commonly caused by noise on the data network. Network noise can be created by cabling located too close to noise sources such as lights, elevator motors or other heavy machinery. Cabling that has not been pulled and terminated in line with the appropriate specifications can also generate noise. Too much wire left untwisted at termination -- or runs that are too long or bends that are too tight -- can introduce noise from external sources or from crosstalk among pairs. Poorly manufactured components can compound such problems.
Alignment errors
Ethernet frames should consist of complete bytes -- octets of bits. In other words, the length of a frame in bits should always be evenly divisible by eight. When a frame doesn't meet that criterion, it has an alignment error. A frame with an alignment error should always fail the FCS check as well. As with other FCS errors, alignment errors most often result from noise on the cabling, although hardware problems in network interface cards or other network hardware can also cause them.
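The divisibility rule is simple enough to express directly. The following is a minimal sketch with a hypothetical function name; real interfaces make this check in hardware and only expose the resulting counter.

```python
def has_alignment_error(frame_length_bits: int) -> bool:
    """A frame is misaligned when its bit length is not a whole number of bytes."""
    return frame_length_bits % 8 != 0


print(has_alignment_error(512))  # False: exactly 64 bytes
print(has_alignment_error(515))  # True: 64 bytes plus 3 stray bits
```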
Collisions and late collisions
These common network errors are separate anomalies with similar resolutions. Collisions occur when more than one device tries to use the network at the same time; a late collision is one detected after the first 64 bytes of a frame have already been sent. Both are increasingly rare. Today, nearly all networks are switched networks, which means each cable run connects one device to another device, and each device uses separate pairs to transmit and receive data (full-duplex mode). Because information is transmitted on separate pairs, data from one device cannot collide with data from the other. However, sometimes network ports are misconfigured as half-duplex. When this occurs, the ports try to use the same pairs to transmit and receive data. This results in collisions, which can quickly become excessive in high-throughput environments. Switching the connection to full duplex solves the problem. (Duplex mismatch is a related issue: One end of a connection is set to full duplex while the other is set to half duplex, and as a result errors mount rapidly.)
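The three situations described here (both ends full duplex, both ends half duplex, and a mismatch) can be told apart by comparing the duplex setting at each end of a link. The sketch below assumes you have already read those settings from both devices; the `Duplex` type, the function name and the result labels are hypothetical.

```python
from enum import Enum


class Duplex(Enum):
    HALF = "half"
    FULL = "full"


def assess_link(local: Duplex, remote: Duplex) -> str:
    """Classify a point-to-point link from the duplex setting at each end."""
    if local is Duplex.FULL and remote is Duplex.FULL:
        return "ok"  # separate transmit and receive paths, no collisions expected
    if local is Duplex.HALF and remote is Duplex.HALF:
        return "half-duplex"  # collisions possible; set both ends to full duplex
    return "mismatch"  # one end full, one end half; errors will mount


print(assess_link(Duplex.FULL, Duplex.FULL))  # ok
print(assess_link(Duplex.HALF, Duplex.HALF))  # half-duplex
print(assess_link(Duplex.FULL, Duplex.HALF))  # mismatch
```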
Discarded packets
All network devices can discard packets, and are expected to. For example, a switch can discard packets that arrive tagged for a specific virtual LAN (VLAN) on a port not configured for that VLAN. Most devices will also discard packets when they run low on buffer memory. For example, if a high-definition video conference session consumes all the high-priority delivery bandwidth on a port, a router might discard lower-priority packets (e.g., those associated with an SMTP mail transfer session). Discards force TCP applications to resend packets, which increases application latency. Discards cause performance problems for UDP applications as well, typically in the form of audio or video artifacts. Some discarding is inevitable, but excessive discards can indicate that the switch is misconfigured (e.g., it is missing a VLAN it should have) or that the device sending to it is misconfigured (e.g., it is sending on the wrong VLAN). Excessive discards can also indicate that a port has insufficient bandwidth for its current usage profile. In that case, the port needs to be upgraded or its traffic split across multiple links.
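Deciding whether discards are "excessive" usually means comparing two samples of a port's counters, such as the `ifInDiscards` counter that SNMP exposes. The sketch below assumes the two samples are already in hand as plain dictionaries; the dictionary keys, the function names and the 1% threshold are illustrative assumptions rather than recommended values. It also allows for a 32-bit counter wrapping around between samples.

```python
COUNTER32_MODULUS = 2**32  # 32-bit counters wrap back to zero


def counter_delta(previous: int, current: int, modulus: int = COUNTER32_MODULUS) -> int:
    """Difference between two counter samples, tolerating one wrap-around."""
    return (current - previous) % modulus


def discard_ratio(prev: dict, curr: dict) -> float:
    """Share of packets discarded between two samples of a port's counters."""
    discards = counter_delta(prev["discards"], curr["discards"])
    delivered = counter_delta(prev["packets"], curr["packets"])
    total = discards + delivered
    return discards / total if total else 0.0


# Hypothetical samples; the discard counter wraps between them.
earlier = {"discards": 4294967290, "packets": 1000}
later = {"discards": 4, "packets": 1990}

ratio = discard_ratio(earlier, later)
EXCESSIVE = 0.01  # illustrative threshold, tune for your own network
print(f"{ratio:.2%}", "investigate" if ratio >= EXCESSIVE else "normal")
```

Working from the difference between samples rather than the raw totals keeps a long-running device from looking unhealthy just because its counters have been accumulating for months.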
Unknown protocol errors
A switch or router can receive a packet it does not know how to interpret. Usually, this happens because the receiving device has a particular protocol disabled when it is in fact needed, or because the sending device has a protocol enabled that ought to be disabled. Such network errors are most common when a new device configuration is pushed out to one or both devices, or when new equipment is swapped in.
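One way to picture this is a receiver that keeps a set of enabled protocols and counts every frame whose EtherType falls outside it. The sketch below is only a model of that idea: the function name and the sample traffic are made up, and real devices track this in their own counters.

```python
from collections import Counter

# A few well-known EtherType values, used here only for readable output.
ETHERTYPE_NAMES = {0x0800: "IPv4", 0x0806: "ARP", 0x86DD: "IPv6", 0x88CC: "LLDP"}


def count_unknown_protocols(ethertypes, enabled):
    """Tally received EtherTypes that the receiver is not configured to handle."""
    unknown = Counter()
    for ethertype in ethertypes:
        if ethertype not in enabled:
            unknown[ethertype] += 1
    return unknown


# Hypothetical receiver with IPv6 and LLDP disabled.
enabled_on_receiver = {0x0800, 0x0806}
received = [0x0800, 0x86DD, 0x86DD, 0x88CC, 0x0806]

for ethertype, count in count_unknown_protocols(received, enabled_on_receiver).items():
    name = ETHERTYPE_NAMES.get(ethertype, "unrecognized")
    print(f"0x{ethertype:04X} ({name}): {count} unknown-protocol frame(s)")
```

A tally like this points at which side to fix: either enable the protocol on the receiver or turn it off on the sender.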
Runts and giants
Frames generate errors when they are too short (under 64 bytes, called runts), too long (more than 1,518 bytes without a signal that a long frame is coming) or giant (more than 6,000 bytes in any circumstance). These errors are almost always the result of hardware problems in network interfaces or software problems in the network stack, and they are fixed by updating software or replacing hardware.
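The size rules above translate into a small classifier. This sketch uses the thresholds quoted in this article; the function name and the `long_frame_expected` flag, which stands in for "a signal that a long frame is coming," are hypothetical.

```python
MIN_FRAME_BYTES = 64
MAX_STANDARD_FRAME_BYTES = 1518
GIANT_THRESHOLD_BYTES = 6000  # threshold as quoted in this article


def classify_frame(length_bytes: int, long_frame_expected: bool = False) -> str:
    """Label a frame as 'runt', 'too long', 'giant' or 'ok' based on its size."""
    if length_bytes < MIN_FRAME_BYTES:
        return "runt"
    if length_bytes > GIANT_THRESHOLD_BYTES:
        return "giant"  # an error regardless of what was signaled
    if length_bytes > MAX_STANDARD_FRAME_BYTES and not long_frame_expected:
        return "too long"
    return "ok"


print(classify_frame(60))                               # runt
print(classify_frame(1518))                             # ok
print(classify_frame(1600))                             # too long
print(classify_frame(1600, long_frame_expected=True))   # ok
print(classify_frame(9000, long_frame_expected=True))   # giant
```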