How can I understand if 3.6% is a bad value, or still acceptable? Could that value be bad for a certain type of application but good for another one?
Since I don't know where to start, can you give me a general idea or some parameters I can investigate with the help of Google?
Lastly, I would like to point out that I don't need to see where and why the network is performing poorly; I would like to know when my application will suffer from poor network performance.
What if Tarzan fell in the jungle and lost his entire bundle? This is where your question about loss comes into play. What happens to TCP-based connections if a packet is lost on the network? How do I recover my basket of fruit? We're going to keep this simple by assuming you're not using any specialized network-optimization mechanisms, though they're well worth investigating once you realize you need their help in latency-sensitive, lossy environments. First, understand that TCP has a mechanism called slow start that lets it learn how lossy the environment is; together with congestion avoidance and positive acknowledgement, it governs how aggressively TCP transmits. TCP ramps up toward the window size by sending a small number of packets between acknowledgements, then progressively larger bursts. (Think of Tarzan starting off carrying two bananas to the apes, then four, then eight, to learn how many bananas he can carry home safely.) If packets are lost along the way, the number of packets sent in the next round is cut in half: once TCP detects loss, it throttles back how much it will send without waiting for an acknowledgement. There are some classic saw-toothed diagrams on the Internet that illustrate this behavior.
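If you'd rather generate that saw-tooth yourself than hunt for a diagram, here is a minimal toy sketch in Python. It is not a faithful TCP implementation; the loss probability, round count, slow-start threshold and the simulate_cwnd helper are illustration-only assumptions. It only shows the shape of the behavior: exponential growth during slow start, additive growth during congestion avoidance, and a halving of the window whenever a round sees loss.

import random

def simulate_cwnd(rounds=60, loss_prob=0.036, ssthresh=32, seed=1):
    """Toy model of TCP slow start plus AIMD congestion avoidance.

    cwnd is counted in segments. Below ssthresh it doubles each round
    (slow start); above it, it grows by one segment per round
    (congestion avoidance). Any loss in a round halves the window,
    which is what produces the classic saw-tooth.
    """
    random.seed(seed)
    cwnd = 1
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        # Did any of the cwnd segments sent this round get lost?
        lost = any(random.random() < loss_prob for _ in range(cwnd))
        if lost:
            ssthresh = max(cwnd // 2, 1)  # remember where trouble started
            cwnd = max(cwnd // 2, 1)      # multiplicative decrease
        elif cwnd < ssthresh:
            cwnd *= 2                     # slow start: exponential growth
        else:
            cwnd += 1                     # congestion avoidance: additive increase
    return history

if __name__ == "__main__":
    print(simulate_cwnd())

Run it a few times with different loss_prob values (say 0.0, 0.01 and 0.036) and you can see why even a few percent of loss keeps the toy window from ever ramping up for long.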
Ideally, no loss is a great thing. But there are mechanisms inside TCP to deal with loss, as long as application developers let TCP do its thing. If your code requests only one data segment at a time, the TCP window never gets the opportunity to grow, and you're essentially carrying one banana across the jungle at a time. So be wary of that kind of behavior. Packet loss of 3.6% in a ping-pong application is really bad, because each lost packet has to wait at least a full network round trip before the retransmission is triggered. If the application is coded so that the TCP window defaults to a reasonably large size, then even when packets are lost, users are less likely to feel the added effective network latency caused by retransmitting data. So protect yourself. Also, a major point here is to clearly distinguish that ICMP traffic sent across the network is not the same as application-level traffic in packet size, distribution, or behavior. So whenever possible, monitor the retransmissions and loss of application-level data rather than relying on ICMP traffic; that gives a far more accurate picture of how loss affects your application. And whatever you do, just remember it's a jungle out there!
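If you want a quick way to watch real TCP retransmissions rather than trusting ping alone, here is a minimal sketch that assumes a Linux host, where the kernel exposes cumulative TCP counters (OutSegs and RetransSegs) in /proc/net/snmp. The tcp_retrans_ratio function name is my own; the counters are cumulative since boot, so sample twice and diff them to get a rate over an interval.

def tcp_retrans_ratio(path="/proc/net/snmp"):
    """Read Linux's cumulative TCP counters and return
    (segments sent, segments retransmitted, retransmission percentage)."""
    with open(path) as f:
        tcp_lines = [line.split() for line in f if line.startswith("Tcp:")]
    # First Tcp: line holds the field names, second holds the values.
    header, values = tcp_lines[0], tcp_lines[1]
    stats = dict(zip(header[1:], map(int, values[1:])))
    out, retrans = stats["OutSegs"], stats["RetransSegs"]
    pct = 100.0 * retrans / out if out else 0.0
    return out, retrans, pct

if __name__ == "__main__":
    out, retrans, pct = tcp_retrans_ratio()
    print("OutSegs=%d RetransSegs=%d retransmitted=%.2f%%" % (out, retrans, pct))

A sustained retransmission percentage anywhere near your measured ping loss is a sign your application traffic really is paying for that loss, not just your ICMP probes.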
Here are some keywords you should Google: TCP slow start, TCP selective acknowledgement, delayed acknowledgement, Nagle's algorithm, and congestion avoidance.