When do applications suffer from poor network performance?

Poor network performance causes applications to suffer in an enterprise. Find out at what point packet loss between client and server starts causing problems in this expert response.

Recently, I have experienced some network problems in my enterprise. I am an Oracle DBA, but I am getting more and more curious about how network problems affect my application. I have used the ping command to check the health of the network, collecting metrics for over a week. The result was that 3.6% of packets sent from a client to the DB server were lost.

How can I understand if 3.6% is a bad value, or still acceptable? Could that value be bad for a certain type of application but good for another one?

Since I don't know where to start, can you give me a general idea or some parameters I can investigate with the help of Google?

Lastly, I would like to point out that I don't need to see where and why the network is performing poorly; I would like to know when my application will suffer from poor network performance.

I'm so glad you asked this question. What a great one to ask! One of the best things application developers can do to ensure their application performs well across a very diverse enterprise is to understand how the application's calls affect data behavior on the network. As an illustration, I tend to call this the Tarzan effect. Most application developers create applications assuming unlimited local network access and make calls to databases and file servers as individual requests instead of in bulk. This translates into what network engineers traditionally call a ping-pong application. Individual requests to the database tend to exacerbate any latency or loss on the network because the client has to wait the entire network latency for one piece of information, and twice that if the initial transmission is lost. If Tarzan is swinging across the jungle, he wouldn't fetch one banana at a time and bring it back to the apes, would he? Nope. He would make a basket and bring back a bushel of bananas in a single trip, carrying as many as he could in one go. There's a certain danger to this as well, so a modified approach is needed to learn how many bananas he can carry safely across the jungle.
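To put rough numbers on the Tarzan effect, here is a minimal sketch that counts only the time spent waiting on round trips. The function name, the 1,000-row workload and the 20 ms round-trip time are all assumptions chosen for illustration; bandwidth, server processing time and loss are deliberately ignored.

```python
import math


def total_wait_ms(rows: int, rows_per_request: int, rtt_ms: float) -> float:
    """Network wait time, assuming exactly one round trip per request."""
    requests = math.ceil(rows / rows_per_request)
    return requests * rtt_ms


# Hypothetical workload: 1,000 rows fetched over a 20 ms round trip.
ping_pong = total_wait_ms(rows=1000, rows_per_request=1, rtt_ms=20.0)
batched = total_wait_ms(rows=1000, rows_per_request=100, rtt_ms=20.0)

print(f"one row per request:   {ping_pong / 1000:.1f} s waiting")
print(f"100 rows per request:  {batched / 1000:.1f} s waiting")
```

Under these assumptions the one-banana-at-a-time version spends 20 seconds waiting on the network, while the basket version spends 0.2 seconds.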

What if Tarzan fell in the jungle and lost his entire bundle? This is where your question about loss comes into play. What happens to TCP-based connections if a packet is lost on the network? How do I recover my basket of fruit? We're going to keep this simple by assuming you're not using any specialized mechanisms for network optimization, but they're great things to investigate once you realize you might need their help in latency-sensitive and lossy environments. First, understand that TCP has a mechanism called slow start. It is part of TCP's congestion control and works together with positive acknowledgement to learn how much traffic the path can carry before packets start getting lost. It ramps up toward the window size by sending a small number of packets before waiting for an acknowledgement, then progressively more. (Think of Tarzan starting off carrying two bananas to the apes, then four, then eight, to learn how many bananas he can carry home safely.) If packets are lost in transit, the number of packets sent in the next round is cut in half. In other words, after detecting loss, TCP throttles back the number of packets it will send without waiting for an acknowledgement. There are some very classic saw-toothed diagrams on the Internet that illustrate this behavior.
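The ramp-up and halving described above can be sketched as a toy model. This is not how any particular TCP stack is implemented: the function name, the starting window of 2, the threshold of 64 and the round in which loss occurs are assumptions, and the "grow by one per round after a loss" rule is a simplified stand-in for congestion avoidance.

```python
def window_trace(rounds, loss_rounds, start=2, ssthresh=64):
    """Toy model of a TCP congestion window, measured in packets per round trip.

    Below ssthresh the window doubles each round (slow start); at or above
    it the window grows by one (a simplified congestion avoidance). A loss
    halves the window and lowers ssthresh to the new value.
    """
    window = start
    trace = []
    for r in range(rounds):
        trace.append(window)
        if r in loss_rounds:
            ssthresh = max(window // 2, 1)
            window = ssthresh
        elif window < ssthresh:
            window = min(window * 2, ssthresh)
        else:
            window += 1
    return trace


# Hypothetical run: eight round trips with a loss detected in round 4.
trace = window_trace(rounds=8, loss_rounds={4})
print(trace)
```

Plotting a longer trace with several loss rounds produces the saw-toothed shape: a climb, a sudden drop to half, then a slower climb.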

Ideally, there would be no loss at all. But TCP has mechanisms to cope with loss, as long as application developers allow TCP to do its thing. If you're coding such that you only request one data segment at a time, the TCP window never gets the opportunity to grow, and you're essentially carrying one banana across the jungle at a time. So be wary of that type of behavior. Packet loss of 3.6% in a ping-pong application is really bad because each lost packet must wait at least the entire network latency before the loss is noticed and the packet is retransmitted. If the application is coded so that the TCP window is large enough by default and packets are lost, then it is less likely that users will feel the added effective network latency due to retransmission of data. So protect yourself. Another major point is that ICMP data sent across the network is not the same as application-level traffic in packet size, distribution, or behavior. So whenever possible, monitor the retransmissions and loss of application-level data rather than ICMP traffic for a more accurate understanding of how loss affects your application. And whatever you do, just remember it's a jungle out there!
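As a back-of-the-envelope illustration of why 3.6% hurts a ping-pong application, the sketch below estimates the average time for one request/response exchange. It assumes each attempt fails independently with the measured loss rate and that every failure costs a fixed wait before the retry. The 20 ms round trip and the 200 ms retransmission wait are assumed values, not measurements; real TCP timers adapt to the path and back off after repeated losses.

```python
def expected_exchange_ms(rtt_ms, loss_rate, retransmit_wait_ms):
    """Average time for one request/response exchange.

    Assumes each attempt fails independently with probability loss_rate and
    each failure adds a fixed retransmit_wait_ms before the next attempt.
    """
    if not 0.0 <= loss_rate < 1.0:
        raise ValueError("loss_rate must be in [0, 1)")
    # Expected number of failed attempts before the first success.
    expected_retries = loss_rate / (1.0 - loss_rate)
    return rtt_ms + expected_retries * retransmit_wait_ms


# Assumed path: 20 ms round trip, 200 ms wait before each retransmission.
clean = expected_exchange_ms(rtt_ms=20.0, loss_rate=0.0, retransmit_wait_ms=200.0)
lossy = expected_exchange_ms(rtt_ms=20.0, loss_rate=0.036, retransmit_wait_ms=200.0)

print(f"no loss:   {clean:.1f} ms per exchange")
print(f"3.6% loss: {lossy:.1f} ms per exchange")
```

The average understates what users feel: at 3.6% loss, roughly one exchange in 28 stalls for the full retransmission wait, and an application that makes thousands of small requests hits that stall many times per transaction.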

Below is a list of keywords that you should google: TCP slow start, TCP selective acknowledgement, delayed acknowledgement, Nagle's algorithm, and congestion avoidance.

This was last published in April 2009
