There are a number of network analyzers available today that claim to perform Application Response Time (ART) measurements. In the world of network analysis, this is understood to mean how long it takes a server to return data for a command from the client -- the good old-fashioned client/server command/response model, or transaction. For instance, consider the time between an HTTP GET request and the subsequent data returned from a Web server. We could measure the time for the entire page to load, but the most critical measurement is the initial delay between each GET and the first data packet returned. A Web page is made up of many small objects, so loading the entire page entails numerous GET requests.
After looking at hundreds if not thousands of transactions in trace files, I've discovered that many of the "slow" Web servers are those in which there is a large delay between the GET and the first data packet returned. Once the data (for that one Web object) starts coming back, it typically flows reasonably well. In other words, once the server's application retrieves a chunk of data and hands it over to the transport layer (TCP) to return it to the user, the application is pretty much out of the picture.
Thus, a very critical part of ART is that first delay for every transaction -- how fast can the server process the request and return the information? Another behavior I've observed is that the TCP stack on the server is typically independent of the application and responds very quickly, which is the way it should be. Thus, even if the server is unable to process the request right away, TCP will still send an acknowledgement (ACK) back to the end user to confirm that the request has been received safe and sound. If the server is fast enough, the TCP ACK will instead be carried in the first data packet returned to the user. This is often referred to as "piggybacking" the ACK.
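The distinction between a bare ACK and a piggybacked ACK can be sketched in a few lines of Python. This is only an illustration: the `Segment` type and its two fields are hypothetical stand-ins for whatever a real capture library exposes, and the labels returned by `classify` are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical, minimal view of one server-to-client TCP segment."""
    ack_flag: bool    # TCP ACK flag is set
    payload_len: int  # bytes of application data carried (0 for a bare ACK)


def classify(seg: Segment) -> str:
    """Label a segment by what it says about the application, not just TCP."""
    if seg.payload_len > 0:
        # Application data is present; if ACK is also set, it is piggybacked.
        return "data+ack" if seg.ack_flag else "data"
    # No payload: the TCP stack has answered, but the application has not.
    return "pure-ack" if seg.ack_flag else "other"
```

Only segments labeled `data` or `data+ack` indicate that the application itself has responded; a `pure-ack` shows only that the server's TCP stack is alive.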
In order to measure ART accurately, the network analysis tool must understand the difference between receiving an ACK and receiving the first data packet. Unfortunately, this is not always the case. I've seen products that time only the ACK packets, which leads to erroneous response time figures, misleading interpretations, and even expert system reports that miss the real delay.
Such products will accurately measure ART for servers and applications that are behaving normally (when the server is fast, the ACK and the first data arrive together), but they become inaccurate when there are serious problems, which of course are the very things we network engineers want to detect and troubleshoot! Server delays go totally undetected if the tool does not understand the difference between a command packet that has merely been ACKed and real data being returned from the server.
ART is not always easy to measure on an automated basis, since the analyzer doing it must separate packets that are the start of commands from those that are simply part of data transfer and are possibly bi-directional. Further, an analyzer that performs ART on all conversations or flows between users and servers should understand all protocols and applications, even if they are not "decoded" or are proprietary.
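One protocol-agnostic way to separate command packets from the rest of a flow is to look only at direction and payload size. The sketch below is a simple heuristic under stated assumptions, not how any particular analyzer works: the `Packet` type is hypothetical, the flow is assumed to be strictly request-then-response, and the bi-directional transfers mentioned above would need more logic than this.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Packet:
    """Hypothetical summary of one packet in a single client/server flow."""
    time: float        # seconds, relative to the start of the capture
    from_client: bool  # True if sent by the client
    payload_len: int   # application bytes carried; 0 for a bare ACK


def transaction_delays(packets: List[Packet]) -> List[Tuple[float, float]]:
    """Return (request_time, delay_to_first_server_data) per transaction."""
    results: List[Tuple[float, float]] = []
    pending: Optional[float] = None  # time of the command awaiting its first data
    for p in packets:
        if p.payload_len == 0:
            continue  # bare ACKs neither start nor answer a command
        if p.from_client:
            if pending is None:
                pending = p.time  # first client payload opens a new command
        elif pending is not None:
            # First server payload after the command: this is the ART sample.
            results.append((pending, p.time - pending))
            pending = None  # later server data belongs to the same response
    return results
```

For example, a flow with a request at 0.0 s, a bare ACK at 0.1 s, and server data at 2.0 s yields a delay of 2.0 s for that transaction, because the ACK is skipped.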
As network analysts doing our day-to-day job, we on the front line require accurate and verifiable ART. We must build confidence in our analytical tools, especially when users are complaining of slow response times. To gain that confidence, we should manually verify the numbers for every application of high business value to our organization, rather than taking such data at face value.
Here's an example of how to verify ART data. Let's examine the following three packet summaries from a SQL transaction, beginning with a SELECT statement from the client.
Packet  Source          Destination     Delta Time  Relative Time  Protocol  Summary
43039   126.96.36.199   188.8.131.52    0.000000                   TDS       SELECT * FROM…
43044   184.108.40.206  220.127.116.11  0.122394    0.122394       TCP       Flags= .A....
43874   18.104.22.168   22.214.171.124  12.687890   12.810284      TDS       Response STATUS=Last…
ART analysis should tell us that the response time to the SELECT statement is 12.8 seconds, the total time from the initial request until we saw data from the server. It is not the 0.122 seconds (122 milliseconds) between the request and the first packet returned, which is merely an ACK packet. Further, an expert system should alert us to this difference -- do not assume that if the ART data is correct, the expert system is, too.
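The arithmetic behind this check is easy to reproduce by hand or in a short script. The sketch below copies the packet numbers, delta times, and protocols from the three-packet summary; the `delay_to_first` helper is a hypothetical name, and it assumes every packet after the request comes from the server, as is the case here.

```python
from decimal import Decimal

# (packet number, delta time in seconds, protocol), copied from the trace summary
trace = [
    (43039, Decimal("0.000000"), "TDS"),   # client SELECT (the request)
    (43044, Decimal("0.122394"), "TCP"),   # server ACK, no data
    (43874, Decimal("12.687890"), "TDS"),  # server response with data
]


def delay_to_first(trace, protocol=None):
    """Sum delta times after the request until the first matching packet.

    With protocol=None the first packet of any kind matches, which mimics
    a tool that times the ACK instead of the data.
    """
    elapsed = Decimal("0")
    for _number, delta, proto in trace[1:]:
        elapsed += delta
        if protocol is None or proto == protocol:
            return elapsed
    return None  # no matching packet seen


ack_based = delay_to_first(trace)   # what an ACK-timing tool would report
art = delay_to_first(trace, "TDS")  # time until real application data
print(f"ACK-based: {ack_based} s, data-based ART: {art} s")
```

Here the ACK-based figure is 0.122394 seconds, while the sum of both deltas gives 12.810284 seconds, which matches the Relative Time column for packet 43874.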
Verifying and paying attention to such detail will give you the confidence that the ART you are looking at is the real deal, and not a forgery.