TCP Traffic Analyzer

Subbu Subramaniam


Introduction

Yconalyzer is a tool to monitor and analyze TCP connections on a particular TCP port. You can use yconalyzer to get answers to questions like:
Yconalyzer can be invoked in one of three modes: capture mode, analyze mode, and live traffic analysis mode.
When yconalyzer is invoked in capture mode, it captures relevant packets and saves them into the file specified by the -w argument. If the -t <nseconds> option is not specified, yconalyzer captures packets for 60 seconds. When invoked in analyze mode, it reads packets that have already been captured using yconalyzer (or using tcpdump, see below) and presents statistics about the connections. These modes are useful for debugging performance or other problems on a machine (client or server).

In live traffic analysis mode, yconalyzer captures packets and prints statistics for the connections captured. This mode is most useful for periodic monitoring of connections on a machine (e.g. via cron jobs). The -r option can be used to pipe the output of yconalyzer to other sed, awk or shell scripts that may store the results, raise alarms, etc.

The -p option is mandatory for all invocation modes.

Yconalyzer captures packets (or reads captured packets from files) using libpcap. It captures a minimal set of TCP packets off the network interface using BPF: only those TCP packets that have the SYN, FIN or RST bit set in the TCP header (see RFC 793). These should be far fewer than the other packets of the connection, so capturing and processing this subset should not impose a large overhead.

Another common utility that uses libpcap is tcpdump. Yconalyzer is fully compatible with tcpdump. It can analyze pcap files captured by tcpdump, and it writes pcap files that can be read by tcpdump as well. In fact, the following two commands are equivalent:

tcpdump -w http.pcap '(tcp port 80)&&((tcp[tcpflags] & (tcp-syn|tcp-fin|tcp-rst)) != 0)'

yconalyzer -w http.pcap -p 80

Both of these commands capture TCP connection control packets to or from port 80 into a file called http.pcap (the argument to the -w option). Note that this filter will include packets belonging to incoming as well as outgoing connections on a machine. In other words, the packets captured will include those connections in which the machine acts as a server on port 80, as well as those in which the machine is a client connecting to port 80 on other machines.
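The built-in filter can be reproduced for any port by substituting the port number into the tcpdump expression shown above. The following is a minimal sketch; control_filter is a hypothetical helper name used only for illustration and is not part of yconalyzer or tcpdump.

```shell
#!/bin/sh
# Build the tcpdump filter that selects connection-control packets
# (SYN, FIN or RST set) on one TCP port.
# control_filter is a hypothetical helper, not part of yconalyzer.
control_filter() {
    port=$1
    printf '(tcp port %s)&&((tcp[tcpflags] & (tcp-syn|tcp-fin|tcp-rst)) != 0)\n' "$port"
}

# Print the filter for port 80; it matches the tcpdump example above.
control_filter 80
```

The result can be passed to tcpdump as a single quoted argument, for example tcpdump -w http.pcap "$(control_filter 80)".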

Once packets are captured using either of the commands above, you can run yconalyzer in analyze mode on the captured file, as in:

yconalyzer -r http.pcap -p 80

Using additional filters

The tcpdump command allows you to specify filters at the end of the command (such as the one illustrated). You can use additional filters to the yconalyzer command as well. Yconalyzer accepts any such filter that tcpdump accepts. Using additional filters can help when the CPU is so busy on the machine that even the restricted filter used by yconalyzer causes spikes, or causes yconalyzer to drop some packets (yconalyzer reports the number of dropped packets at the end of the run).

For example, to capture only incoming connections to port 80 on a host called proxy.foo.net (and ignore any connections that proxy.foo.net initiates to port 80 on other hosts), you can use the following filter:

yconalyzer -w http.pcap -p 80 '(dst port 80 and dst host proxy.foo.net) or (src port 80 and src host proxy.foo.net)'

Read the tcpdump man page for details on constructing BPF filters.

You can also use filters when running yconalyzer with the -r option, if that helps in analysis.

Performance Overhead

When designing monitoring subsystems it is important to make sure that they do not take up more resources than the application that is being monitored, thereby skewing the measurements. The best way to monitor the performance overhead of yconalyzer is to run the top command while yconalyzer is running. Most likely, you will not see it on the radar at all. Another way to tell whether yconalyzer kept up with the traffic on the machine is the message that it prints at the end of the run, as below:

20545 Packets received
0 Packets dropped

As long as the number of packets dropped is reported to be zero, you can be sure that yconalyzer has captured all packets of interest. Even if yconalyzer reports dropped packets, you can still analyze the captured files, but not all connections will be included in the analysis. Dropped packets also suggest that your machine is running out of CPU, so any measurements you make may be skewed and may not reflect the true performance characteristics of the machine.
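A script that runs yconalyzer unattended can check the drop count automatically. The sketch below assumes the summary lines look exactly like the sample above and that they have been collected into a variable or file; which output stream yconalyzer writes them to is not stated here, so that part is left out. dropped_count is a hypothetical helper name.

```shell
#!/bin/sh
# Print the "Packets dropped" count from a capture summary read on stdin.
# dropped_count is a hypothetical helper; the line layout is assumed to
# match the sample summary shown above.
dropped_count() {
    awk '$2 == "Packets" && $3 == "dropped" { print $1; found = 1 }
         END { if (!found) exit 1 }'
}

# Sample summary copied from the text above.
summary='20545 Packets received
0 Packets dropped'

if [ "$(printf '%s\n' "$summary" | dropped_count)" -eq 0 ]; then
    echo "capture complete: no packets dropped"
else
    echo "capture incomplete: some packets were dropped"
fi
```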

If you notice that yconalyzer is reporting dropped packets all the time, you should consider specifying additional filters in the command line. The narrower the filter, the lower the chances of dropping packets during capture.

Another alternative is to capture only some fraction of the connections. See the usage example "Capture: Sample 25% of incoming connections".

Presentation

When yconalyzer is invoked without specifying any special options, the output looks as below:
[Figure: Default output]

The first several lines of output display general information about the packets captured. Most of these are self-explanatory. Of most interest may be the lines that display the average number of SYNs taken to establish a connection (you want 1.0 in the ideal case), the number of connections terminated by RESET (depending on the way TCP is used by the upper level protocol, you want this number to be 0 as well), and also the line showing the distribution of SYNs. This line shows how hard the clients had to work to get connected to the server. In the example shown, 2233 clients got a connection in one attempt, 17 of them needed two, and 1 needed three attempts to get to the server. Depending on whether you run yconalyzer on the client side or the server side, you can draw appropriate conclusions. The next set of lines is in a tabular format, each line corresponding to a "bucket".
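As a worked example of how the SYN distribution relates to the average, the sketch below computes a weighted mean from the three counts quoted above. The "attempts count" input layout and the avg_syns helper are assumptions made for this illustration; the figure yconalyzer itself prints may be computed differently (for example, it may account for connections that were never established).

```shell
#!/bin/sh
# Average SYNs per connection from "attempts count" pairs on stdin.
# avg_syns is a hypothetical helper; the input layout is an assumption
# for this sketch, not yconalyzer's output format.
avg_syns() {
    awk '{ syns += $1 * $2; conns += $2 }
         END { if (conns > 0) printf "%.3f\n", syns / conns }'
}

# 2233 connections needed 1 SYN, 17 needed 2, and 1 needed 3.
printf '1 2233\n2 17\n3 1\n' | avg_syns   # prints 1.008
```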

Buckets

By default, yconalyzer places each connection in a bucket by duration (e.g. a connection that lasted 25 milliseconds will be placed in the 20-40 bucket), and displays the number of connections, the average client and server data size, and the average throughput for all connections within that bucket. By default, yconalyzer reports data for 20 buckets, with a 20 ms bucket size.

You can specify a different bucket size using the -s option, or a different number of buckets using the -n option.
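The bucket a connection falls into follows from integer division of the measure by the bucket size. A minimal sketch, assuming buckets start at zero and that a value exactly on a boundary belongs to the higher bucket (how yconalyzer treats boundary values and values beyond the last bucket is not covered here); bucket_label is a hypothetical helper name:

```shell
#!/bin/sh
# Map a measure (for example, a duration in ms) to its bucket range.
# bucket_label is a hypothetical helper; 20 is the default bucket size
# described above. Boundary handling is an assumption of this sketch.
bucket_label() {
    value=$1
    size=${2:-20}
    start=$(( value / size * size ))
    echo "${start}-$(( start + size ))"
}

bucket_label 25          # prints 20-40 (default 20 ms buckets)
bucket_label 1500 2000   # prints 0-2000 (2-second buckets)
```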

By default, yconalyzer uses the connection duration as the measure for a bucket. You can change this behavior with the -C option to use the number of bytes sent by the client as the bucket measure, the -S option to use the number of bytes sent by the server, or the -P option to use throughput. Here is a partial output that uses the number of bytes sent by the client side as the measure of buckets:
[Figure: Output using the -C option]

As before, you can vary the bucket size or number of buckets using the -s or -n option.

Filtering by Volume

In the normal analysis mode, yconalyzer considers all connections on the port (along with any command line filters specified, of course). It is sometimes useful to eliminate certain connections by client or server volume (i.e. number of bytes sent by client or server) for analysis. You can use the options -X, -x, -Y, and -y for such filters:

Connection Details

While yconalyzer can give you summary information about the connections, it can also help you drill down into a specific connection. For example, in the default output, you can see that there are 19 connections that lasted between 0 and 20 ms in duration. If you want to identify these connections, you can run the same yconalyzer command with the -D 0 option. Yconalyzer will output extra information about each of the 19 connections making up the 0-20 bucket. A sample output is as below:
[Figure: Details of connections within a bucket]

Note that there are 19 extra lines in the output, one for each connection in the bucket. You can then cut and paste any of these lines to the tcpdump command to get a packet trace for that connection. As in:

tcpdump -r smtp.pcap host 68.142.202.247 and host 74.52.168.160 and port 42834

Try it!

Other options to explore

Yconalyzer installs with a man page that describes all the options supported. It also prints a brief summary of options when invoked without the mandatory -p option. Here are some other options to try out:

Usage Examples

Capture: All connections for ten minutes

To capture all connection control packets (i.e. those with SYN, FIN or RST bits set) on port 80 for both incoming and outgoing connections for 10 minutes, and save the packets in a file http.pcap for future analysis:

yconalyzer -w http.pcap -p 80 -t 600

Capture: Only incoming connections

Unless an appropriate filter is specified, yconalyzer captures all connection control packets. On some servers, it may be that there are also connections initiated from the server to (say) a monitoring node periodically. Or, it may be that the server in question is a proxy, and we only wish to analyze the incoming traffic. Here is one way to capture only those connections on port 80 in which the machine (say, foo.bar.net) acts as a TCP server:

yconalyzer -w http.pcap -p 80 '(dst host foo.bar.net and dst port 80) or (src host foo.bar.net and src port 80)'

Capture: Only outgoing connections

In order to capture only outgoing connections on port 80 on a machine foo.bar.net:

yconalyzer -w http.pcap -p 80 '(src host foo.bar.net and dst port 80) or (dst host foo.bar.net and src port 80)'
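The incoming and outgoing filters differ only in which side of each packet the port is matched on: for incoming connections the port belongs to the local host, and for outgoing connections it belongs to the remote host. The sketch below generates both expressions for a given host and port; incoming_filter and outgoing_filter are hypothetical helper names used for illustration only.

```shell
#!/bin/sh
# Build BPF filter expressions for one direction of connections.
# Both helpers are hypothetical; $1 is the local host, $2 the TCP port.

# Connections in which the host is the server on the port.
incoming_filter() {
    printf '(dst host %s and dst port %s) or (src host %s and src port %s)\n' "$1" "$2" "$1" "$2"
}

# Connections in which the host is a client of the port on other machines.
outgoing_filter() {
    printf '(src host %s and dst port %s) or (dst host %s and src port %s)\n' "$1" "$2" "$1" "$2"
}

incoming_filter foo.bar.net 80
outgoing_filter foo.bar.net 80
```

Either result can be given as the trailing filter argument, for example yconalyzer -w http.pcap -p 80 "$(outgoing_filter foo.bar.net 80)".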

Capture: Sample 25% of incoming connections

Sometimes, capturing all connections on a machine may cause yconalyzer to drop packets (see Performance Overhead). In cases where a server is fielding connections from clients spread all over the internet, we can use the IP address as a randomizer to capture some fraction of the connections. The example below attempts to capture about 25% of the connections: only those in which the source or destination IP address has its last two bits set.

yconalyzer -w http.pcap -p 80 '(((ip[19:1] & 3) == 3) or ((ip[15:1] & 3) == 3))'

This technique can also be used to randomize using ephemeral port numbers (on either the local or the remote side).
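To see why the address test selects about a quarter of the clients, the sketch below applies the same check, (last byte & 3) == 3, to every possible last-octet value. It assumes client addresses are spread evenly over those values; low_bits_set is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed when the last octet of an IPv4 address has its two low bits set,
# which is what (ip[N:1] & 3) == 3 tests for one address in the filter.
# low_bits_set is a hypothetical helper used only for this illustration.
low_bits_set() {
    last_octet=${1##*.}
    [ $(( last_octet & 3 )) -eq 3 ]
}

# Count how many of the 256 possible last-octet values pass the test.
matches=0
octet=0
while [ "$octet" -lt 256 ]; do
    if low_bits_set "10.0.0.$octet"; then
        matches=$(( matches + 1 ))
    fi
    octet=$(( octet + 1 ))
done
echo "$matches of 256 last-octet values match"   # prints 64 of 256 ...
```

Because the filter matches when either the source or the destination address passes, a machine whose own address ends in an octet with both low bits set would match every packet. In that case, test a different bit pattern or only the remote side.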

Analyze: Classify connections into different buckets

To analyze data in a captured file classifying connections into five buckets of 2-second duration each:

yconalyzer -r http.pcap -p 80 -n 5 -s 2000

Analyze: Connections where client sends less than 4K bytes

yconalyzer -r http.pcap -p 80 -y 4096

Analyze: Connections where server sends more than 8000 bytes

yconalyzer -r http.pcap -p 80 -X 8000

Analyze: Percentage of connections that lasted less than 400 msecs

yconalyzer -r http.pcap -p 80 -n1 -s400

Analyze: Details on all connections of duration between 25 and 50ms

yconalyzer -r http.pcap -p 80 -n3 -s25 -D25

Frequently Asked Questions

How does yconalyzer work?

Yconalyzer gathers the connection control packets of TCP (i.e. packets that have the SYN, FIN or RST bits set) and presents the information gathered. Users can choose to view connection data using the different options accepted by yconalyzer. Users may also choose to capture data using tcpdump. See Introduction.

Why does yconalyzer perform better than tcpdump?

It does not. It all depends on the filter used when running tcpdump. If tcpdump is used with a narrow enough filter, it will perform just as well. Yconalyzer has a built-in filter that is narrow enough to take up a minimal amount of CPU. See Performance Overhead.

Why is yconalyzer taking up so much CPU?

Yconalyzer works on the assumption that a machine spends less effort on connection-control activity than on other activities (database access, non-connection-control network activity, etc.). Not all machines are this way, though. As an example, HTTP proxy machines may not fit this model. In such cases, yconalyzer (or, for that matter, any packet capture software) needs to work a lot more. See Using additional filters.

Can I run yconalyzer on a file captured by tcpdump?

Certainly. You can have yconalyzer analyze any file captured by tcpdump -w. You can use the -r option of yconalyzer to analyze the file. See Introduction.

Does yconalyzer work on UDP-based protocols?

No. UDP does not have packets that clearly mark connection setup and tear down. Therefore, the only way UDP-based protocols can be analyzed is by capturing all packets. This may impose a significant performance overhead, depending on the application. If you have ideas, email me.

Bugs, Enhancements, Questions?

Please send all bugs, enhancements and queries to the appropriate forum for TCP Connection Analyzer.

Appendix I: TCP Basics

Transmission Control Protocol, specified in RFC 793, is a connection oriented protocol. The following diagram represents a typical TCP connection:
[Figure: Typical TCP packet exchange]
Note that ACK and DATA packets are represented separately in this diagram for simplicity. ACKs may be piggy-backed on data packets. Read the RFC for complete details.

TCP uses sequence numbers to identify each byte sent or received. The SYN packets have the initial sequence numbers chosen by either end. Each packet sent or received thereafter (by either side) has (logically) a sequence number identifying each byte in the packet.

Therefore, capturing packets that have the SYN or FIN bits set can give us a good bit of information about the connection, as in:
Yconalyzer uses this information to present information about connections.
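As one illustration of what the SYN and FIN packets reveal, the number of bytes a side sent can be estimated from the difference between the sequence numbers of its FIN and SYN packets. The sketch below assumes that the FIN segment carries no payload and that the shell uses 64-bit arithmetic; it is not taken from yconalyzer's source, and bytes_sent is a hypothetical helper name.

```shell
#!/bin/sh
# Bytes sent by one side, from the sequence numbers of its SYN and FIN.
# The SYN itself consumes one sequence number, hence the "- 1".
# Assumes the FIN segment carries no payload. bytes_sent is a
# hypothetical helper, not yconalyzer's code.
bytes_sent() {
    syn_seq=$1
    fin_seq=$2
    modulus=4294967296   # sequence numbers are 32-bit and wrap around
    echo $(( (fin_seq - syn_seq - 1 + modulus) % modulus ))
}

bytes_sent 1000 1501       # prints 500
bytes_sent 4294967290 10   # prints 15 (sequence space wrapped)
```

Since sequence numbers are 32 bits wide, this estimate is only meaningful for a side that sent less than 4 GiB on the connection. The connection duration can be estimated in a similar way from the capture timestamps of the first SYN and the last FIN.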
