tcpdump: reading the wire

I have spent the last two weeks using tcpdump seriously for the first time, and I want to write down what I have learned, because I think it is a tool everyone who works with networks should be able to use.

tcpdump is a packet capture program. It listens on a network interface and prints the headers, and optionally the contents, of every packet that passes by. It has been around since the late 1980s, and the command-line interface has barely changed since.

It sounds dry. It is, in practice, the single most clarifying tool in the whole Linux toolbox.

The basic usage

tcpdump normally has to run as root, because it asks the kernel for raw access to the network interface and, by default, puts that interface into promiscuous mode (the -p flag turns promiscuous mode off). The simplest invocation is:

tcpdump -i eth0

This prints, for every packet, a line that includes the timestamp, source and destination addresses, protocol, ports, flags, and a brief summary. It looks something like this:

14:23:01.234567 IP 192.0.2.1.54321 > 198.51.100.5.80: Flags [S], seq 100, win 5840, length 0

Read left to right: a TCP SYN packet (the [S] flag) from my machine, port 54321, going to port 80 on a remote host. The sequence number is 100, the advertised receive window is 5840 bytes, and length 0 means it carries no data yet.
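
To make the anatomy of that line concrete, here is a minimal Python sketch that splits it into named fields. The regular expression and the field names are assumptions of mine, fitted only to the one sample line above; real tcpdump output varies by protocol and version, so treat this as an illustration and not as a general parser.

```python
import re

# Pattern fitted to the sample line above only. Real tcpdump output has
# many more shapes (TCP options, ack numbers, non-TCP protocols).
LINE_RE = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+) IP "
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+): "
    r"Flags \[(?P<flags>[^\]]*)\], seq (?P<seq>\d+), "
    r"win (?P<win>\d+), length (?P<length>\d+)"
)

def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    # Convert the numeric fields from strings to integers.
    for key in ("sport", "dport", "seq", "win", "length"):
        fields[key] = int(fields[key])
    return fields

sample = ("14:23:01.234567 IP 192.0.2.1.54321 > 198.51.100.5.80: "
          "Flags [S], seq 100, win 5840, length 0")
packet = parse_line(sample)
print(packet["src"], packet["sport"], "->",
      packet["dst"], packet["dport"], "flags:", packet["flags"])
```

The last four dotted groups before the ">" are address plus port, which is the part of the format that trips most people up on first reading: tcpdump joins the port to the address with a dot, not a colon.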

Reading this fluently is a skill. It pays back enormously the moment you start getting unexpected behaviour from any application that uses the network.

Filters, which are the killer feature

The useful thing about tcpdump is not that it captures packets. It is that it captures the right packets. The filter language — Berkeley Packet Filter expressions — is small, regular, and lets you pick out exactly what you want.

A few examples I find myself using often:

# All TCP traffic to or from port 80, which by convention is HTTP
tcpdump -i eth0 'tcp port 80'

# DNS over UDP only; use 'port 53' to also catch DNS over TCP
tcpdump -i eth0 'udp port 53'

# All traffic involving a specific host
tcpdump -i eth0 'host 192.0.2.7'

# Just the initial SYN packets (SYN set, ACK clear), which is what a SYN scanner sends
tcpdump -i eth0 'tcp[tcpflags] & tcp-syn != 0 and tcp[tcpflags] & tcp-ack == 0'

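That last one is worth understanding. tcp[tcpflags] is the flags byte of the TCP header, and the expression keeps packets with the SYN bit set and the ACK bit clear, which is the first packet of every connection attempt. Port scanners typically work by sending exactly that packet to a target's ports and recording how each one answers: a SYN-ACK means the port is open, a RST means it is closed, and silence usually means something is filtering it. Ordinary clients send the same packet, so what marks a scan is one source trying many ports in a short time. If your firewall or your IDS is going to detect this, knowing how to filter for the same pattern by hand is the foundation.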
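
The SYN filter is nothing more than bit arithmetic on one byte of the TCP header. As a minimal sketch, here is the same test in Python. The constants are the standard TCP flag bits (tcpdump's tcp-syn and tcp-ack names stand for 0x02 and 0x10, and tcp[tcpflags] is byte 13 of the TCP header); the function name is my own invention.

```python
# Standard TCP flag bits, as found in byte 13 of the TCP header.
TCP_FIN = 0x01
TCP_SYN = 0x02
TCP_RST = 0x04
TCP_ACK = 0x10

def is_initial_syn(flags):
    """Mirror the filter: SYN bit set and ACK bit clear."""
    return (flags & TCP_SYN) != 0 and (flags & TCP_ACK) == 0

# A connection attempt: matches.
print(is_initial_syn(TCP_SYN))
# The SYN-ACK an open port sends back: does not match.
print(is_initial_syn(TCP_SYN | TCP_ACK))
# The RST-ACK a closed port sends back: does not match.
print(is_initial_syn(TCP_RST | TCP_ACK))
```

The second half of the test matters. Without the ACK check, the filter would also pick up every SYN-ACK reply, and you would be counting the target's answers along with the scanner's questions.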

What I have actually used it for

Three things, mostly.

First, debugging my own firewall rules. When something I expected to work was not working, tcpdump -i ppp0 'host my-server' told me whether the packet was even arriving, and if so what was happening to it. About half the time the answer was "the packet is arriving but the response is not being generated", which pointed me at an application config issue rather than a network one.
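
The "arriving but never answered" check can be mechanised. What follows is a minimal sketch, not what tcpdump itself does: it assumes the capture has already been reduced to hypothetical (src, sport, dst, dport, flags) tuples, with flags being the string tcpdump prints inside Flags [...], and it treats any packet in the reverse direction as a response.

```python
def unanswered_syns(packets):
    """Return the SYNs for which nothing came back the other way.

    Each packet is a (src, sport, dst, dport, flags) tuple. The tuple
    layout is an assumption made for this sketch.
    """
    pending = {}  # (src, sport, dst, dport) -> first SYN awaiting a reply
    for pkt in packets:
        src, sport, dst, dport, flags = pkt
        if flags == "S":
            # Keep only the first SYN; retransmissions are ignored.
            pending.setdefault((src, sport, dst, dport), pkt)
        else:
            # Any packet in the reverse direction counts as a response.
            pending.pop((dst, dport, src, sport), None)
    return list(pending.values())

# Made-up capture using documentation addresses: port 80 answers,
# port 25 never does.
capture = [
    ("192.0.2.1", 54321, "198.51.100.5", 80, "S"),
    ("198.51.100.5", 80, "192.0.2.1", 54321, "S."),
    ("192.0.2.1", 54322, "198.51.100.5", 25, "S"),
]
for syn in unanswered_syns(capture):
    print("no reply to", syn)
```

A SYN that shows up in this list on the server's own interface means the network delivered the packet and the machine stayed silent, which is the case that points at the application or the local firewall and away from the path in between.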

Second, watching what my own machine actually does in the background. There is a remarkable amount of traffic that a default Linux box generates that I did not know about — DNS lookups, ARP requests, broadcasts. None of it is malicious. A lot of it is information leakage I had not thought about.

Third, when DTK caught some interactive sessions on its fake FTP service, I had tcpdump running in parallel, recording the actual packet stream. The session reconstructed from the packet capture told me more about the attacker's tooling than the DTK log alone.

A note on legality and ethics

tcpdump will, by default, capture every packet on the wire that your machine can see, including, on shared media, packets that are not addressed to your machine. On a switched network that is mostly your own traffic plus broadcasts. On a hub or a wireless bridge it can be other people's. You should never run it in promiscuous mode on a network where you do not have authorisation, and it is worth thinking carefully about what "authorisation" means in any environment that is not entirely yours.

What I keep returning to

The single most important thing tcpdump has taught me is that the network, for all its abstraction, is ultimately a stream of well-defined events. Every problem you have on the network is, in principle, visible in the packet stream. Every claim a piece of software makes about what it is doing can be checked against what it is actually doing.

This is exactly the property that makes intrusion detection possible at all. If the events on the wire are observable, then in principle they can be analysed for patterns of attack. That is what Marty Roesch's new IDS, Snort, does. And Snort is the thing I want to write about next.

