Reading SYN cookie code in detail

I mentioned SYN cookies briefly in February when I first read the Linux network stack, and again more recently when I wrote about flood defences. They are clever enough that they deserve a post of their own. The implementation is short, the idea is elegant, and reading the actual code is the cheapest way to develop a real understanding of an idea everyone half-remembers from textbooks.

This post walks through the SYN cookie code in Linux 2.2, specifically net/ipv4/syncookies.c. The code is about 200 lines. The interesting part is much shorter.

What problem SYN cookies solve

A TCP connection establishes itself with a three-way handshake. The client sends SYN. The server replies with SYN-ACK and allocates a small data structure to remember the half-open connection. The client sends ACK, completing the handshake.

Between steps two and three, the server is holding state for a connection that is not yet fully established. This state is in a finite-size queue. If the queue fills up, new SYNs cannot be added, and legitimate clients cannot complete the handshake.

A SYN flood attack exploits this. The attacker sends SYNs as fast as possible, with spoofed source addresses so the eventual SYN-ACK goes to a non-existent host. The server replies with SYN-ACK and allocates state. The attacker never sends ACK. The state sits, taking up queue space, until it times out — typically tens of seconds. By then the attacker has sent many more SYNs.

The queue fills. Legitimate clients are refused. The server is denied service.
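To make the failure mode concrete, here is a minimal sketch of a fixed-size half-open queue under flood. Everything in it is my own illustration: the names (syn_queue, syn_queue_add, flood) and the tiny BACKLOG value are hypothetical and are not taken from the kernel, which also expires entries on a timer that this toy leaves out.

```c
#include <assert.h>
#include <stddef.h>

#define BACKLOG 4   /* hypothetical queue size, tiny on purpose */

struct half_open {
    unsigned int   saddr;   /* claimed source address */
    unsigned short sport;   /* claimed source port */
    int            in_use;  /* slot holds a pending handshake */
};

struct syn_queue {
    struct half_open slot[BACKLOG];
};

/* Returns 1 if state was allocated for the SYN, 0 if the queue was full. */
static int syn_queue_add(struct syn_queue *q, unsigned int saddr,
                         unsigned short sport)
{
    assert(q != NULL);
    for (size_t i = 0; i < BACKLOG; i++) {
        if (!q->slot[i].in_use) {
            q->slot[i].saddr = saddr;
            q->slot[i].sport = sport;
            q->slot[i].in_use = 1;
            return 1;
        }
    }
    return 0;   /* no free slot: the SYN is dropped */
}

/* Send n SYNs from distinct spoofed addresses, none of which ever
 * completes. Returns how many of them were given a slot. */
static int flood(struct syn_queue *q, int n)
{
    int accepted = 0;
    for (int i = 0; i < n; i++)
        accepted += syn_queue_add(q, 0x0a000000u + (unsigned int)i, 1024);
    return accepted;
}
```

After flood() has run against a zero-initialised queue, only BACKLOG of the spoofed SYNs hold slots, and the next call to syn_queue_add() for a legitimate client returns 0.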

The clever idea

The SYN cookie idea, due originally to Dan Bernstein, asks: what if the server did not allocate state at all when a SYN arrived?

The trick is to construct the SYN-ACK such that the eventual ACK can be verified without remembering anything from the SYN. All the necessary information is encoded into the initial sequence number that the server includes in its SYN-ACK.

If you can do this, the queue exhaustion attack stops working. The server has no queue to exhaust.

The question is: can you encode enough information into a 32-bit initial sequence number to verify a future ACK? Surprisingly, the answer is yes — at the cost of some lost functionality and a clever construction.

What gets encoded

The initial sequence number is 32 bits. The SYN cookie scheme uses these bits as:

Top 5 bits: a counter that ticks every minute. Used to verify the cookie is recent.
Next 3 bits: an encoding of the MSS (maximum segment size) the server selected. There are eight possible values it can choose from a fixed table.
Bottom 24 bits: a cryptographic MAC of the connection identifiers (source IP, source port, destination IP, destination port) plus the counter, computed using a server-side secret.
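The 5/3/24 split above can be sketched as a pair of pack and unpack helpers. The function names are hypothetical and exist only for this illustration; the shifts and masks follow directly from the bit widths in the list.

```c
#include <assert.h>
#include <stdint.h>

/* Pack counter (5 bits), MSS index (3 bits) and MAC (24 bits) into
 * one 32-bit value. The counter and MAC are truncated to fit. */
static uint32_t cookie_pack(uint32_t counter, uint32_t mss_index, uint32_t mac)
{
    assert(mss_index < 8);              /* only 3 bits available */
    return ((counter & 0x1Fu) << 27)    /* bits 31..27 */
         | (mss_index << 24)            /* bits 26..24 */
         | (mac & 0x00FFFFFFu);         /* bits 23..0  */
}

static uint32_t cookie_counter(uint32_t c)   { return c >> 27; }
static uint32_t cookie_mss_index(uint32_t c) { return (c >> 24) & 0x07u; }
static uint32_t cookie_mac(uint32_t c)       { return c & 0x00FFFFFFu; }
```

For example, cookie_pack(37, 5, 0xABCDEF12) gives 0x2DCDEF12: the counter is stored modulo 32 (37 becomes 5), and only the low 24 bits of the MAC survive.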

The MAC is a truncated SHA-1 (or in some implementations, MD5) of the concatenation. The server-side secret is regenerated periodically and never sent on the wire.

When the server receives an ACK that completes a handshake, it can:

Take the acknowledgement number and subtract 1. The client acknowledges the server's ISN plus 1, so this recovers the ISN the server put in its SYN-ACK.
Extract the counter, MSS index, and MAC.
Verify the counter is recent enough to be valid.
Recompute the MAC from the connection identifiers and the secret.
Compare the computed MAC to the embedded one.

If they match, the ACK comes from a host that actually received the server's SYN-ACK, which means the server saw that host's SYN and the source address was not spoofed, even though the server did not remember the SYN. The connection can proceed.
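The generate-and-verify round trip can be sketched end to end. This is a toy under loud assumptions: toy_mac is an FNV-1a style mixer I am using as a stand-in and is not cryptographically secure, the names make_isn and check_ack are hypothetical, and the layout is the 5/3/24 split from the description rather than whatever a given kernel uses.

```c
#include <assert.h>
#include <stdint.h>

/* Toy keyed mixer standing in for the real MAC. NOT secure: a real
 * implementation would use a cryptographic hash here. */
static uint32_t toy_mac(uint32_t secret, uint32_t saddr, uint32_t daddr,
                        uint16_t sport, uint16_t dport, uint32_t counter)
{
    uint32_t words[5] = { secret, saddr, daddr,
                          ((uint32_t)sport << 16) | dport, counter };
    uint32_t h = 2166136261u;                 /* FNV-1a offset basis */
    for (int i = 0; i < 5; i++) {
        for (int b = 0; b < 4; b++) {
            h ^= (words[i] >> (8 * b)) & 0xFFu;
            h *= 16777619u;                   /* FNV-1a prime */
        }
    }
    return h & 0x00FFFFFFu;                   /* keep 24 bits */
}

/* Build the ISN for the SYN-ACK. No per-connection state is stored. */
static uint32_t make_isn(uint32_t secret, uint32_t saddr, uint32_t daddr,
                         uint16_t sport, uint16_t dport,
                         uint32_t now_minutes, uint32_t mss_index)
{
    uint32_t counter = now_minutes & 0x1Fu;
    assert(mss_index < 8);
    return (counter << 27) | (mss_index << 24)
         | toy_mac(secret, saddr, daddr, sport, dport, counter);
}

/* Validate the final ACK. Returns the MSS index (0..7) on success and
 * -1 on failure. Caveat of this toy: only the 5-bit counter is hashed,
 * so a cookie looks fresh again after 32 minutes. */
static int check_ack(uint32_t secret, uint32_t ack_number,
                     uint32_t saddr, uint32_t daddr,
                     uint16_t sport, uint16_t dport,
                     uint32_t now_minutes, uint32_t max_age)
{
    uint32_t isn     = ack_number - 1;        /* client acked ISN + 1 */
    uint32_t counter = isn >> 27;
    uint32_t mss_idx = (isn >> 24) & 0x07u;
    uint32_t age     = (now_minutes - counter) & 0x1Fu;  /* modulo 32 */

    if (age > max_age)
        return -1;                            /* too old */
    if ((isn & 0x00FFFFFFu) !=
        toy_mac(secret, saddr, daddr, sport, dport, counter))
        return -1;                            /* MAC mismatch */
    return (int)mss_idx;
}
```

The point of the sketch is that check_ack takes only the packet fields, the secret and the clock as input. Nothing from the original SYN has to be looked up.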

Reading the actual code

The key function is secure_tcp_syn_cookie():

static __u32 secure_tcp_syn_cookie(__u32 saddr, __u32 daddr,
                                   __u16 sport, __u16 dport,
                                   __u32 sseq, __u32 count, __u32 data)
{
    __u32 hash[4];

    /* combine the addresses, ports, count and data */
    hash[0] = saddr;
    hash[1] = daddr;
    hash[2] = ((__u32)sport << 16) + dport;  /* widen first: a bare __u16 promotes to signed int */
    hash[3] = count + data;

    /* mix with the secret */
    syncookie_hash(hash, syncookie_secret);

    /* assemble the cookie: counter on top, then the MSS index, then the low bits of the hash */
    return (count << 24)            /* counter, 8 bits */
         | ((data & 0x07) << 21)    /* MSS index, 3 bits */
         | (hash[0] & 0x001FFFFF);  /* MAC, 21 bits */
}

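This is more or less what I just described, with one difference worth noticing: this listing gives the counter 8 bits and the MAC 21, where the layout described earlier gave them 5 and 24. The MSS index takes 3 bits in both. The structure is straightforward; the cleverness is in the choice of bit layout and in the discipline of never trusting the cookie alone.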

The verification function is the inverse:

static __u32 check_tcp_syn_cookie(__u32 cookie, __u32 saddr, __u32 daddr,
                                  __u16 sport, __u16 dport, __u32 sseq)
{
    __u32 count = ...;
    __u32 mssind = (cookie >> 21) & 0x07;

    /* recompute the cookie and compare */
    __u32 expected = secure_tcp_syn_cookie(saddr, daddr, sport, dport, sseq, count, mssind);
    if (expected != cookie) return (__u32)-1;  /* failure sentinel; the return type is unsigned */

    /* check the counter is recent */
    if (count < ...time-window) return (__u32)-1;

    /* return the MSS the server originally chose */
    return msstab[mssind];
}

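The MAC is the gating check. Without the secret, an attacker cannot compute a valid cookie for connection identifiers of their choosing. The only ways to get one are to guess the MAC bits or to receive a SYN-ACK the server really sent, and receiving it requires a source address that is not spoofed.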

What is lost

A few things, none catastrophic.

TCP options are lost. The original SYN may have included options — selective acknowledgement, window scaling, timestamps. These are normally remembered along with the SYN; with cookies, they are not. The connection has to fall back to defaults for these options, which may slightly hurt performance.

The MSS is constrained. The server can only choose from eight predefined MSS values, because only three bits encode the choice. This is fine for almost all real traffic.
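A sketch of how a three-bit index maps to an MSS and back. The table values below are hypothetical, chosen for illustration, and may not match the kernel's msstab; the helper names are mine as well. The rule sketched here is to pick the largest table entry that does not exceed what the client advertised.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical table in ascending order; the kernel's values may differ. */
static const uint16_t mss_table[8] = {
    64, 256, 512, 536, 1024, 1440, 1460, 4312
};

/* Largest entry not exceeding the client's MSS. Falls back to index 0
 * when the client advertised less than the smallest entry. */
static unsigned int mss_to_index(uint16_t client_mss)
{
    unsigned int idx = 0;
    for (unsigned int i = 0; i < 8; i++) {
        if (mss_table[i] <= client_mss)
            idx = i;
    }
    return idx;
}

/* What the server recovers from the three bits in the cookie. */
static uint16_t index_to_mss(unsigned int idx)
{
    assert(idx < 8);
    return mss_table[idx];
}
```

With this table a client advertising 1460 gets 1460 back, while one advertising 1400 is rounded down to 1024. That rounding is the cost of having only eight choices.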

The counter has limited resolution. The counter ticks every minute; cookies older than the counter window are rejected. This is fine for almost all real handshakes, which complete in milliseconds.

These trade-offs are acceptable in the SYN-flood-mitigation context. SYN cookies are typically enabled only when the SYN queue is filling — so a normal connection sees the standard handshake, and the cookie path is taken only when the alternative is no service at all.
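The fallback policy is simple enough to sketch as a single decision function. The enum and the function name are hypothetical, and the real kernel path has more conditions than this; the sketch only captures the rule that cookies are used when the queue has no room.

```c
#include <assert.h>

enum syn_action {
    SYN_QUEUE,   /* normal path: allocate half-open state */
    SYN_COOKIE,  /* stateless path: answer with a cookie ISN */
    SYN_DROP     /* queue full and cookies disabled */
};

/* Decide what to do with an incoming SYN. */
static enum syn_action on_syn(unsigned int queue_len, unsigned int queue_max,
                              int cookies_enabled)
{
    assert(queue_len <= queue_max);
    if (queue_len < queue_max)
        return SYN_QUEUE;
    return cookies_enabled ? SYN_COOKIE : SYN_DROP;
}
```

Under this rule a lightly loaded server never takes the cookie path, so normal connections keep their TCP options.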

What I take from reading this

The specific clever thing here is the use of cryptographic state-on-the-wire to avoid state-in-memory. This is a pattern that shows up elsewhere: JWTs, signed cookies, pre-computed authentication tokens. The shape of the trick is general.

The second thing is that, having read the code, I now understand why SYN cookies are a partial defence rather than a complete one. They prevent queue exhaustion. They do not prevent a flood from saturating the network or the kernel's ability to process packets. A SYN cookie host under a sufficiently large flood still fails — just for different reasons than the queue overflow.

The third thing is the engineering virtue of small, focused functions. The Linux SYN cookie code is short. Each function does one thing. The naming is honest. The bit-fiddling is commented. This is what good kernel code looks like; it is, by and large, what kernel code does look like.

Reading kernel source for an evening is, hour for hour, the highest-leverage activity I do. There are not many domains where two hours of reading produces this much improvement in mental model. Defence-related kernel code is one of them. I would recommend it to anyone working in this space.