The University of Minnesota DDoS: a category-defining incident

Last week, the University of Minnesota's network was hit by a Trinoo-style attack. A single target host — reportedly a server in the College of Engineering — was knocked off the network for over two days. The attack came from at least 227 compromised hosts, of which 114 were on Internet2.

This is, as far as the public record shows, the first widely-publicised distributed denial-of-service (DDoS) attack. It is precisely the scenario I wrote about in June (and, more loosely, in January), now with the abstraction filled in by hard reality.

What we know about the attack

The public reporting is sparse — universities are understandably cautious about details — but the broad outline is:

  • The target was a single server. The attacking hosts numbered in the hundreds.
  • The attack used UDP flooding, consistent with Trinoo's documented capabilities.
  • The aggregate bandwidth was sufficient to saturate not just the target's link but, at times, the upstream segment serving the campus.
  • The attack persisted for over two days. The attacker rotated in fresh hosts as defenders identified active sources and contacted the operators of the compromised machines to take them offline.
  • Some of the attacking hosts were inside Internet2, the high-bandwidth research network. The bandwidth available to those compromised hosts was, by 1999 standards, enormous.

This last point is the structural problem. The attacking hosts collectively had more bandwidth available than the target's site could possibly absorb. There was no defensive configuration the target could apply at its own edge that would solve the problem.
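
To make "could possibly absorb" concrete, here is the back-of-envelope arithmetic. The host counts are from the public reporting; the per-host rates are my assumptions, chosen to be conservative rather than anything that was measured:

    # Back-of-envelope aggregate flood bandwidth. Host counts are from the
    # public reporting; per-host rates are assumed, not measured.
    commodity_hosts = 113        # attacking hosts outside Internet2 (227 - 114)
    internet2_hosts = 114       # attacking hosts on Internet2 (reported)
    commodity_rate_mbps = 0.5   # assumed sustained UDP output, ordinary host
    internet2_rate_mbps = 5.0   # assumed output, well-connected research host

    aggregate_mbps = (commodity_hosts * commodity_rate_mbps
                      + internet2_hosts * internet2_rate_mbps)
    uplink_mbps = 45             # a DS-3, a generous campus uplink

    print("flood: %.1f Mbit/s against a %d Mbit/s uplink"
          % (aggregate_mbps, uplink_mbps))
    # flood: 626.5 Mbit/s against a 45 Mbit/s uplink

Even with rates this modest, the flood is more than an order of magnitude over the uplink. No edge configuration survives that.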

What the response looked like

From the accounts I have read, the operational response was essentially manual:

  1. The target's operators identified the attack pattern and confirmed it was a flood from many sources.
  2. They started identifying the source IPs and contacting the operators of those networks.
  3. The contacted operators investigated their hosts, found the Trinoo daemons, and removed them.
  4. As hosts were cleaned up, the attack volume dropped; as the attacker presumably commanded fresh hosts to start flooding, it climbed back up.
  5. After about 48 hours, enough sources had been cleaned that the attack effectively ended.

The tools used in the response were the telephone, email, and the manual application of access lists at the upstream level. There was no automation. There was no pre-existing inter-network coordination mechanism. Every conversation with every operator was an individual phone call.
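
Step 2, at least, need not be manual. A few lines of Python over raw tcpdump output will rank the sources worst-first, so the phone calls can start with the heaviest offenders. A sketch, assuming the line format of tcpdump -n -q (older tcpdump versions omit the "IP" prefix; adjust the pattern to taste):

    #!/usr/bin/env python3
    # Rank attacking source IPs from tcpdump -n -q output, worst-first.
    import re
    import sys
    from collections import Counter

    # tcpdump -n -q prints lines like:
    #   12:34:56.789 IP 10.1.2.3.1024 > 192.0.2.7.9: UDP, length 1024
    SRC = re.compile(r"IP (\d+\.\d+\.\d+\.\d+)\.\d+ >")

    counts = Counter()
    for line in sys.stdin:
        m = SRC.search(line)
        if m:
            counts[m.group(1)] += 1

    for ip, n in counts.most_common(20):
        print("%10d  %s" % (n, ip))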

This is roughly how anyone would have predicted the response to look two years ago. It is also clearly inadequate for any future where this kind of attack becomes routine.

What needs to exist

A few things that this incident has made obvious.

A real-time inter-ISP communication channel for incident response. Some kind of authenticated, low-latency way for operators to share "these IPs are attacking me" with their peers. Today, the channel is phone calls to people you happen to know. Tomorrow, it needs to be a structured protocol with verifiable provenance.
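
To be concrete about what "structured protocol with verifiable provenance" might mean, here is a minimal sketch. Everything in it, the message format and the shared-key signing scheme included, is invented for illustration; nothing like it exists today:

    # A hypothetical signed attack report: "these IPs are attacking me",
    # with enough provenance that a peer can check who sent it.
    # Format and shared-key scheme are illustration, not any standard.
    import hashlib
    import hmac
    import json
    import time

    def make_report(attacker_ips, target, shared_key):
        body = json.dumps({
            "reporter": "ops@example.net",   # hypothetical identity
            "target": target,
            "attackers": sorted(attacker_ips),
            "issued": int(time.time()),
        }, sort_keys=True).encode()
        sig = hmac.new(shared_key, body, hashlib.sha256).hexdigest()
        return body + b"\n" + sig.encode()

    def verify_report(report, shared_key):
        body, sig = report.rsplit(b"\n", 1)
        good = hmac.new(shared_key, body, hashlib.sha256).hexdigest().encode()
        if not hmac.compare_digest(sig, good):
            raise ValueError("bad signature: not from a known peer")
        return json.loads(body)

A shared key per pair of peers is the crudest possible trust model, and a real protocol would want something better. But even this beats a phone call to a stranger at 3 a.m.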

Coordinated upstream filtering. When the attack reaches the target's edge, it has already crossed the carrier networks. The most effective place to filter is at the carrier — but the carrier needs to know what to filter. A standard for ad-hoc filter requests, with mutual authentication, is what is needed. It does not exist.
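
Continuing the sketch above: once a filter request verifies, turning it into something a carrier router can apply is mechanical. The output below is Cisco-style extended access-list text; the ACL number is arbitrary:

    # From a verified report (verify_report, above) to upstream filters.
    # Emits Cisco-style extended ACL entries; 150 is an arbitrary number.
    def acl_lines(report, acl_number=150):
        lines = ["access-list %d deny udp host %s host %s"
                 % (acl_number, ip, report["target"])
                 for ip in report["attackers"]]
        lines.append("access-list %d permit ip any any" % acl_number)
        return lines

The hard part is not the translation; it is the trust and the liability, which is why this does not exist yet.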

Better detection of compromised hosts before they are used. The 227 attacking hosts in the Minnesota attack were compromised days or weeks before the attack itself. If their operators had been able to detect the compromises — through outbound traffic analysis, through honeypot stings, through behavioural anomaly detection — the attack tools would not have had a substrate to run on.
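
For the known tools, at least, outbound traffic analysis can be mundane. Trinoo's control traffic uses published port numbers, and screening flow records for them takes a few lines. A sketch, with the input format assumed to be one flow per line as "src dst proto dport" (adapt to whatever a real collector emits):

    #!/usr/bin/env python3
    # Flag hosts exchanging traffic on Trinoo's published control ports.
    # Input format ("src dst proto dport" per line) is an assumption.
    import sys

    TRINOO_PORTS = {
        ("tcp", 27665),   # attacker -> master
        ("udp", 27444),   # master -> daemon
        ("udp", 31335),   # daemon -> master
    }

    for line in sys.stdin:
        src, dst, proto, dport = line.split()
        if (proto, int(dport)) in TRINOO_PORTS:
            print("suspect control traffic: %s -> %s %s/%s"
                  % (src, dst, proto, dport))

This catches only tools whose ports are known. The next generation will randomise them, which is why behavioural detection matters more in the long run.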

Better protocols at the IP layer. Source-address validation (egress filtering) is the most cited intervention. There are others: backscatter analysis to measure spoofed-source flooding across the wider network, traceback schemes to work back from a flood, hop by hop, to its actual sources, packet-marking proposals that would give packets verifiable provenance. None of these is deployed at scale. Most are in early research.
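
The logic of source-address validation, for what it is worth, fits in one function, which is part of why its non-deployment rankles. A sketch, with 192.0.2.0/24 standing in for a site's own prefix:

    # Egress filtering in one function: a packet may leave the network
    # only if its source belongs to a prefix we originate. Deployments
    # put this in the edge router's filter; the decision is exactly this.
    import ipaddress

    OUR_PREFIXES = [ipaddress.ip_network("192.0.2.0/24")]  # example prefix

    def may_leave(src_ip):
        src = ipaddress.ip_address(src_ip)
        return any(src in net for net in OUR_PREFIXES)

    assert may_leave("192.0.2.17")      # legitimate local source: forward
    assert not may_leave("10.9.8.7")    # spoofed source: drop at the edge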

All of these are years away from being operational realities. The attack tools, meanwhile, are now demonstrably effective.

What an individual operator can do today

For my own scale of operation — a single uplink, modest budget — the answer is essentially: not much.

I can ensure my own hosts are not compromised, so that I am not contributing to the problem. I have applied the egress filtering advice to my own network. I have audited my hosts for known Trinoo daemon binaries (no hits). I run Snort with rules that flag outbound traffic matching Trinoo's command channel.
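
One cheap audit trick worth recording: try to bind Trinoo's known ports. If the bind fails, something on the host is already listening there and deserves a close look. A clean bind proves nothing, so this is a tripwire rather than a scanner:

    # Probe for Trinoo daemons by trying to bind their published ports.
    # A failed bind means something already listens there: investigate.
    # A clean bind proves little; daemons can be rebuilt on other ports.
    import socket

    for proto, port in (("udp", 27444), ("udp", 31335), ("tcp", 27665)):
        kind = socket.SOCK_DGRAM if proto == "udp" else socket.SOCK_STREAM
        s = socket.socket(socket.AF_INET, kind)
        try:
            s.bind(("0.0.0.0", port))
            print("%s/%d: free" % (proto, port))
        except OSError:
            print("%s/%d: IN USE -- investigate" % (proto, port))
        finally:
            s.close()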

If someone targeted my own infrastructure with a Minnesota-scale attack, I would lose. My uplink would saturate; my upstream might or might not help; the attack would continue until the attacker decided it was done. I have no defence against this. Most operators do not.

What I think this incident does to the threat model

A few things, in increasing order of consequence.

It establishes the category. Distributed denial of service is no longer a theoretical concern. It is a thing that has happened, that knocks targets offline, and that is hard to defend against. The next attack of this kind will not be a surprise.

It will be repeated, and the bar will rise. Now that the technique is publicly demonstrated, more attackers will use it. The host counts will increase. The bandwidths will increase. The durations will increase. The time available to build serious response infrastructure is, frankly, shorter than the time that building it will take.

It changes what 'compromise' means at the host level. Until last week, a compromised host was mostly bad for its own operator: a backdoor on the machine, lost data, stolen credentials. Going forward, a compromised host is also a weapon, a piece of someone else's attack arsenal. This changes the externality of poor host security; my carelessness about my hosts is now your problem, not just mine.

It will, eventually, force protocol-level changes. The question is whether the changes happen by deliberate design or by emergency response under sustained attack pressure. The former is harder politically; the latter is harder operationally. Neither is fast.

For the rest of 1999, I expect to see at least one more high-profile incident of this kind, possibly several. By next year, I would not be surprised to see commercial sites taken down by the same technique. The defensive infrastructure is going to be playing catch-up for a long time.

Writing this on the Sunday following the Minnesota incident, with the immediate noise dying down. The longer-term consequences are going to dominate the discipline for years.

