Four months on from the Honeynet Project's formal announcement, the first wave of research output is starting to appear. The Project has been delivering on its promise (public papers, public tools, organised collaboration) faster than I had expected.
This is a short note on what has been published and what it changes for practitioners.
The first "Know Your Enemy" paper
The Project's flagship series, Know Your Enemy, has just published its first major paper. It is close to what I had been hoping for: a structured analysis of attacker behaviour, drawn from honeynet captures across several operators, with specific examples and statistical summaries.
The headline findings are roughly consistent with what my own honeypot data has shown:
- Most compromise attempts come from automated tooling.
- The threat-actor population has wide variance in skill.
- Outbound network restrictions disrupt most post-compromise activities.
- Persistence techniques are deployed by a minority of attackers but are operationally significant when they are used.
The paper goes substantially further than I have, with specific captures, dwell-time statistics, and tooling fingerprints. The level of detail is appropriate for a research publication and is, I think, useful to defenders.
The writing style is honest about uncertainty. The paper is careful to note that the data is from a particular sample of honeynets and may not generalise to all defensive postures. This is the kind of calibrated humility I value in technical writing.
The new tooling
The Project has released a few specific tools:
snort_inline, a modified Snort that can drop traffic in addition to alerting. The intended use is on the boundary of a honeynet — letting suspicious traffic into the honeynet but blocking outbound attacks from compromised hosts to third parties. This is exactly the kind of network-containment problem I have been working on with my own honeypot v2; the inline-Snort approach is more sophisticated than my hand-rolled firewall rules.
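To make the containment idea concrete, here is a minimal sketch of the boundary decision in Python. It is not snort_inline's rule language or its implementation; the address range, the signatures, and the names `HONEYNET`, `ATTACK_SIGNATURES`, `Packet` and `decide` are all hypothetical, chosen only to illustrate "let it in, stop it attacking others".

```python
import ipaddress
from dataclasses import dataclass

# Hypothetical honeynet address range and payload signatures (illustration only).
HONEYNET = ipaddress.ip_network("10.0.0.0/24")
ATTACK_SIGNATURES = [b"/bin/sh", b"cmd.exe"]


@dataclass
class Packet:
    src: str
    dst: str
    payload: bytes


def decide(pkt: Packet) -> str:
    """Return 'pass' or 'drop' for one packet crossing the honeynet boundary."""
    src_inside = ipaddress.ip_address(pkt.src) in HONEYNET
    dst_inside = ipaddress.ip_address(pkt.dst) in HONEYNET

    # Inbound traffic is let in, however suspicious, so it can be observed.
    if dst_inside and not src_inside:
        return "pass"

    # Outbound traffic from a honeypot is dropped if it matches a signature.
    if src_inside and not dst_inside:
        if any(sig in pkt.payload for sig in ATTACK_SIGNATURES):
            return "drop"
        return "pass"

    # Internal or transit traffic is left alone in this sketch.
    return "pass"
```

The asymmetry is the point: the same payload that passes on the way in is dropped on the way out. A real deployment would presumably also want rate limits on outbound connections, which this sketch leaves out.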
Sebek, a Linux kernel module that captures keystrokes and command output inside the kernel and transmits them through the host's network adapter in a covert way that is not visible to standard Linux network tools. The capture goes to a separate sensor outside the host. The premise: even if an attacker installs a kernel rootkit, the keystroke data has already left the host before the rootkit can hide it.
Sebek is operationally clever. The module hooks the kernel's read syscall and writes copies of every read into network packets sent to a configurable destination. The packets are constructed to look like ordinary network traffic and are not visible to tcpdump or netstat on the source host. From the attacker's perspective, the host is not communicating; from the sensor's perspective, every keystroke is captured.
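The copy-every-read idea can be sketched in user space. This is only an analogy: the real module lives in the kernel and hides its traffic, and nothing here does either. The record layout (`HEADER`), the chunk limit, and the function names are my own assumptions, not Sebek's actual wire format or API.

```python
import os
import socket
import struct

# Hypothetical record header: pid, file descriptor, payload length.
# This is NOT Sebek's real packet format.
HEADER = struct.Struct("!IIH")
MAX_CHUNK = 1024  # keep each record well inside one UDP datagram


def pack_record(pid: int, fd: int, data: bytes) -> bytes:
    """Prefix the captured bytes with a small fixed-size header."""
    return HEADER.pack(pid, fd, len(data)) + data


def unpack_record(datagram: bytes):
    """Sensor side: recover (pid, fd, data) from one datagram."""
    pid, fd, length = HEADER.unpack_from(datagram)
    data = datagram[HEADER.size:HEADER.size + length]
    return pid, fd, data


def make_udp_socket() -> socket.socket:
    """Plain UDP socket used to ship records to the sensor."""
    return socket.socket(socket.AF_INET, socket.SOCK_DGRAM)


def logged_read(fd: int, n: int, sock, sensor_addr) -> bytes:
    """Read from fd as usual, sending a copy of the data to the sensor first."""
    data = os.read(fd, min(n, MAX_CHUNK))
    if data:
        sock.sendto(pack_record(os.getpid(), fd, data), sensor_addr)
    return data
```

The caller gets exactly what an ordinary read would return; the copy has already been handed to the socket by then, which is the property that makes later log tampering on the host irrelevant.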
Honeyd, Niels Provos's tool for emulating multiple virtual hosts on a single machine. Earlier versions have existed; the Project's release is the first widely-tested version. Honeyd lets a single physical host act as a network of decoys at the IP level, with each decoy responding to scans as if it were a real host. The deployment cost of large-scale honeypot networks drops substantially.
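As a rough illustration of what "a network of decoys on one host" means, here is a toy model in Python. It is not Honeyd's configuration syntax or code; the template names, addresses, and the `probe_response` function are hypothetical, and a real emulator would also imitate each operating system's TCP/IP stack behaviour rather than just port state.

```python
# Hypothetical decoy templates: the TCP ports each emulated host type exposes.
TEMPLATES = {
    "linux-web": {22, 80},
    "windows-file": {139, 445},
}

# Hypothetical bindings from decoy IP address to template name.
BINDINGS = {
    "10.0.0.11": "linux-web",
    "10.0.0.12": "windows-file",
}


def probe_response(dst_ip: str, dst_port: int):
    """Answer a TCP SYN probe the way the decoy bound to dst_ip would."""
    template = BINDINGS.get(dst_ip)
    if template is None:
        return None  # no decoy at this address: stay silent
    if dst_port in TEMPLATES[template]:
        return "SYN-ACK"  # port appears open
    return "RST"  # port appears closed
```

Adding a decoy in this model is one more dictionary entry, which is the sense in which the cost of a large decoy network drops.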
What I am going to use
For my own honeypot v2 setup:
Sebek, as soon as I can deploy it. The keystroke-capture-via-kernel approach is more reliable than my current logging discipline, which depends on the host's own syslog being intact. Deploying Sebek requires building and loading the module and setting up a network destination; I expect it to be a weekend project.
snort_inline, on the firewall between the honeypot and the wider internet. The current firewall is a hand-rolled set of ipchains rules; replacing them with a Snort-based inline filter would let me write rules in Snort's familiar syntax and would simplify maintenance.
Honeyd, eventually, for expanding from a single honeypot to a small honeynet. The single-IP observation is informative; the multi-IP aggregate would be more so. Probably a 2001 project for me.
What this changes for the broader field
Three observations.
The barrier to running honeynets has dropped substantially. A year ago, building a high-interaction honeypot was a multi-month engineering project. With the Project's tools, it is now a few-evenings exercise. The population of honeynet operators will grow correspondingly.
The data corpus will grow. As more operators deploy honeynets and contribute observations, the aggregate dataset becomes richer. A pattern seen in a single honeynet is an observation; the same pattern seen across hundreds of honeynets is a measurement.
The norms are forming. What counts as appropriate disclosure of honeynet captures, what data is shareable, what cooperation with law enforcement looks like — these were open questions at the start of 2000. The Project's leadership is establishing community norms by example. Future operators have a reference for what is and is not acceptable.
A small reflection
The Project is operating, in 2000, the way I had hoped academic-and-practitioner cooperation in security would work. Open output, careful methodology, sustained commitment, slow but real progress on hard problems.
This is not the dominant model in security research. The commercial security industry produces more output, faster, but with less rigour and less openness. Academic security research produces more rigour but less practical relevance. The Honeynet Project occupies a useful middle space: empirical, careful, and operationally relevant.
For the next few years I expect the Project to continue to be one of the most useful places to read about defensive practice. Anyone working in this field who is not on their mailing list and reading their papers is, in my view, missing something important.
More as the year develops. The next two posts are likely to be about the IIS situation that is brewing — Microsoft has been notifying customers privately about a serious vulnerability that will be disclosed publicly in the next week or two.