Honeypot range expansion: from one IP to a /28

Now that the Honeynet Project's tooling has become more usable, I have expanded my honeypot deployment from a single IP to a small range using Honeyd. This was prediction 19 in my 2001 list, and it was done by the end of March, ahead of the June deadline.

The setup

I have, through arrangement with my ISP, a small block of 14 unused public IPs (a /28 minus my own real allocations). Honeyd, running on a single physical host, presents a different virtual host on each of those IPs. From outside, the range looks like 14 distinct hosts, each with its own services and behaviour.
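As a quick check on the address arithmetic, any /28 holds 16 addresses, of which 14 are usable host addresses. The sketch below uses the documentation prefix 192.0.2.0/28 as a stand-in; it is an assumption for illustration, not my real block.

```python
import ipaddress

# 192.0.2.0/28 is a documentation prefix (RFC 5737), used here as a placeholder.
block = ipaddress.ip_network("192.0.2.0/28")

total = block.num_addresses      # every address in the prefix
usable = list(block.hosts())     # excludes the network and broadcast addresses

print(total, len(usable))        # 16 14
print(usable[0], usable[-1])     # 192.0.2.1 192.0.2.14
```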

Honeyd is configured with several persona templates:

A Linux web server with a small set of static pages and a simulated CGI directory.
A Windows file server that responds to NetBIOS queries and returns plausible share lists.
An old Solaris box with several services from a bygone era — telnet, finger, RPC.
A bare "router" that responds to SNMP queries with plausible router-style values.
A Windows NT desktop with default services.
A Cisco router that emulates an IOS administrative interface.

Each persona has TCP/IP stack characteristics matched to the OS being emulated, so nmap fingerprinting returns the right OS for each. The realism is, on first inspection, convincing.

What the range shows

Three months of single-IP data showed a particular pattern of scanner activity. The /28 expansion has revealed how coordinated the scanning actually is.

When a scanner hits one of my IPs, it almost always also hits adjacent IPs in my range, often within seconds: sometimes from the same source IP, sometimes from different sources that are clearly part of the same scanning campaign. Coverage of the range falls into three patterns:

Linear sweeps. Some scanners go through the range in IP order, hitting every address. These are typically simple scanners.

Random sampling. Other scanners pick a few addresses at random from the range, then move on. These are typically more sophisticated.

Campaign coordination. Some IPs are hit by multiple sources within minutes, where each source hits a different subset. This is the coordinated scanning pattern I observed earlier — confirmed at finer resolution with the range deployment.
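The three patterns above can be told apart mechanically. What follows is a minimal sketch, assuming each hit has already been reduced to a hypothetical (timestamp, source, host index) tuple, where the host index is the address's position in the range. The function names, the 300-second window and the sample events are all illustrative. A partial sweep in IP order and a true random sample both land in the same bucket here, so this is a first-pass filter rather than a classifier to trust.

```python
from collections import defaultdict
from itertools import combinations

RANGE_SIZE = 14  # honeypot addresses in the range

def classify_source(hits, range_size=RANGE_SIZE):
    """Label one source from the host indexes it hit, in arrival order."""
    distinct = set(hits)
    ordered = hits == sorted(hits) or hits == sorted(hits, reverse=True)
    if len(distinct) == range_size and ordered:
        return "linear sweep"
    if len(distinct) < range_size:
        return "partial sample"
    return "full, unordered"

def candidate_campaigns(events, window=300):
    """Pair sources that start within `window` seconds and hit different partial subsets."""
    first_seen = {}
    targets = defaultdict(list)
    for ts, src, host in sorted(events):
        first_seen.setdefault(src, ts)
        targets[src].append(host)
    pairs = []
    for a, b in combinations(sorted(targets), 2):
        close = abs(first_seen[a] - first_seen[b]) <= window
        partial = all(classify_source(targets[s]) == "partial sample" for s in (a, b))
        if close and partial and set(targets[a]) != set(targets[b]):
            pairs.append((a, b))
    return pairs

# Illustrative events only: one full sweep, then two sources sampling a minute apart.
events = [(t, "198.51.100.7", t) for t in range(RANGE_SIZE)]
events += [(1000, "203.0.113.5", 3), (1002, "203.0.113.5", 9)]
events += [(1060, "203.0.113.9", 4), (1061, "203.0.113.9", 11)]

sweep_label = classify_source(list(range(RANGE_SIZE)))
campaigns = candidate_campaigns(events)
```

On the sample events, the first source is labelled a linear sweep and the two later sources are paired as a candidate campaign. Real data would need a human look before calling anything coordinated.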

What the persona variation shows

Different personas attract different attention.

The Linux web server gets the most attention: roughly twice the scan volume of the average IP in the range. Web vulnerabilities are widely scanned for, and the pool of potential targets is large.

The Windows desktop gets credible Sub7-style probes. Specific port-probe sequences that I associate with trojans show up against the Windows persona but not against the Linux one. The attackers appear to be picking targets based on the OS fingerprint.

The Solaris persona gets probed for older vulnerabilities, specifically in the RPC services, along with old telnet exploits that were patched on modern systems years ago. The attackers running these probes are presumably maintaining databases of legacy systems.

The Cisco persona gets specific router-targeted probes. SNMP community-string brute-force; specific known-default credentials; HTTP server exploitation attempts. The fingerprinting matters; the attackers are targeting based on what they think they are looking at.

This is the data the single-IP setup could not produce. A convincing OS fingerprint attracts attention from attackers who care about the target type.
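A figure like "twice the average" comes from dividing each address's hit count by the mean across the range. A minimal sketch, assuming the per-persona counts have already been tallied; the persona labels and numbers below are made up for illustration and are not my measurements.

```python
def relative_volume(hits_by_persona):
    """Return each persona's hit count divided by the mean count across all personas."""
    mean = sum(hits_by_persona.values()) / len(hits_by_persona)
    return {name: count / mean for name, count in hits_by_persona.items()}

# Illustrative counts only, chosen so the arithmetic is easy to follow.
sample_counts = {"linux-web": 200, "windows-desktop": 100, "solaris": 50, "cisco": 50}
ratios = relative_volume(sample_counts)
print(ratios["linux-web"])  # 2.0
```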

Operational notes

Honeyd has been reliable: three weeks of operation with no crashes and no measurable performance issues on modest hardware.

Logging is comprehensive. Honeyd writes detailed logs of every interaction. The volume is substantial (about 10MB per day at my scan rate) but straightforward to archive.
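For archive planning, the daily figure extrapolates linearly. This is a rough sketch that assumes the scan rate, and therefore the log volume, stays constant, which it may well not.

```python
MB_PER_DAY = 10  # approximate daily log volume quoted above

def projected_mb(days, mb_per_day=MB_PER_DAY):
    """Project archive size in MB, assuming the daily volume stays constant."""
    return days * mb_per_day

three_weeks = projected_mb(21)
one_year = projected_mb(365)
print(three_weeks, one_year)  # 210 3650
```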

Integration with my Snort sensor was straightforward. The existing Snort sensor sees the inbound traffic to all 14 IPs; the alerts are tagged by destination IP, so I can correlate Snort alerts with Honeyd interactions cleanly.
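The correlation itself is a join on destination IP plus a time overlap. Below is a minimal sketch, assuming both sources have already been parsed into simple tuples; the tuple layouts, the signature names and the 5-second slack are hypothetical and do not reflect the actual Snort or Honeyd log formats.

```python
from collections import defaultdict

def correlate(alerts, sessions, slack=5):
    """Match alerts to sessions that share a destination IP and overlap in time.

    alerts:   list of (timestamp, dst_ip, signature)
    sessions: list of (start, end, dst_ip, src_ip)
    slack:    seconds of tolerance for clock skew between the two logs
    """
    by_dst = defaultdict(list)
    for start, end, dst, src in sessions:
        by_dst[dst].append((start, end, src))
    matches = []
    for ts, dst, sig in alerts:
        for start, end, src in by_dst.get(dst, []):
            if start - slack <= ts <= end + slack:
                matches.append((sig, dst, src))
    return matches

# Illustrative records using documentation addresses.
sessions = [
    (100, 130, "192.0.2.3", "198.51.100.7"),
    (500, 520, "192.0.2.5", "203.0.113.5"),
]
alerts = [
    (105, "192.0.2.3", "example-signature-a"),
    (900, "192.0.2.5", "example-signature-b"),
]
matched = correlate(alerts, sessions)
```

Only the first alert matches here: the second shares a destination IP with a session but falls well outside its time span.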

The compute requirements are modest. A 200MHz Pentium with 64MB of RAM handles the load comfortably. Honeyd is not the bottleneck.

What I want to do next

Three extensions on the list.

Add high-interaction honeypots behind some of the personas. Currently every persona is a Honeyd-emulated low-interaction service. Adding my real high-interaction setup behind specific personas would let me capture both the breadth of scanning and the depth of post-compromise activity on the more interesting probes.

Deploy Sebek on the high-interaction hosts. The kernel-level keystroke capture will catch attacker activity even if they install rootkits.

Contribute the data to the Honeynet Project. Sanitised captures from my range, in the standard format, are exactly the kind of data the cumulative analysis paper needs.

More as the data accumulates.