Honeypot summary 2005 H1 · Peter Bassill

H1 2005 was relatively quiet for the honeypot range. The continuing mass-mailing wars (MyDoom, Bagle and Netsky variants) still produce volume; new categories of attack have been slow to appear.

The most interesting captures were phishing-related — increased traffic suggesting credential-targeting activity rather than pure malware propagation. This post is a longer summary than my recent ones; the patterns are worth careful description.

The volume

In rough numbers comparing H1 2005 with H1 2004:

Total connection attempts: roughly 20% increase year over year.
Distinct source IPs per month: similar to 2004.
Mass-mailing attempts: roughly 30% increase, dominated by Bagle and MyDoom variants.
SSH brute-force attempts: roughly 50% increase. This is the largest year-over-year change.
Sebek captures of human-attacker activity: 8 sessions in H1 2005, similar to historical baseline.

Overall volume continues to grow. The mix is shifting away from pure worm propagation and towards credential-targeting activity.

The mass-mailing baseline

The Bagle/MyDoom/Netsky variants continue to produce substantial mail volume. By my measurement, mail-borne worm propagation attempts represent approximately 8% of inbound mail volume on my relay — down from peak 2004 levels but still substantial.

The mix of variants is uneven from week to week. Some weeks see specific Bagle variants surge; other weeks see MyDoom-related traffic dominate. The cumulative volume is consistent; the specific composition fluctuates.

The filtering disciplines I deployed in 2000 continue to catch essentially all of this volume at the gateway. The recipients see very little. The operational cost is bounded; the catch rate is high.

For operators without similar filtering: the residual mass-mailing traffic represents a substantial fraction of inbound mail. The cleanup workload, while not as dramatic as during peak worm windows, is continuous.

SSH brute-force growth

The most operationally interesting trend in H1 2005 has been the growth of SSH brute-force traffic. The volume is up roughly 50% on H1 2004, the largest year-over-year change in the figures above; the patterns have also become more sophisticated.

Three specific patterns visible:

Distributed brute-force. Many sources, each making a small number of attempts against the same destination. The aggregate is a substantial brute-force attempt; per-source rate is below typical detection thresholds. This pattern is structurally similar to the coordinated scanning I wrote about — multiple compromised hosts coordinating to make small per-source contributions to a large attack.

Credential-list-based attacks. Earlier brute-force attempts used dictionary attacks (testing common words and known-bad passwords). Recent attempts use credential lists — specific username-password combinations that have been harvested from elsewhere. The attempts succeed when targets reuse credentials across services, which many users do.

Targeted brute-force against specific users. Some attempts focus on specific usernames (often common system accounts: root, admin, oracle, postgres) with carefully-tuned password lists. The targeting suggests the attackers have done some prior reconnaissance on the host.
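The distributed pattern is the one that per-source thresholds miss, so it is worth sketching how it can be spotted in aggregate. What follows is a minimal illustration, not my actual pipeline: the function name, the (timestamp, source, destination) record shape and the thresholds (one-hour window, 20 sources, 5 attempts per source) are all assumptions chosen for the example.

```python
from collections import defaultdict


def find_distributed_bruteforce(events, window=3600, min_sources=20, max_per_source=5):
    """Flag destinations hit by many low-rate sources within one time window.

    events: iterable of (timestamp_seconds, src_ip, dst_ip) failed-login records.
    Returns {(dst_ip, window_start): (distinct_quiet_sources, their_total_attempts)}.
    All thresholds are illustrative assumptions, not tuned values.
    """
    # (destination, window start) -> source -> number of attempts
    buckets = defaultdict(lambda: defaultdict(int))
    for ts, src, dst in events:
        window_start = int(ts) // window * window
        buckets[(dst, window_start)][src] += 1

    flagged = {}
    for key, per_source in buckets.items():
        # Keep only sources that stayed under the per-source limit; a single
        # noisy source is an ordinary brute-force and is caught elsewhere.
        quiet = {src: n for src, n in per_source.items() if n <= max_per_source}
        if len(quiet) >= min_sources:
            flagged[key] = (len(quiet), sum(quiet.values()))
    return flagged
```

The design choice is to count distinct quiet sources per destination rather than attempts per source: that is the only view in which a campaign of this shape becomes visible at all.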

The defensive response has three parts: rate-limiting at the firewall (handled by my iptables rules using the recent match), key-based authentication only (which I adopted in 2000), and monitoring for the specific brute-force patterns.
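For readers who have not used per-source rate-limiting, the idea can be modelled in a few lines. This is a simplified sketch in the spirit of the recent match's --seconds and --hitcount options, not a reproduction of its exact semantics and not my rule set; the class name and the 60-second, 4-hit values are assumptions for illustration.

```python
from collections import defaultdict, deque


class RecentLimiter:
    """Simplified per-source sliding-window limiter (illustrative model only)."""

    def __init__(self, seconds=60, hitcount=4):
        self.seconds = seconds
        self.hitcount = hitcount
        self.seen = defaultdict(deque)  # source address -> recent attempt times

    def allow(self, src, now):
        hits = self.seen[src]
        # Forget attempts that have aged out of the window.
        while hits and now - hits[0] >= self.seconds:
            hits.popleft()
        # Record every attempt, including refused ones, so a source that
        # keeps hammering stays blocked until it goes quiet.
        hits.append(now)
        return len(hits) < self.hitcount


limiter = RecentLimiter(seconds=60, hitcount=4)
decisions = [limiter.allow("198.51.100.7", t) for t in (0, 10, 20, 30, 40)]
print(decisions)  # [True, True, True, False, False]
```

Note that a limiter of this kind does nothing against the distributed pattern, where each source stays under the threshold; that is why it is paired with key-based authentication and pattern monitoring.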

Phishing-related captures

The most interesting honeypot captures of H1 2005 have been related to phishing infrastructure rather than to traditional worm activity.

Specific captures:

Reconnaissance for phishing-friendly hosting. Several distinct scan campaigns appear to be looking for hosts with specific characteristics — fast network connections, weak SSH credentials, accessible web servers. These are presumably reconnaissance for compromising hosts that can be used to host phishing pages. The compromise-and-reuse pattern is now operational.

Compromise attempts followed by phishing-page deployment. One Sebek capture showed an attacker who, on gaining shell access, immediately attempted to deploy a phishing page targeting a UK bank. The deployment failed because of my outbound filtering blocking the necessary external resources, but the intent was clear. This is the chain-compromise pattern extended to phishing — compromise host, deploy phishing infrastructure, extract value through subsequent fraud.

Credential-harvesting from compromised hosts. Two captures showed attackers running scripts to harvest credentials from compromised hosts. The credentials sought were varied: SSH keys, mail-server passwords, database credentials, browser-stored credentials. The harvesting is automated and thorough.

These captures suggest the threat actor population is increasingly integrated — compromise activity, phishing operations, credential harvesting, and follow-on fraud are connected aspects of the same commercial-cybercrime infrastructure.

What this teaches

Three observations.

The threat infrastructure continues to professionalise. The integration between compromise activity and phishing operations is novel. The credential-harvesting techniques are mature. The patterns suggest operational connections that did not exist a few years ago.

The defensive disciplines remain effective. Outbound filtering continues to disrupt most attack workflows. The captured attempts that failed because of outbound restrictions confirm the pattern I have been documenting for years.

The category mix continues shifting. Pure worm propagation is becoming a smaller fraction of activity; targeted compromise and credential operations are becoming larger fractions. The threat model needs to keep adjusting.

What is in H2

Looking ahead to the rest of the year, three things I am watching for:

The next major Microsoft worm. Zotob in August was modest by historical standards. A larger one is overdue. The MS05-series advisories continue; one will produce a major worm before year-end.

Continued phishing-infrastructure development. The integrated phishing-and-compromise pattern will continue maturing. Specific incidents involving compromised hosts being used for phishing will become more common.

Possible new categories. The Sony BMG situation (in August/October, depending on exact disclosure timing) opens a new category — DRM-as-rootkit. Whether this becomes a broader trend depends on industry response.

A reflection on the quieter half-year

The relatively quiet H1 has been operationally welcome. The cumulative pressure of recent years has been substantial; a quieter half-year has allowed for some recovery.

This is, in some sense, evidence that the defensive infrastructure is working. The worms that previously would have produced substantial visible incidents are being absorbed by mature filtering, fast patching, and structural defences. The operators who have invested in defensive capability are seeing returns on the investment.

The operators who have not invested are still being hit — but the press coverage of incidents is more selective; the cumulative experience of the long-tail operators is not visible in the way the major incidents are.

The defensive maturity question remains the central operational question of the field.

What I am doing operationally

Three things active in H1 that will continue.

Honeypot range expansion. From /28 to /27. The expanded address space provides more breadth for observation; the operational cost is small.

Improved analysis pipeline. The structured-log database is being upgraded to handle the increased volume more efficiently. The migration is bounded.

More Sebek tooling deployment. Two additional high-interaction hosts behind different Honeyd personas. The cumulative observation is more meaningful with multiple high-interaction targets.
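As a quick check on what the move from /28 to /27 buys in address space, the arithmetic can be done with Python's standard ipaddress module. The prefix below is a documentation range used purely as a stand-in; it is not my honeypot range, and the helper name is made up for the example.

```python
import ipaddress


def expansion(old_cidr, new_prefix):
    """Return (old address count, new address count, newly added blocks)."""
    old = ipaddress.ip_network(old_cidr)
    new = old.supernet(new_prefix=new_prefix)
    # address_exclude yields the parts of the larger block not covered by the old one.
    added = list(new.address_exclude(old))
    return old.num_addresses, new.num_addresses, added


# 192.0.2.0/28 is a placeholder documentation prefix, not the real range.
print(expansion("192.0.2.0/28", 27))
```

The address count doubles from 16 to 32, with the additional /28 sitting alongside the original one.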

What I expect in H2

Three predictions:

Continued moderate-volume worm activity. No major event of Sasser or MyDoom scale; several smaller events. Probability: 75%, deadline end of 2005.

At least one phishing-related incident with major UK retail-banking impact. Probability: 65%, deadline end of 2005.

Increased SSH brute-force traffic against my range. Probability: 80%, deadline end of 2005.
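Predictions with explicit probabilities and deadlines can be scored once they resolve. One common way to do that is the Brier score (mean squared error between the stated probability and the outcome, lower is better). The sketch below is an illustration of the scoring only: the probabilities are the three above, and the outcomes are placeholders, not results.

```python
def brier_score(forecasts):
    """Mean squared error between forecast probability and outcome.

    forecasts: list of (probability, happened) pairs, happened being True/False.
    0.0 is a perfect score; always saying 50% scores 0.25.
    """
    if not forecasts:
        raise ValueError("need at least one resolved forecast")
    total = sum((p - (1.0 if happened else 0.0)) ** 2 for p, happened in forecasts)
    return total / len(forecasts)


# Probabilities from the three predictions; outcomes are PLACEHOLDERS to show
# the mechanics and must be replaced once the deadline has passed.
resolved = [
    (0.75, True),   # moderate-volume worm activity, no Sasser-scale event
    (0.65, False),  # phishing incident with major UK retail-banking impact
    (0.80, True),   # increased SSH brute-force traffic
]
print(round(brier_score(resolved), 3))
```

Scoring this way rewards confident predictions that come true and penalises confident ones that do not, which is the point of attaching a probability in the first place.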

More as H2 develops.

A closing observation

The seven years of accumulated honeypot data are starting to produce some genuinely interesting longitudinal patterns. The threat-mix evolution, the volume trends, the increasing professionalisation of the threat-actor population — all are visible across the cumulative archive.

For anyone considering starting a similar observation discipline: the value compounds with duration. A honeypot run for one year produces snapshot data; a honeypot run for seven years produces trend data; the trend data is meaningfully more valuable than seven separate snapshots.

A longer view of the H1 2005 patterns

Let me extend this honeypot summary with a deeper treatment of the patterns observed.

The relationship between worm-residual and new compromise

One specific pattern I have been tracking through 2005 is the relationship between worm residual traffic (scans from previously-compromised hosts continuing to propagate) and new compromise traffic (active reconnaissance for new targets).

The residual traffic from old worms is substantial. Code Red and Nimda residuals from 2001 continue. Slammer residuals from 2003 are visible. MyDoom residuals from 2004 are persistent. The cumulative residual traffic is the dominant component of inbound scan volume.

The new-compromise reconnaissance traffic is smaller in volume but more sophisticated. The reconnaissance is targeted; the patterns are deliberate; the scans are not the broad random scans of older worms.

The ratio between residual and new is approximately 80:20 by volume. The 20% new-compromise traffic is much more significant than the 80% residual traffic in terms of immediate threat.
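A split of this kind depends on some way of labelling traffic as residual. A rough sketch of one approach is below, matching events against well-known worm signatures: Slammer on UDP 1434, Code Red requests for /default.ida, Nimda requests for cmd.exe or root.exe, and probes of the MyDoom backdoor on TCP 3127. The event format, the function names and the idea that everything unmatched counts as "other" are assumptions for illustration; unmatched traffic is not automatically new-compromise reconnaissance and needs further analysis.

```python
def _payload(event):
    return event.get("payload", "")


def classify(event):
    """Label one scan event as a known worm residual or 'other'.

    event: dict with 'proto', 'dport' and optionally 'payload' (hypothetical format).
    """
    proto, dport = event["proto"], event["dport"]
    if proto == "udp" and dport == 1434:
        return "slammer"
    if proto == "tcp" and dport == 80 and "/default.ida" in _payload(event):
        return "codered"
    if proto == "tcp" and dport == 80 and (
        "cmd.exe" in _payload(event) or "root.exe" in _payload(event)
    ):
        return "nimda"
    if proto == "tcp" and dport == 3127:
        return "mydoom-backdoor"
    return "other"


def residual_share(events):
    """Fraction of events matching a known residual signature."""
    events = list(events)
    if not events:
        return 0.0
    residual = sum(1 for e in events if classify(e) != "other")
    return residual / len(events)
```

Signature matching of this sort is crude, but it is cheap enough to run over every event, which leaves analyst time for the unmatched remainder.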

What the new-compromise reconnaissance looks like

Four specific patterns visible in H1 2005 captures:

Vulnerability-specific scans. Scans for very specific port-and-protocol combinations that match recently-disclosed vulnerabilities. The scanner has read the advisory and is looking for matching deployments. The window between advisory and scan is now days, not weeks.

Service-version-specific scans. Scans that probe for specific service banners (specific versions of specific software). The scanner is looking for vulnerable versions specifically; non-matching versions are skipped. This is more sophisticated than older scanning.

Credential-style scans. Scans that probe authentication endpoints specifically — SSH, FTP, web admin interfaces, mail servers. The intent is credential discovery rather than vulnerability exploitation. The volume is substantial and growing.

Phishing-host reconnaissance. Scans that look for specific characteristics — fast network connections, weak authentication, accessible web servers. The intent is to identify hosts suitable for hosting phishing pages. This is a new pattern in 2005.

Each of these patterns is small in absolute volume but represents specific operational intent. The defensive responses are different per pattern.
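The advisory-to-scan window mentioned under vulnerability-specific scans is straightforward to measure if the first matching scan per advisory is recorded. A minimal sketch follows; the function name, the advisory identifiers and the dates are invented for illustration and are not drawn from my captures.

```python
from datetime import date


def advisory_lags(advisories, first_scans):
    """Days from advisory publication to the first matching scan.

    advisories: {advisory_id: publication date}
    first_scans: {advisory_id: date of first matching scan seen}
    Advisories with no matching scan yet map to None.
    """
    lags = {}
    for adv_id, published in advisories.items():
        seen = first_scans.get(adv_id)
        lags[adv_id] = (seen - published).days if seen is not None else None
    return lags


# Hypothetical identifiers and dates, purely to show the calculation.
advisories = {"ADV-A": date(2005, 3, 1), "ADV-B": date(2005, 4, 12)}
first_scans = {"ADV-A": date(2005, 3, 4)}
print(advisory_lags(advisories, first_scans))
```

Tracked over time, the distribution of these lags is what would show the shift from weeks to days.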

The quiet H1 in context

The relatively quiet H1 has been useful for several reasons.

Recovery from 2003-2004 cumulative pressure. The burnout pattern I wrote about has been visible in many operators; a quieter half-year has allowed for some recovery.

Time for structural improvements. Operators who have been deferring structural improvements (segmentation, monitoring, forensic readiness) have had time to deploy them.

Calibration of defensive infrastructure. The defensive infrastructure that was tuned for the 2001-2004 baseline has been somewhat over-provisioned for 2005's actual volume. Operators have been able to refine tuning without crisis pressure.

Better understanding of trends. The relative quiet has made the ongoing structural shifts (commercial-cybercrime, phishing, DDoS-for-hire) more visible against a less-noisy background.

Whether the quieter trajectory continues into 2006 is unclear. The structural conditions for major worm events remain (vulnerable populations, sophisticated tooling, economic incentive); whether they are realised depends on specific decisions by specific attackers.

What this implies for H2 planning

For my own infrastructure: the H1 quiet has allowed me to complete the honeypot range expansion to /27 without competing with major-incident response work. The expansion is operational; the additional captures will accumulate over H2.

For the friends and small organisations I help: the quieter half-year has been a chance to address backlog. Several friends have completed migrations they had been deferring; one has finally deployed structured logging.

For my own writing: H1 has produced more posts about structural and reflective topics than incident-specific ones. The change of mix has been welcome.

What might break the quiet

Three things that, if they happened, would substantially shift H2 from the H1 baseline:

A new Windows-targeted worm. Specifically a worm targeting a recently-disclosed Microsoft vulnerability with a large vulnerable population. The MS05-039 advisory exposed the PnP vulnerability; the resulting Zotob worm was modest, but a more capable variant could be larger.

A new mass-mailing variant with novel propagation. The Bagle/MyDoom/Netsky variants continue but at modest volume. A new variant with structurally-novel propagation could spike volume substantially.

A specific high-profile DDoS-for-hire incident. The category exists; a public incident with substantial reputational impact would shift the conversation significantly.

None of these is certain; all are plausible; the trajectory is uncertain.

A final reflection on the H1 patterns

It is worth reflecting more broadly on what the H1 2005 quiet has meant.

The quieter operational tempo has been a chance to absorb the cumulative learning from 2001-2004. The pattern of those years was a series of major incidents with too little recovery time between them; H1 2005 has provided more.

This matters for sustained practice. The cumulative cost of 2001-2004 was substantial; the recovery time has allowed for some genuine resilience-building. Specific operators I correspond with have used the time for projects they could not address during the busier years.

The risk of the quiet period: complacency. Operators who experienced the busy years are unlikely to forget the lessons; operators who came into the field during the quiet may not have the same operational instincts.

For anyone earlier in their career who is reading this: the patterns from the busy years are, on the available evidence, going to recur. The structural conditions remain. The disciplines that matter — fast patching, segmentation, monitoring, forensic readiness — are not optional even in the quiet periods.

The right operational posture during quiet periods is to invest in the structural disciplines, not to relax them. The quiet ends; the disciplines persist.

A note on the structural progression

The H1 2005 patterns also show a structural progression that is worth a brief reflection.

The specific shifts visible across years are:

Worm volume relative to non-worm volume. Worms were the dominant component of malicious traffic in 2001-2004; they are a smaller fraction in 2005. Other categories (credential targeting, reconnaissance, phishing infrastructure) have grown.

Sophistication of automation. Earlier automated traffic was crude; current automated traffic is sophisticated. Scanning patterns are well targeted; exploit attempts are precisely tuned.

Persistence of compromise. Earlier compromises produced short-lived effects (the worm passed through and left). Current compromises produce long-lived effects (the host becomes part of an ongoing infrastructure).

Economic vs. ideological motivation. Earlier attackers were largely ideological or curious; current attackers are largely economic. The shift produces predictable changes in tactics, targeting, and persistence.

The cumulative trajectory points toward continued professionalisation of the threat infrastructure. The defensive infrastructure is improving but more slowly. The gap continues to be the central operational concern.

For my own writing: more posts about the structural progression. The cumulative observation is producing patterns that individual incident write-ups miss.

A short note on H2 planning

For my own H2 planning, three specific projects:

Continue the honeypot range expansion. The /27 deployment is operational; the additional captures will accumulate; the cross-persona observation will be more meaningful with sustained operation.

Document the structured-log database upgrade. The pipeline improvements have been substantial; writing them up may be useful for other operators considering similar work.

Plan the consulting transition. The role change in November will require some operational adjustment; planning ahead reduces transition friction.

The rest of H2 will be reactive — incidents will happen and produce writing; structural shifts will continue to be visible; the cumulative reading and reflection will continue.

For anyone reading this: H2 will probably be more eventful than H1 has been. The structural conditions for major incidents remain; the timing is uncertain. The defensive disciplines continue.

A small additional note

For anyone tracking the H1 honeypot data alongside their own observations: the patterns in this writeup are roughly consistent across the operator population I correspond with. Specific details will vary by deployment; the structural patterns are reasonably general. If your own observations diverge meaningfully from the patterns described here, that divergence is itself informative — it suggests something specific about your own threat profile.