Bagle and MyDoom: a new mass-mailing scale

Two mass-mailing worms in two weeks. Bagle appeared on 18 January 2004; MyDoom on 26 January. By the end of January, MyDoom had become, by a substantial margin, the largest mass-mailing event in internet history. At its peak it was generating an estimated one in twelve emails globally, well beyond the previous record holder, Sobig.F, at one in seventeen.

This is going to be a longer post than my recent ones. The incident is large enough that the careful walk-through is justified, and the structural lessons are large enough that they deserve their own framing.

Bagle, briefly

Let me cover Bagle first, since it set the stage. Bagle appeared on 18 January, propagating through email with an executable attachment. The mechanism is the standard mass-mailing-worm pattern I have written about for years — the Melissa lineage, the ILOVEYOU template, the Klez auto-execution refinement. Bagle adds two notable features: it includes a backdoor on TCP port 6777, and it phones home to a controller for instructions.

Bagle's volume was substantial but not unprecedented. By the time MyDoom appeared a week later, Bagle was already a well-understood category-of-the-week. The worm was being filtered effectively by mature mail relays; the antivirus signatures had landed; the cleanup was proceeding.

Then MyDoom appeared.

What MyDoom does

MyDoom's core mechanisms:

Mass-mailing by email, with the worm in the attachment. The worm carries its own SMTP engine, so it does not depend on Outlook to send. The attachment names follow a convention designed to look ordinary: document.zip, report.txt.exe, and similar. The body of the email mimics a mail-delivery error message, which prompts the recipient to open the attachment to see what was supposedly bounced.

A backdoor on TCP port 3127. Once a host is compromised, the backdoor listens for commands from anyone who knows it is there. The backdoor is not authenticated; anyone with the IP address of a compromised host can issue commands.

Spreading through KaZaA file-sharing. The worm copies itself into the KaZaA shared folder under names that look like popular pirated software (WinAmp 5.0 (new).exe, RootkitXP.exe, similar). KaZaA users searching for these terms find and download the worm.

Address-book harvesting and document-scanning. Like SirCam, MyDoom scans documents on the host for email addresses to add to its propagation list. The contact graph is much larger than just the user's explicit address book.

Two scheduled DDoS attacks. MyDoom variants are scheduled to launch coordinated DDoS attacks against SCO Group (whose lawsuit against IBM over Linux had been controversial in the open-source community) and against Microsoft. The DDoS phase begins on specific dates after the initial outbreak.

The combination is operationally substantial. Each compromised host is a propagation source via four distinct mechanisms (email, KaZaA, the backdoor for remote command, and any subsequent attacker-driven activity through the backdoor). The compromised population becomes a strategic asset for whichever party first issues commands through the backdoors.

The volume

The scale is the part that surprised me. At peak, MyDoom-pattern mail was approximately 8% of all email globally, or roughly one in twelve messages traversing the internet.

For a sense of the magnitude: global email volume in 2004 is roughly 30 billion messages per day. Eight per cent of that is 2.4 billion MyDoom-pattern messages per day. Averaged over the day, the worm was generating, at peak, roughly 28,000 messages per second.
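As a quick check on these figures, here is a minimal sketch of the arithmetic. It uses only the two estimates quoted above (30 billion messages per day and an 8% share); the function name is illustrative, and the per-second figure is a day-averaged rate obtained by dividing by 86,400 seconds, not a measured instantaneous peak.

```python
# Back-of-envelope check of the volume figures; inputs are estimates.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def worm_volume(total_per_day, worm_share):
    """Return (worm messages per day, day-averaged messages per second, one-in-N ratio)."""
    per_day = total_per_day * worm_share
    per_second = per_day / SECONDS_PER_DAY
    one_in_n = 1 / worm_share
    return per_day, per_second, one_in_n

per_day, per_second, one_in_n = worm_volume(30e9, 0.08)
print(f"{per_day:,.0f} messages/day")
print(f"{per_second:,.0f} messages/second (day average)")
print(f"about one in {one_in_n:.1f} messages")
```

The day average comes out in the tens of thousands of messages per second. An 8% share is strictly one in 12.5, which rounds to the "one in twelve" headline figure.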

The mail relay I run for friends saw the volume directly. Daily inbound mail volume, which is normally 200-400 messages per day, peaked at over 5,000 messages during the worst of MyDoom — roughly 12-25 times the baseline. The aggressive filtering disciplines I deployed after ILOVEYOU caught essentially all of the MyDoom messages at the gateway; the recipients saw very little.

For operators without similar filtering, the operational impact was severe. Many corporate mail systems were simply unable to keep up with the inbound volume. Mail queues backed up; storage filled; legitimate mail was delayed by hours.

The DDoS phase

MyDoom variants schedule different DDoS attacks. The first variant, MyDoom.A, schedules a DDoS against SCO Group's website starting on 1 February. The DDoS uses the compromised hosts collectively — each infected host generates HTTP requests to the SCO website at a steady rate.

From the available reporting, the SCO website has been intermittently unavailable since the DDoS began. SCO has used various mitigations (moving the site to a different IP address, deploying upstream filtering), but how well these work depends on how a given variant finds its target. Variants that carry a fixed address keep sending traffic to whichever IP the DNS pointed to when they were created and do not adapt; variants that resolve the hostname at attack time chase the moving target.

The second main variant, MyDoom.B, schedules a DDoS against Microsoft's update services. This one has been more successfully mitigated by Microsoft, who have substantial infrastructure for absorbing DDoS traffic.

The DDoS phase is, in some sense, the intent of the worm. The propagation phase builds the attacker's capability; the DDoS phase exercises it. The author's choice of SCO as a target signals the political-economic motivation — SCO's lawsuit against IBM over Linux IP claims has been deeply unpopular in the open-source community. A worm that deliberately attacks SCO is ideologically motivated more than commercially motivated. This is not the commercial-cybercrime pattern I have been observing recently; it is a return to the older ideological-grievance pattern.

The backdoor and what happens to it

The TCP-3127 backdoor is the structural innovation worth dwelling on. Each compromised host listens on this port for incoming connections. The backdoor accepts unauthenticated commands and acts on them.

Within days of MyDoom's emergence, secondary worms began scanning specifically for the backdoor and using it. Doomjuice appeared in early February, scanning random IP addresses for hosts listening on port 3127, and using the backdoor to install itself.

This is the chain-compromise pattern at scale. MyDoom's compromised population is now substrate for follow-on attacks. Doomjuice is one such follow-on; there will be others.

The deeper observation: each new worm with persistence creates a substrate that subsequent attackers exploit. The compromised-host population is durable across worm-incidents in a way that earlier, non-persistent worms did not produce. The cumulative compromised pool is growing year-on-year.

The Bagle versus Netsky versus MyDoom war

The most novel feature of the early-2004 worm landscape is the war between competing worms. Three major worm families — Bagle, MyDoom, and the new Netsky — are now actively fighting each other for dominance of the compromised-host population.

Netsky appeared on 18 February and includes code to remove MyDoom and Bagle from infected hosts. The Netsky author appears to view Bagle and MyDoom as competition, presumably for the same monetisation opportunities.

Bagle has responded with new variants that detect and disable Netsky.

MyDoom variants have been observed with code that targets Netsky specifically.

Messages embedded in the various worms include direct insults aimed at competing authors. The competition is overt.

This is a category change. The compromised-host pool is now a contested resource among different malware operators. The economic infrastructure of cybercrime is operating openly. Compromised hosts are valuable; the value is large enough to justify direct combat between different operators for control.

Why the volume

The scale of MyDoom's volume — substantially greater than any previous mass-mailer — deserves examination.

A few specific factors made this possible.

First, the document-scanning improvement. Previous worms harvested email addresses primarily from the user's Outlook address book. MyDoom scans through documents on the host's filesystem looking for email-address patterns. A typical desktop has thousands of email addresses scattered across emails, web caches, documents, and software configurations. MyDoom finds and uses many more addresses than the address book alone.

Second, the social-engineering quality. The mail-delivery-error template is more credible than earlier templates (ILOVEYOU was an obvious lure; AnnaKournikova traded on a specific curiosity; MyDoom looks like ordinary system noise). Recipients are more likely to open the attachment.

Third, the multi-vector propagation. Email plus KaZaA plus the backdoor means a single infection produces multiple propagation events. Each compromised host's contribution to the worm's growth is larger than for a single-vector worm.

Fourth, the timing. MyDoom appeared during a period when mail-borne malware was at a moderate but not extreme level. The defensive infrastructure was tuned for the previous baseline; the spike to 8% of global mail overwhelmed many filters that were sufficient for 2-3% volumes.

The combination of these produced the extraordinary volume.

The defensive lessons

From my own observation and from operator conversations, several lessons.

Mature mail-relay filtering caught the volume. Operators with filtering deployed after ILOVEYOU and refined through Klez had no significant problem. The disciplines were in place; the disciplines worked. This is encouraging — the cumulative defensive investment has produced infrastructure that can handle 10x volume spikes without operator intervention.

Operators without that filtering were overwhelmed. The long tail of operators who had not deployed serious mail filtering experienced substantial disruption. This is the residual problem — the cumulative defensive investment is concentrated at the operators who have invested; the rest are still exposed.

The KaZaA vector reveals new defensive needs. File-sharing networks are now part of the threat model in a way they were not before. Defending against MyDoom-via-KaZaA requires either blocking KaZaA (which many operators do not, because users will route around it) or scanning files retrieved through KaZaA before they execute (which most antivirus does, but not all).

The backdoor-as-substrate problem requires cleanup discipline. Hosts compromised by MyDoom and not properly cleaned up are vulnerable to Doomjuice and any subsequent worm that exploits the same backdoor. The cleanup question is now multi-layer: not just "is the original worm gone?" but "are all the secondary infections through the backdoor also gone?".

Address-book exposure is a structural concern. The document-scanning approach to address harvesting is going to become standard. Every email address that has been written down somewhere on a compromised host is a potential propagation target. The privacy implications are substantial; address-book contents are no longer the only at-risk data.
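To get a feel for how large that exposure is on a given machine, an operator can count the addresses sitting in ordinary files. The following is a minimal audit sketch, not a reconstruction of the worm's own harvesting code: the regular expression is deliberately loose, and the function names and the default list of file suffixes are my own assumptions.

```python
import re
from pathlib import Path

# Loose pattern: good enough for counting, not full address validation.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

def find_addresses(text):
    """Return the set of distinct, lower-cased addresses found in a string."""
    return {m.lower() for m in EMAIL_RE.findall(text)}

def audit_tree(root, suffixes=(".txt", ".htm", ".html")):
    """Map each matching file under root to its count of distinct addresses."""
    counts = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in suffixes:
            try:
                text = path.read_text(errors="ignore")
            except OSError:
                continue  # unreadable file: skip it
            found = find_addresses(text)
            if found:
                counts[str(path)] = len(found)
    return counts
```

Running audit_tree over a user's document folder and summing the counts gives a rough comparison against the size of the explicit address book.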

What operators should do

For mail relays, the standard advice intensifies:

Strip executable attachments by default. The list of dangerous extensions is now extensive but well-known. Apply it.

Apply current antivirus signatures. Multiple AV vendors have shipped MyDoom signatures within hours. Update mechanisms must be aggressive.

Watch outbound mail volume per source. A compromised internal host generates a characteristic surge of outbound mail. Detection at the relay level catches this even when host-level antivirus has missed the infection.

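Block port 3127 at perimeters. Few, if any, services in common use need inbound connections on this port. Blocking it removes the backdoor's exposure to the outside world, even for already-compromised hosts; it does not stop one internal host from reaching another's backdoor.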
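The first item in this list, stripping executable attachments, can be sketched with Python's standard email package. This is an illustration under assumptions rather than a description of any particular relay's filter: the extension list is a short, non-exhaustive sample, and the function name is hypothetical.

```python
from email.message import Message

# Illustrative sample only: extend to match local policy.
BLOCKED_EXTENSIONS = {".exe", ".scr", ".pif", ".com", ".bat", ".cmd", ".vbs"}

def blocked_attachments(msg: Message):
    """Return filenames of attachments whose final extension is on the block list."""
    hits = []
    for part in msg.walk():
        name = part.get_filename()
        if not name:
            continue  # not an attachment, or no filename given
        lowered = name.lower().strip()
        dot = lowered.rfind(".")
        # Test the LAST extension, so report.txt.exe counts as .exe.
        if dot != -1 and lowered[dot:] in BLOCKED_EXTENSIONS:
            hits.append(name)
    return hits
```

Checking the final extension is what catches double-extension names such as report.txt.exe. It does not look inside archives, so a name like document.zip passes this check and needs a separate policy (blocking archives, or unpacking and scanning them).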

For desktop infrastructure:

Patch Outlook with current security updates. The auto-execution vector that Klez exploited has been mostly fixed in current versions; older versions remain vulnerable.

Disable file-sharing software where possible. KaZaA and similar are not corporate tools; their use on corporate networks should be policy-managed.

Educate users about the mail-delivery-error template. This template is going to be reused by future worms; users who recognise it as a social-engineering lure are less likely to open the attachment.

For cleanup of compromised hosts:

Full reinstall is the safe default. Cleanup-in-place can miss secondary infections through the backdoor. Reinstall from clean media, restore data from known-good backups, apply current patches before reconnecting to the network.

At minimum, scan for and remove the backdoor. Even if cleanup-in-place is chosen, removal of the port-3127 backdoor must be confirmed. Attempting a fresh connection to the port from outside the host is a useful verification.
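The outside-connection test can be as simple as a TCP connect attempt. A minimal sketch, assuming the backdoor is on port 3127 as described above; the function name and the timeout are my own choices. Run it only against hosts you administer, from a machine outside the one being checked.

```python
import socket

def port_accepts_connections(host, port=3127, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or the name did not resolve.
        return False
```

For example, port_accepts_connections("192.0.2.10") checks a placeholder documentation address. A True result on a supposedly cleaned host means something is still listening and the cleanup is not finished. A False result is weaker evidence: a firewall in the path produces the same answer, so it does not by itself prove the host is clean.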

What this teaches structurally

Three generalisations.

The mass-mailing category continues to scale. Each successive worm in this category exceeds the previous one's volume. The trajectory has not yet stopped. Future worms in the same family will exceed MyDoom; the structural defences need to handle volumes beyond what current peaks suggest.

Worm-on-worm conflict is now real. The Bagle/MyDoom/Netsky war is the first publicly visible instance. The conflict is over the compromised-host population as a strategic resource. Future worm authors will assume that any compromise produces a host that may be contested.

Intent is increasingly explicit. MyDoom's DDoS targets reveal intent: the worm exists to attack specific entities. Earlier worms had been more about capability than intent. The shift toward intent means that worms are now operational tools for specific objectives, not just self-propagating curiosities.

The personal and operational dimensions

From my own perspective, the MyDoom incident has been the busiest mail-relay week of the four years I have been running it. The filtering caught everything; the cleanup was zero; the operational cost was small. The investment in the filtering architecture post-ILOVEYOU has paid back substantially during this incident.

For friends and small organisations I help, the calls have been more involved. Three different organisations have called this week with MyDoom-related issues; two were handled with patching and signature updates; one required full reinstall of three desktop machines. The cumulative time investment has been around 20 hours of my evenings and weekends.

For the broader security community, the MyDoom incident is producing a wave of operational discussions. The mailing-list traffic is high; the conference circuit will be dominated by retrospectives in the coming months. The structural changes I have been advocating — better defaults, smarter filtering, more rigorous cleanup — are getting more attention than usual.

For my predictions calibration, MyDoom resolves prediction 1 affirmatively. The 80% probability was right; the timing was earlier than my central estimate.

What I expect over the rest of 2004

A few specific predictions, marked with probabilities:

The Bagle/MyDoom/Netsky war continues for at least three more months. 85% probability. The economic motivation is large; the authors are productive; the conflict is structural.

At least one more mass-mailing variant exceeds 5% of global mail. 75% probability. The pattern of each new variant exceeding the previous suggests the trajectory has not yet plateaued.

A worm using KaZaA-style propagation as its primary vector. 60% probability. MyDoom uses it as a secondary vector; primary use is the natural progression.

A worm specifically targeting the MyDoom backdoor with destructive payload. 55% probability. The Witty pattern of destructive payloads may combine with the chain-compromise opportunity.
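When these resolve, one common way to score probability-marked predictions is the Brier score: the mean squared difference between each stated probability and the 0/1 outcome. A minimal sketch follows. The probabilities are the four stated above; the outcomes in the example are placeholders, since none of these predictions has resolved at the time of writing.

```python
def brier_score(forecasts):
    """Mean squared error between stated probabilities and 0/1 outcomes (lower is better)."""
    if not forecasts:
        raise ValueError("need at least one resolved forecast")
    total = sum((p - (1.0 if happened else 0.0)) ** 2 for p, happened in forecasts)
    return total / len(forecasts)

# (probability, outcome) pairs; the outcomes here are HYPOTHETICAL placeholders.
example = [(0.85, True), (0.75, True), (0.60, False), (0.55, False)]
print(round(brier_score(example), 4))
```

A forecaster who always says 50% scores 0.25, so anything consistently below that carries some information.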

More as the year develops. Writing this on Friday evening at the end of a long week; the next post will probably be off-cadence depending on whether anything new appears.

A closing reflection

The MyDoom incident is, on the available evidence, the most operationally significant single event of 2004 to date. The volume is the headline; the structural innovations (chain-compromise via backdoor, worm-on-worm war, document-scanning address harvesting) are the longer-term concerns.

For the operators who absorbed the impact through prepared infrastructure, the cost was modest. For those without, the cost was substantial. The gap between prepared and unprepared operators is widening; the defensive infrastructure is uneven across the operator population.

The specific worm will eventually be defeated by signature distribution, patch deployment, and cleanup. The structural lessons — the speed of evolution, the economic motivation, the multi-vector propagation — are with us for years.

A note on what this means for the coming year

The category-defining feature of the early-2004 worm landscape is velocity of variation. Each new variant of MyDoom and its competitors appears within days of the previous one. The signature-update cycle, even when working as designed, lags variant production by hours. This is the signature-detection arms race intensifying; the defensive response has to shift toward behaviour-based detection that can catch new variants without specific signatures.

I have been writing about behavioural detection for some time without much practical implementation. The MyDoom landscape may be the forcing function that pushes operators (myself included) into more serious deployment. Outbound mail volume per source, unusual file-system modifications, characteristic process-launch patterns — all of these are catchable without specific signatures. The infrastructure is there; the operational discipline of using it is uneven.
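Of these signals, outbound mail volume per source is the easiest to sketch. The following is a minimal sliding-window counter, assuming the relay can hand over one (source, timestamp) event per outbound message and that events for a given source arrive in time order. The class name, the threshold, and the window length are illustrative choices, not tuned values.

```python
from collections import defaultdict, deque

class OutboundRateMonitor:
    """Flag sources that send more than `threshold` messages within `window` seconds."""

    def __init__(self, threshold=50, window=60.0):
        self.threshold = threshold
        self.window = window
        self._events = defaultdict(deque)  # source -> timestamps inside the window

    def record(self, source, timestamp):
        """Record one outbound message; return True if the source is over threshold."""
        q = self._events[source]
        q.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while q and q[0] <= timestamp - self.window:
            q.popleft()
        return len(q) > self.threshold
```

Fed from the relay's log, a desktop that normally sends a handful of messages an hour and suddenly trips the threshold is worth a look, whichever worm is responsible. The threshold has to sit above legitimate bulk senders such as mailing-list hosts, which may need their own limits.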

The next several months will probably see this maturation accelerate. By the end of 2004, I expect the operator population that runs serious behavioural detection will have grown noticeably. The signature-only operators will continue to be hit harder by each successive variant; the behaviour-aware operators will absorb the variants without specific incident response. The defensive gap is widening; the question is which side of the gap the operator population settles on over time.

For my own setup, the next concrete step is to expand my structured-log analysis to catch worm-like behaviour patterns regardless of which specific worm produces them. That is a project for the next month or two; the writeup will follow when the deployment is operational.