Sasser: another RPC worm

Sasser appeared on 1 May 2004 and reached operational saturation within days. The worm exploits a buffer overflow in the Windows LSASS (Local Security Authority Subsystem Service) on TCP port 445, patched in Microsoft Security Bulletin MS04-011 in mid-April. The pattern is by now familiar — vulnerability disclosed and patched, gap of weeks, worm exploits the gap.

The Sasser incident is, in many ways, a continuation of the Blaster pattern from August 2003. Same shape: Microsoft RPC-style vulnerability, similar propagation mechanism, similar operational impact. The structural lessons are also similar; the specific differences are worth writing about.

What Sasser does

The technical mechanism: Sasser exploits a buffer overflow in LSASS, the Windows component that handles local security policy and authentication. The exploit is delivered over TCP port 445, which Windows uses for SMB and various Microsoft networking services. The exploit gives the attacker code execution as the LSASS process — which runs with SYSTEM privileges, the highest privilege level in Windows. Delivery is two-stage: the overflow yields a remote command shell (on TCP port 9996), through which the victim is instructed to download the worm binary from a small FTP server the worm runs on TCP port 5554 of the attacking host.

Once a host is compromised, Sasser:

Installs itself to %WINDIR%\avserve.exe and adds a run key under HKLM\Software\Microsoft\Windows\CurrentVersion\Run so that it launches on boot. Standard persistence.

Begins scanning for other vulnerable hosts on port 445, using on the order of 128 parallel scanning threads. The aggregate rate is moderate, similar to Blaster: perhaps 100 attempts per second per host.

Causes repeated reboots. When the LSASS exploit is delivered to a vulnerable host, the host's LSASS service crashes. Windows treats this as a critical error and initiates a one-minute countdown to reboot. The host reboots; comes back up; gets re-exploited; reboots again. The cycle is enormously disruptive for users.

Does not include a DDoS payload. Unlike Blaster or Code Red, Sasser does not have a scheduled secondary attack. The author appears to have been satisfied with propagation alone.

The combination of aggressive propagation and the reboot cycle means Sasser is visibly disruptive. Users notice immediately when their machines are infected; the operational impact is concentrated and dramatic rather than subtle.
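The propagation dynamics behind "visibly disruptive within days" can be sketched with a simple susceptible-infected model. This is an illustration, not a measurement: the population size, contact rate, and step count below are arbitrary numbers I chose to show why even a moderate per-host scan rate saturates a vulnerable population quickly.

```python
def si_spread(population, initial_infected, contact_rate, steps):
    """Discrete-time susceptible-infected (SI) model of worm spread.

    Each step, infected hosts convert susceptible hosts in
    proportion to the fraction of the population still susceptible.
    Returns the infected count after each step.
    """
    infected = float(initial_infected)
    series = [infected]
    for _ in range(steps):
        susceptible = population - infected
        newly_infected = contact_rate * infected * susceptible / population
        infected = min(population, infected + newly_infected)
        series.append(infected)
    return series

# Arbitrary illustrative numbers: 100k vulnerable hosts, 10 seeds.
curve = si_spread(population=100_000, initial_infected=10,
                  contact_rate=0.8, steps=60)
```

The curve is the familiar S-shape: slow start, explosive middle, saturation. The defensive implication is the one above: the window for reacting closes long before the tail of the curve.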

The MS04-011 advisory

The LSASS vulnerability was disclosed by Microsoft on 13 April 2004. The advisory was clear about severity — remote code execution, no authentication required, default-vulnerable on most Windows versions. The patch was made available with the advisory.

The gap between disclosure and worm appearance was 18 days. This is shorter than the Blaster gap (26 days from MS03-026 to Blaster). The trend continues: each successive worm appears more quickly after its underlying advisory.
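The gap arithmetic, using the commonly cited dates (MS03-026 on 16 July 2003, Blaster on 11 August 2003; MS04-011 on 13 April 2004, Sasser at the start of May):

```python
from datetime import date

def exploitation_gap(advisory: date, worm: date) -> int:
    """Days between a patch's release and the worm that exploits it."""
    return (worm - advisory).days

blaster_gap = exploitation_gap(date(2003, 7, 16), date(2003, 8, 11))
sasser_gap = exploitation_gap(date(2004, 4, 13), date(2004, 5, 1))
# blaster_gap is 26, sasser_gap is 18: the window is shrinking.
```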

The trajectory is uncomfortable. If the gap continues to shrink, eventually we will see worms that appear before operators have meaningfully begun to patch. The defensive infrastructure will then need to be entirely pre-positioned; reactive patching will be useless.

Operator response

From my conversations and observations:

Mature organisations were patched within a week of MS04-011. The 18-day gap was sufficient for these operators to deploy the patch and avoid the worm. The investment in fast patching infrastructure, made over the past several years, is paying off.

The long tail was hit hard. Smaller organisations, home users, and the long tail of operators who patch slowly have been the substrate of Sasser's compromise population. The discipline gap between mature and immature organisations continues to widen.

The reboot cycle has been particularly disruptive. Users have been unable to use their machines productively. Help desks have been overwhelmed with calls. Some organisations have reported that the reboot cycle has caused data loss when users were unable to save documents before the forced reboot.

Network segmentation has helped where deployed. Organisations that segment their internal networks have contained the spread within affected segments. Organisations with flat internal networks have seen widespread compromise.

The pattern continues to favour mature operational practices. The mature operators absorb each incident with bounded pain; the less mature absorb each incident with substantial pain.

The author

Unusually for a worm of this scale, the author was caught quickly. Within a week, German police arrested an 18-year-old computer-science student who confessed to writing both Sasser and a related worm called Netsky. The arrest was made possible by tips from informants who knew the author's identity.

The author's motivations, as described in his subsequent statements, are familiar from earlier incidents — a combination of intellectual challenge, peer-recognition seeking, and some intent to embarrass Microsoft. The commercial-cybercrime motivation was absent; the author did not appear to be making money from the worm.

The legal consequences will play out over months. The German legal system has prosecuted similar cases before; the sentencing patterns suggest a probation-and-fine outcome rather than imprisonment. The deterrent value of the prosecution is uncertain.

For the field as a whole, the prosecution is a small reminder that worm authorship has legal consequences. The reminder will probably not deter the substantial fraction of authors who operate from jurisdictions with less aggressive enforcement, but it will deter some.

The cluster pattern continues

April and May 2004 have been busy. Witty in March, then Sasser in May, with various smaller incidents in between. The cluster pattern I described after Blaster/Welchia/Sobig.F in August 2003 continues into 2004.

The cumulative operational pressure is real. Defenders are tired; cleanup work is continuous; the next incident appears before the previous one is fully resolved.

For the operators I help, the past 8 weeks have been particularly difficult. Three different organisations have called for help with Sasser-related incidents; one of them was still cleaning up MyDoom-related work from January when Sasser hit. The cumulative time investment has been substantial.

The structural pattern

Looking at the sequence of major Microsoft worms over the past three years:

  • Code Red and Code Red II (July-August 2001): IIS .ida vulnerability.
  • Nimda (September 2001): Multiple Microsoft vulnerabilities.
  • SQL Slammer (January 2003): SQL Server resolution service.
  • Blaster (August 2003): RPC DCOM.
  • Sasser (May 2004): LSASS.

Five major Microsoft-targeted worms in three years. Each exploited a vulnerability that had already been patched, weeks before in most cases and months before in Slammer's case. Each produced operationally severe disruption to the unpatched population.

The pattern is structural. The Microsoft codebase has buffer-overflow vulnerabilities at depth; some are found by Microsoft and patched; some are found by attackers and exploited. The patching cycle is not fast enough to prevent the exploitation window from being usable.

The Trustworthy Computing initiative Microsoft began in early 2002 was supposed to address this. The Sasser incident is, in some sense, evidence about how much it has and has not done.

What Trustworthy Computing has clearly done: improved the patching cadence, improved the advisory quality, improved the customer communication. The MS04-011 advisory was clear and timely; the patch was available immediately; the post-incident communication has been good.

What Trustworthy Computing has not yet done: meaningfully reduced the rate of new buffer-overflow vulnerabilities in Microsoft products. The LSASS vulnerability is structurally identical to many earlier ones; the codebase had the bug; the bug was found.

My probability estimate that Trustworthy Computing produces measurable reduction in new vulnerabilities over the next 2-3 years remains around 70%. Slower than I would like; faster than no improvement at all.

What operators should do

For anyone running Windows, the immediate response is clear:

Apply MS04-011 immediately. The patch is widely available; the cost of applying it is small; the cost of not applying it is severe.

Block port 445 at network perimeters. No legitimate service should be exposed on this port across an internet boundary. Filtering eliminates the external exposure entirely.

Block port 445 on internal segments where possible. The lateral-spread problem requires internal segmentation. A compromised laptop should not be able to reach LSASS on the file server.

Reduce the listening surface generally. Sasser targets LSASS specifically, but every Windows service listening on the network is a candidate target for the next worm. Disabling services that are not needed — the messenger service, unused file and print sharing — reduces the attack surface.
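The filtering advice above is easy to spot-check from the far side of a boundary. A minimal probe, assuming nothing beyond the Python standard library; `port_reachable` is my name for it, not a standard tool:

```python
import socket

def port_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    Useful for verifying that port 445 is actually filtered
    at a boundary, rather than merely assumed to be.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run it from outside each boundary that is supposed to block 445; a True result from the internet side means the filtering is not doing its job.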

For cleanup of compromised hosts:

Stop the worm process before it triggers another reboot. The Sasser worm process is avserve.exe (later variants use avserve2.exe); killing it stops further propagation and reboot cycles. From a command prompt: taskkill /f /im avserve.exe. If the one-minute shutdown countdown is already on screen, shutdown -a aborts it on Windows XP.

Apply MS04-011 immediately after killing the worm. Without the patch, reinfection happens within seconds.

Remove the worm files and registry persistence. Standard malware cleanup.

Verify host integrity. Sasser's footprint is small but the cleanup verification is the same as for any compromise.

What this teaches

Four generalisations.

The worm-after-patch trajectory continues. Each new major Microsoft vulnerability is now reliably exploited within weeks. Operators who patch within days are safe; operators who patch within months are not.

Internal segmentation continues to be undervalued. Sasser, like Blaster, spreads aggressively inside organisations. Internal segmentation is the structural defence; it remains underdeployed.

The reboot cycle is a category of operational disruption. Sasser is the second major worm to produce repeated reboots (the LSASS crash being the trigger). The reboot pattern is more disruptive to users than typical malware. Future worms may be designed to maximise this disruption rather than to minimise it.

The cluster pattern is sustained. The cumulative pressure of multiple incidents in compressed timeframes is becoming the new operational reality. Defensive infrastructure must be sized for sustained pressure, not just isolated incidents.

What I am doing personally

For my own infrastructure: my hosts are not Windows; the direct exposure is zero. My Snort sensor is logging the substantial scan traffic; the rate is high but bounded.

For friends and small organisations: helping with patching, cleanup, and incident response. Three calls this week alone.

For my structured-log analysis: the Sasser dataset is now being processed alongside the earlier worm datasets. The cumulative pattern is informative; the time-series of major incidents over the past three years is starting to be useful as a reference.
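A sketch of the kind of structured-log pass I mean. The line format here is hypothetical — loosely modelled on one-line IDS alert output, not any specific Snort configuration:

```python
import re
from collections import Counter

# Hypothetical one-line alert format: "TIMESTAMP SRC:PORT -> DST:PORT".
ALERT = re.compile(r"(\d{1,3}(?:\.\d{1,3}){3}):\d+ -> \S+:445\b")

def sources_scanning_445(lines):
    """Count, per source address, probes aimed at TCP port 445."""
    counts = Counter()
    for line in lines:
        match = ALERT.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Even this much answers the operationally useful questions: how many distinct sources are scanning, and whether the rate is rising or falling day over day.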

For my predictions calibration: Sasser resolves several predictions affirmatively. The pattern of "worm after every Microsoft critical-vulnerability advisory" is becoming the default expectation; my probabilities should reflect that.
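For scoring the calibration itself, the standard tool is the Brier score: the mean squared error between stated probabilities and observed outcomes. The forecasts below are made-up placeholders, not my actual predictions:

```python
def brier_score(forecasts):
    """Mean squared error of probability forecasts.

    forecasts: iterable of (probability, outcome) pairs,
    where outcome is 1 if the event happened, else 0.
    Lower is better.
    """
    forecasts = list(forecasts)
    return sum((p - o) ** 2 for p, o in forecasts) / len(forecasts)

# Placeholder example: two resolved predictions, both of which happened.
score = brier_score([(0.7, 1), (0.6, 1)])
```

A forecaster who always says 0.5 scores 0.25 regardless of outcomes; doing consistently better than that is the point of keeping the predictions explicit.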

The career-transition perspective

I moved roles a year ago to a new employer where production-availability matters in ways my previous role did not. The Sasser incident has been the largest single test of my new responsibilities to date.

The specific learnings from inside an organisation handling this kind of incident at production scale:

The patching question is organisationally complex. Deploying MS04-011 across hundreds of desktop machines and dozens of servers within a week required coordination across multiple teams. The technical work was small; the organisational work was substantial.

The communication question is critical. Users need to know what is happening, what to expect, what to do, and when normal service will resume. Communication that is too sparse produces anxiety; communication that is too detailed produces noise. Finding the right level requires practice.

The post-incident review is where the lessons live. The technical response to Sasser is finished; the post-incident review is starting now. The questions — what worked, what did not, what should change — are where the structural improvements come from.

The discipline of writing things down extends. I have been writing this notebook for over five years; the discipline has carried over to my professional work. Internal incident reports, post-mortems, lesson-summaries — all benefit from the same writing discipline as the public blog. The skill transfers.

For my own writing going forward: more posts on the operational dimension. The lessons from inside an organisation handling these incidents at production scale are, on the available evidence, what readers most want.

Where this leaves the field

Three thoughts.

The defensive infrastructure has matured but the threat infrastructure has matured faster. Each year produces better defensive tools, better patching processes, better filtering, better awareness. Each year also produces worse threats — faster propagation, more sophisticated tooling, more economic motivation. The asymmetry continues to favour the threat side.

The operator population continues to differentiate. Mature organisations are increasingly distinct from immature ones in their ability to absorb incidents. The differentiation will eventually have economic consequences — the long tail of less-mature operators will face increasing pressure from insurers, customers, and regulators.

The structural answers require multi-year investment. Defence in depth, internal segmentation, automated patching, behavioural detection, mature incident response — none of these can be deployed in days or even months. The organisations that invest now will be better positioned in 2-3 years; the ones that do not will continue to be hit hard by each new incident.

More as the year develops. The next post will probably be about mobile-device malware, which is a new category that deserves separate treatment.

A closing note

Writing this on Friday evening at the end of a long fortnight of Sasser-related work. The work is finishing; the structural lessons will play out over the next several years.

For anyone still in cleanup: the work is sustainable; the discipline matters; the cumulative experience is valuable. Each successive incident produces fewer surprises and faster resolution.

For anyone reading this post in the future: the specifics will be different but the patterns will be similar. The structural lessons — patch quickly, segment internally, respond systematically, learn from each incident — apply to whatever comes next.


The deeper question of LSASS

One structural question deserves a closing reflection: why LSASS specifically? The Local Security Authority Subsystem Service is the Windows component that enforces local security policy — authentication, password management, audit logging, the various group-policy enforcement decisions. It is, in some sense, the most-trusted Windows process from the perspective of system security.

A buffer overflow in LSASS is therefore particularly concerning. The exploit gives the attacker SYSTEM-level privileges via the very component that is supposed to be the security boundary. The attacker's code execution happens in the same process that is responsible for deciding what code execution is allowed. The structural irony is substantial.

The deeper concern: if the security-enforcing components have buffer-overflow vulnerabilities, then the rest of the security architecture rests on a foundation that has known classes of failure. The whole edifice of Windows security — user accounts, group policies, audit logs — depends on processes that can themselves be compromised by remote attackers. The trust chain has a structural weakness.

This is not unique to Microsoft. The Linux kernel has had similar vulnerabilities. OpenBSD, the security-focused BSD variant, has had fewer, but not zero. The structural problem is general to operating systems written in C with monolithic privilege models.

The long-term answer is fine-grained privilege separation — splitting the security-enforcement components into smaller pieces that each hold the minimum privileges they need, with explicit trust boundaries between them. OpenSSH demonstrated the principle in 2002. Microsoft has been talking about applying similar principles to Windows since Trustworthy Computing began. The shipping reality is years away.

A small operational note

For anyone managing infrastructure that includes Windows servers running services on port 445 (file sharing, print sharing, Active Directory): this is a good moment to audit which servers are exposed to which network segments. The default posture in many networks is that any internal host can reach port 445 on any server; this is too generous in 2004. Limiting port-445 reachability to specific source networks reduces the lateral-spread surface dramatically.

The audit takes an afternoon for a typical mid-sized network. The findings are usually surprising. The remediation is bounded. The benefit, when the next worm hits, is substantial.
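The audit output reduces to a reachability question, and the summarisation can be automated once the allowed (segment, server) pairs are collected, whether by firewall-rule review or by active probing. A sketch, with hypothetical segment and server names throughout:

```python
from collections import defaultdict

def overexposed_servers(allowed_pairs, max_segments=1):
    """Flag servers whose port 445 is reachable from more client
    segments than policy allows.

    allowed_pairs: iterable of (client_segment, server) pairs
    observed or permitted. Returns {server: sorted segment list}
    for every server over the threshold.
    """
    reachable_from = defaultdict(set)
    for segment, server in allowed_pairs:
        reachable_from[server].add(segment)
    return {server: sorted(segments)
            for server, segments in reachable_from.items()
            if len(segments) > max_segments}

# Hypothetical audit data: one file server reachable from three segments.
flagged = overexposed_servers([
    ("desktops", "files1"), ("desktops", "files2"),
    ("branch", "files1"), ("dmz", "files1"),
], max_segments=2)
```

The surprising findings the audit produces are usually of exactly this shape: one or two servers reachable from far more of the network than anyone intended.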

More in time.

