Last September I sat down and rewrote a substantial portion of my Snort rules using the preprocessor approach I had been reading about. Six months on, time for a structured review. What is firing usefully? What is producing noise? What did I get wrong on the rewrite, and what would I now change?
This post is the audit, written for my own future use.
The numbers
From the structured logs of the last 90 days:
- Total alerts: 14,283
- Unique sources: 4,127
- Unique destinations: 12 (I have 14 hosts; two get most of the attention)
- Distinct rule IDs that fired at least once: 67 of 142 in my rule set (47%)
- Rules that produced more than 100 alerts each: 12
The 12 most-firing rules account for about 85% of the alert volume. This is the usual Pareto-style concentration you see in any detection rule set. The interesting question is whether the 12 are firing on real signal or on noise.
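Checking a concentration claim like that is a few lines of computation. A minimal sketch — the rule names and counts below are stand-ins for illustration, not my real rule IDs:

```python
from collections import Counter

# Illustrative only: made-up rule names and counts; the real numbers
# come from the structured alert logs.
alert_counts = Counter({
    "cgi-generic": 3847, "syn-scan": 2103, "phf-probe": 1219,
    "icmp-echo-ext": 892, "dns-nonlocal": 754, "ssh-brute": 612,
    "quiet-rule-a": 40, "quiet-rule-b": 12,
})

def top_n_share(counts, n):
    """Fraction of total alert volume produced by the n loudest rules."""
    total = sum(counts.values())
    return sum(c for _, c in counts.most_common(n)) / total

print(f"top 6 rules: {top_n_share(alert_counts, 6):.0%} of alert volume")
```

Running this against the real per-rule counts is how I got the 85% figure for the top 12.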
The top firing rules — useful or not
I walked the top 12 in order of alert volume and made a judgement call for each.
1. Generic CGI access on port 80 (3,847 alerts). Mostly noise. This rule was a holdover from the early days and is too broad. Most fires are scanner traffic that has already been categorised by other, more specific rules. Action: lower the rule priority below the specific-CGI rules so the specific ones fire first. Rule itself stays as a fallback.
2. SYN scans from $EXTERNAL_NET (2,103 alerts). Useful but high-volume. Real signal — these are real port scans. The volume is just what the modern internet looks like. Action: keep, but aggregate alerts per source-IP-per-hour to reduce log volume.
3. PHF probe attempts (1,219 alerts). Useful. Old vulnerability, automated scanners still try it. Each alert is a real exploitation attempt. Action: keep as-is.
4. ICMP echo from external (892 alerts). Mostly noise. Most fires are traceroutes and pings from legitimate sources: active monitoring, vendor sweeps, ordinary internet background. The rule was originally added during a specific incident and has outlived it. Action: deprecate.
5. Outbound DNS to non-$DNS_SERVERS (754 alerts). Useful, surprisingly. This catches my own hosts trying to reach DNS servers other than the ones I configured. Most fires are due to misconfiguration on my own side; a couple have been more interesting (a friend's laptop with a DHCP-set DNS that was not what we expected). Action: keep.
6. SSH brute-force pattern (612 alerts). Useful. Real signal. Sources are typically compromised hosts trying common passwords. Worth knowing about. Action: keep, increase severity.
7. CGI directory traversal attempts (487 alerts). Useful. See my earlier post on this. Each fire is a probe for a specific known vulnerability. Action: keep.
8. NetBIOS scanning (445 alerts). Useful. Sweeps for Windows file-sharing services. I do not run those services; the alerts are early warning that the source IP is scanning more broadly. Action: keep.
9. Malformed HTTP requests (398 alerts). Mostly noise. Lots of broken clients out there. The underlying signal — actual exploitation attempts — is buried. Action: split into specific malformed-request types; deprecate the general rule.
10. Inbound traffic to closed ports (287 alerts). Useful but redundant. Already caught by the firewall and logged there. Snort is a second observer. Action: keep but acknowledge it duplicates other data.
11. ICMP from RFC 1918 source (245 alerts). Useful. Source-spoofed traffic claiming to be from private addresses. Either a misconfiguration upstream or an attack technique. Rare enough to be worth investigating each time. Action: keep, increase severity.
12. Wu-FTPD specific exploit pattern (198 alerts). Useful. Even though I no longer run wu-ftpd, the pattern still hits my honeypot regularly. Each fire is data about scanner activity. Action: keep.
The rules that never fired
The more interesting question is the 75 rules that fired zero times in 90 days. A few categories:
Specific exploit patterns for software I do not run. Fine — these are aspirational, ready in case the situation changes. No action.
Patterns from advisories that are now too old to see in the wild. A few rules for vulnerabilities from 1997 and earlier are not seeing anything because the underlying scanners have moved on. They cost nothing at detection time, but they do add to the overhead of audits like this one; candidates for deprecation.
Patterns that are too narrowly written. Several rules are written so specifically that they miss any small variant. I suspect a few of these fired zero times not because the technique has disappeared but because the real attack pattern in the wild is a slight variant of what I wrote. Action: review each, and broaden where the underlying technique is generic enough that broadening does not produce false positives.
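To make "too narrow" concrete, here is an illustration using Python regular expressions rather than Snort's content-match syntax; both patterns and the probe string are invented for the example:

```python
import re

# A made-up "too narrow" match: it hard-codes the script name, the
# parameter name, and the exact traversal depth.
narrow = re.compile(r"GET /cgi-bin/view\.cgi\?file=\.\./\.\./etc/passwd")

# A broadened match for the same technique: any CGI script, any
# parameter, two or more "../" segments.
broad = re.compile(r"GET /cgi-bin/[^ ]*\?[^ ]*(\.\./){2,}")

probe = "GET /cgi-bin/view.cgi?doc=../../../etc/passwd HTTP/1.0"
print(narrow.search(probe))  # the literal rule misses this variant
print(broad.search(probe))   # the broadened rule still catches it
```

The judgement call is exactly the one in the text: broaden only where the generic form of the technique is unambiguous enough that the wider pattern does not start matching legitimate traffic.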
Patterns that overlap with other rules. A handful of rules duplicate coverage with other rules; the other rule fires first and the second never gets a look-in. Action: deprecate the redundant ones.
What this exercise has taught me
Three things, in increasing order of generality.
The signal-to-noise ratio matters more than the alert volume. A rule that produces 1000 alerts of which 999 are useful is fine. A rule that produces 100 alerts of which 50 are useful is a problem. The discipline of asking "what fraction of these are real?" for each top firing rule is what distinguishes maintained rule sets from accumulated ones.
Aggregation is a separate skill from rule writing. A rule that fires once per scan packet produces hundreds of alerts for a single scan. The same rule with per-source-IP aggregation produces one alert per scan. The rule is the same; the aggregation logic is different. Most published rule sets do not include aggregation guidance, which is a gap.
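As a sketch of what I mean by per-source aggregation — assuming alerts have already been parsed into dicts with rule, src, and ts fields, which is a shape I am inventing here, not Snort's native output:

```python
from datetime import datetime

def aggregate(alerts):
    """Collapse raw alerts into one record per (rule, source, hour).

    Each alert is a dict with 'rule', 'src', and 'ts' (a datetime) --
    an assumed, simplified log shape, not Snort's native format.
    """
    buckets = {}
    for a in alerts:
        hour = a["ts"].replace(minute=0, second=0, microsecond=0)
        key = (a["rule"], a["src"], hour)
        buckets[key] = buckets.get(key, 0) + 1
    return [{"rule": r, "src": s, "hour": h, "count": n}
            for (r, s, h), n in sorted(buckets.items())]

# A 200-packet SYN scan from one source within one hour collapses
# to a single record carrying count=200.
scan = [{"rule": "syn-scan", "src": "10.0.0.1",
         "ts": datetime(2000, 3, 1, 10, i % 60)} for i in range(200)]
print(aggregate(scan))
```

The count field preserves the volume information, so nothing is lost; the log just stops being dominated by one noisy source.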
The cost of zero-fire rules is small but the audit cost is real. Rules that never fire do not cost detection cycles in any meaningful sense. They do cost cognitive cycles when the rule set is being audited. The discipline I am moving toward is to mark rules that have not fired in 6 months as "watch only" — present, not deleted, but understood as low-priority.
A general structural change I am making
For the next iteration of the rule set, I am separating rules into three categories with different review cadences:
Active rules (fire regularly, treated as production detection): reviewed every quarter. Each one's signal-to-noise is examined.
Reference rules (fire occasionally, kept for completeness): reviewed every six months. Audited for staleness.
Aspirational rules (fire never, ready in case): reviewed annually. Mostly deletion candidates unless there is a clear reason to keep them.
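The tiering itself is simple enough to express as a function. The numeric thresholds below — treating 10 or more fires per 90 days as "regular" — are guesses I expect to tune, not measured cut-offs:

```python
def classify(fires_last_90d):
    """Assign a rule to a review tier from its 90-day fire count.

    Thresholds are illustrative guesses, not derived from the data.
    """
    if fires_last_90d >= 10:
        return "active"        # reviewed quarterly
    if fires_last_90d >= 1:
        return "reference"     # reviewed every six months
    return "aspirational"      # reviewed annually
```

Running every rule's fire count through this once a quarter gives the tier assignments for free as a by-product of the audit.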
This is structural, not a one-time exercise. The whole point of a rule set is that it evolves with the threat landscape. A rule set that has not changed in two years is, almost certainly, a rule set that no longer matches the threat landscape.
What the alert log tells me about the threat landscape
A few observations from looking at the 14,283 alerts in aggregate:
The base rate of opportunistic attack traffic has increased roughly four- to five-fold since I started measuring. In late 1998 my Snort sensor produced about 30-40 alerts per day. Now it produces 150-200. The internet has more attackers, more compromised hosts being used as scan sources, and more sophisticated automation.
The geographic distribution of sources has shifted. Eastern European and Chinese cable-modem ranges are now the largest source category by volume. North American cable was the dominant source two years ago. This roughly tracks the global growth of always-on residential connections.
The mix of attack types has shifted toward Windows-targeted exploits. Even though my hosts are all Linux, the scans hitting me are increasingly looking for Windows-specific vulnerabilities (NetBIOS, Microsoft RPC, IIS). This is consistent with attackers building Windows-focused toolkits because that is where the vulnerable population lives.
For the next quarter's writing, I am going to focus on a few of the more interesting individual alerts I have caught — the ones that did not fit any standard pattern. They are usually where the new techniques first appear.