Writing my first Snort preprocessor

When I wrote about Snort 1.7 in June, I committed to writing a custom preprocessor as a learning exercise. A weekend at the keyboard has produced a working one. The exercise has clarified more about Snort's internal architecture than any amount of reading.

This post is the writeup. The code itself is small; the lessons in writing it are larger than the code.

The use case

My specific need: I have a small set of internal services that should never be touched from outside my network. The services include the management interfaces of my router, my honeypot's monitoring port, and a few other administrative endpoints.

A standard Snort rule can detect traffic to these. Several rules do already. The cumulative ruleset, however, is awkward: one rule per service per protocol, each applying essentially the same logic with different specifics.

A preprocessor consolidates this. The preprocessor maintains the list of "never touch from outside" services and inspects every packet with one piece of code. New services to protect are added to the configuration, not to the rule set. The whole concern lives in one module.

The Snort plugin interface

Snort 1.7 exposes a plugin interface for preprocessors that, while not formally documented as stable, is consistent enough across versions to be usable. The relevant pieces:

// Called once at startup
void InitMyPreprocessor(struct _SnortConfig *sc, char *args);

// Called for every packet
void CallMyPreprocessor(Packet *p, void *context);

// Called at shutdown
void CleanupMyPreprocessor(int signal, void *data);

// Registration (called from the main module init)
void RegisterMyPreprocessor(struct _SnortConfig *sc) {
    // The returned node is not needed here, so it is not stored
    AddFuncToPreprocList(
        sc, CallMyPreprocessor, PRIORITY_DETECT, 17000, PROTO_BIT_TCP);
}

The priority value determines where in the preprocessor chain my code runs. The protocol bits restrict which packets it sees. The function pointer is the per-packet handler.
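To make the roles of the priority and the protocol bits concrete, here is a toy model of a preprocessor chain. It is a sketch only, not Snort's implementation: every name in it (toy_node_t, toy_register, toy_dispatch, the TOY_PROTO_* bits) is made up for illustration, and it assumes that a lower priority value runs earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical protocol bits, standing in for Snort's PROTO_BIT_* values
#define TOY_PROTO_TCP 0x1u
#define TOY_PROTO_UDP 0x2u

typedef void (*toy_handler_t)(const void *packet, void *context);

typedef struct toy_node {
    toy_handler_t    func;        // per-packet handler
    uint16_t         priority;    // assumed: lower value runs earlier
    uint32_t         proto_mask;  // which protocols this handler wants
    void            *context;
    struct toy_node *next;
} toy_node_t;

// Insert a node so the list stays sorted by ascending priority.
// Nodes with equal priority keep their registration order.
void toy_register(toy_node_t **head, toy_node_t *node)
{
    assert(head != NULL && node != NULL);
    while (*head != NULL && (*head)->priority <= node->priority)
        head = &(*head)->next;
    node->next = *head;
    *head = node;
}

// Walk the chain in order and call every handler whose mask matches the
// packet's protocol. Returns the number of handlers that were called.
int toy_dispatch(const toy_node_t *head, uint32_t packet_proto,
                 const void *packet)
{
    int called = 0;
    for (; head != NULL; head = head->next) {
        if (head->proto_mask & packet_proto) {
            head->func(packet, head->context);
            called++;
        }
    }
    return called;
}

// Example handler for experiments: appends its tag to a shared trace,
// which makes the call order visible.
typedef struct { char trace[16]; size_t len; } toy_trace_t;
typedef struct { char tag; toy_trace_t *trace; } toy_ctx_t;

void toy_trace_handler(const void *packet, void *context)
{
    toy_ctx_t *c = context;
    (void)packet;  // the toy handler ignores packet contents
    if (c->trace->len + 1 < sizeof c->trace->trace)
        c->trace->trace[c->trace->len++] = c->tag;
}
```

Registering handlers out of order and then dispatching a TCP packet calls only the TCP handlers, in priority order. That is the whole contract the registration call sets up.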

The design is simple. The plumbing to register a preprocessor is small. The hard work is the logic of the preprocessor itself.

The configuration parser

Snort preprocessors take configuration via a string from the user's snort.conf:

preprocessor restricted_services: 192.0.2.0/24 -> 10.0.1.5:8443, 10.0.1.6:22

The preprocessor's Init function gets this string and parses it. The parsing is, in my case:

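void InitMyPreprocessor(struct _SnortConfig *sc, char *args) {
    // Guard against a directive with no arguments (FatalError logs and exits)
    if (args == NULL)
        FatalError("restricted_services: missing arguments\n");

    char *home = strtok(args, " ");      // 192.0.2.0/24
    char *arrow = strtok(NULL, " ");     // ->
    char *services = strtok(NULL, "");   // 10.0.1.5:8443, 10.0.1.6:22

    // strtok returns NULL for a missing token; reject a malformed line
    if (!home || !arrow || strcmp(arrow, "->") != 0 || !services)
        FatalError("restricted_services: expected 'net -> ip:port, ...'\n");

    parse_cidr(home, &my_home_net);
    parse_service_list(services, &my_protected_services);
}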

The parse_cidr and parse_service_list functions are small helpers I wrote. The Snort source has utilities for these (SnortAlloc, parseIP, etc.) but using them mostly added complexity. My own helpers were fewer lines and easier to test.

Lesson learned: do not use the Snort utilities reflexively. They are useful when they fit; they are needlessly complex for simple cases. Write the helper.
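For illustration, here is a minimal sketch of what helpers like these could look like. The real ones are not shown in this post, so the types (cidr_t, service_t, service_list_t), the MAX_SERVICES limit, and the return-code convention (0 on success, -1 on malformed input) are all assumptions. Addresses are stored in host byte order to match the ntohl() conversion in the packet handler.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_SERVICES 32  // hypothetical fixed limit, keeps the sketch free of malloc

typedef struct { uint32_t network; uint32_t mask; } cidr_t;       // host byte order
typedef struct { uint32_t addr; uint16_t port; } service_t;        // host byte order
typedef struct { service_t items[MAX_SERVICES]; size_t count; } service_list_t;

// Parse an unsigned decimal number in [0, max] and advance *s past it.
// Rejects empty input, signs, and leading whitespace.
static int parse_num(const char **s, unsigned long max, unsigned long *out)
{
    char *end;
    if (**s < '0' || **s > '9') return -1;
    unsigned long v = strtoul(*s, &end, 10);
    if (v > max) return -1;  // also catches overflow, where strtoul returns ULONG_MAX
    *out = v;
    *s = end;
    return 0;
}

// Parse a dotted-quad IPv4 address and advance *s past it.
static int parse_ipv4(const char **s, uint32_t *out)
{
    uint32_t addr = 0;
    for (int i = 0; i < 4; i++) {
        unsigned long octet;
        if (i > 0 && *(*s)++ != '.') return -1;
        if (parse_num(s, 255, &octet) != 0) return -1;
        addr = (addr << 8) | (uint32_t)octet;
    }
    *out = addr;
    return 0;
}

// Parse "a.b.c.d/len". Host bits in the address are masked off.
int parse_cidr(const char *text, cidr_t *out)
{
    uint32_t addr;
    unsigned long bits;
    assert(out != NULL);
    if (text == NULL) return -1;
    if (parse_ipv4(&text, &addr) != 0) return -1;
    if (*text++ != '/') return -1;
    if (parse_num(&text, 32, &bits) != 0) return -1;
    if (*text != '\0') return -1;  // no trailing junk
    // A shift by 32 is undefined in C, so /0 is handled separately
    out->mask = (bits == 0) ? 0 : 0xFFFFFFFFu << (32 - bits);
    out->network = addr & out->mask;
    return 0;
}

// Parse "ip:port, ip:port, ..." with optional spaces around the commas.
int parse_service_list(const char *text, service_list_t *out)
{
    assert(out != NULL);
    out->count = 0;
    if (text == NULL) return -1;
    for (;;) {
        uint32_t addr;
        unsigned long port;
        while (*text == ' ') text++;
        if (out->count == MAX_SERVICES) return -1;
        if (parse_ipv4(&text, &addr) != 0) return -1;
        if (*text++ != ':') return -1;
        if (parse_num(&text, 65535, &port) != 0 || port == 0) return -1;
        out->items[out->count].addr = addr;
        out->items[out->count].port = (uint16_t)port;
        out->count++;
        while (*text == ' ') text++;
        if (*text == '\0') return 0;
        if (*text++ != ',') return -1;
    }
}
```

Returning an error code instead of crashing on bad input lets the Init function decide how to report a malformed snort.conf line. None of this depends on Snort, so the helpers can be tested in a standalone program.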

The per-packet handler

This is where the actual work happens:

void CallMyPreprocessor(Packet *p, void *context) {
    if (!p || !p->iph || !p->tcph) return;

    uint32_t src = ntohl(p->iph->ip_src.s_addr);
    uint32_t dst = ntohl(p->iph->ip_dst.s_addr);
    uint16_t dport = ntohs(p->tcph->th_dport);

    // Only care about packets from outside our home network
    if (in_cidr(src, &my_home_net)) return;

    // Check if the destination matches a protected service
    if (is_protected_service(dst, dport, &my_protected_services)) {
        SnortEventqAdd(
            GENERATOR_RESTRICTED_SERVICES,  // generator id (custom)
            1,                                // sid (custom)
            1,                                // revision
            CLASSIFICATION_ATTEMPTED_RECON,
            3,                                // priority
            "External access to protected service",
            NULL
        );
    }
}

SnortEventqAdd is the function that adds an alert to Snort's output queue. The alert flows through the standard output plugins (logfile, syslog, database) just as a rule-generated alert would.

The handler runs for every TCP packet. The early-out for packets from the home network keeps the cost low — most traffic is irrelevant and is rejected in two comparisons.
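The two predicates the handler calls, in_cidr and is_protected_service, are not shown above. A minimal sketch of how they might be written follows. The struct layouts are assumptions made for this example, with all addresses and ports in host byte order to match the ntohl() and ntohs() conversions in the handler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SERVICES 32  // hypothetical fixed limit

typedef struct { uint32_t network; uint32_t mask; } cidr_t;       // host byte order
typedef struct { uint32_t addr; uint16_t port; } service_t;        // host byte order
typedef struct { service_t items[MAX_SERVICES]; size_t count; } service_list_t;

// True when addr falls inside the network: one AND and one comparison.
int in_cidr(uint32_t addr, const cidr_t *net)
{
    assert(net != NULL);
    return (addr & net->mask) == net->network;
}

// True when (dst, dport) exactly matches a configured service.
// A linear scan is fine for a handful of entries; a much longer list
// would want a sorted array or a hash table instead.
int is_protected_service(uint32_t dst, uint16_t dport,
                         const service_list_t *list)
{
    assert(list != NULL);
    for (size_t i = 0; i < list->count; i++) {
        if (list->items[i].addr == dst && list->items[i].port == dport)
            return 1;
    }
    return 0;
}
```

With a home network of 192.0.2.0/24, a source of 192.0.2.99 is inside and is skipped, while 198.51.100.7 is outside and goes on to the service check.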

Building and loading

In Snort 1.7, preprocessors are compiled into the Snort binary at build time rather than loaded as separate modules. (The plugin model in some modern systems supports dlopen-style loading; Snort 1.7 does not.)

The build is:

Drop the source into the Snort source tree (src/preprocessors/).
Add the file to the relevant Makefile.am.
Run autoreconf, configure, build.
Verify the new preprocessor is registered at startup.

The whole cycle takes a few minutes once the harness is set up.

What this exercise has taught me

Four things, in increasing order of generality.

Snort's internal architecture is sound. I had read the source previously; writing a preprocessor that integrates with it has confirmed that the design is well thought out. The interfaces are clean, the responsibilities are clearly partitioned, the failure modes are predictable.

Writing C in the real world is harder than reading C in textbooks. Much of the work was C plumbing: buffer management, pointer checking, error handling. The textbook examples gloss over these. Real code requires that they be done correctly. Several iterations of my preprocessor crashed Snort under specific edge cases until I tightened the input validation.

The Snort utility library is uneven. Some parts are well-designed and worth using. Some parts are awkward and worth replacing with smaller bespoke helpers. The discipline is to use the right tool for each subtask, not to reflexively use the framework's utilities for everything.

Custom detection logic deserves to live in preprocessors, not rules, when the logic is complex. A rule is for pattern matching. A preprocessor is for stateful, multi-condition logic that does not fit cleanly in a rule. My "protected services" use case was a clear preprocessor candidate; trying to express it as a set of rules was awkward.

What I am going to do with this

The preprocessor is now in production on my Snort sensor. It has caught two events in the past week. Both were benign: my own monitoring tools touching the protected services from a host I had not added to the home network. I fixed the configuration. The preprocessor is doing exactly what I designed it to do.

More broadly, the exercise has given me confidence to write a few more preprocessors for specific operational concerns. Two on my list:

A connection-rate preprocessor that maintains a per-source counter of connection attempts and alerts when any source exceeds a threshold over a configurable window. This is essentially a rate-limit detector with smarter aggregation than the standard portscan2 preprocessor provides.
A protocol-conformance preprocessor that examines specific application-layer protocols for malformed sequences. I have particular concern about HTTP malformed-request patterns; a preprocessor focused on HTTP conformance would catch many subtle attack variants.
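As a rough idea of the core of the connection-rate item, here is a sketch of a per-source counter over a fixed time window. None of this exists yet. The names (rate_table_t, rate_init, rate_record), the table size, and the policy of letting a colliding source evict the previous one are all placeholder assumptions; a real version would need a proper collision strategy and probably a sliding window.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RATE_SLOTS 1024u  // hypothetical table size, must stay 2^10 for the hash below

typedef struct {
    uint32_t src;           // source address that owns this slot
    uint32_t window_start;  // time (seconds) the current window began
    uint32_t count;         // attempts seen in the current window
    int      used;
} rate_slot_t;

typedef struct {
    rate_slot_t slots[RATE_SLOTS];
    uint32_t    window_secs;
    uint32_t    threshold;
} rate_table_t;

void rate_init(rate_table_t *t, uint32_t window_secs, uint32_t threshold)
{
    assert(t != NULL && window_secs > 0);
    memset(t, 0, sizeof *t);
    t->window_secs = window_secs;
    t->threshold = threshold;
}

// Record one connection attempt from src at time now (seconds).
// Returns 1 exactly once per window: on the attempt that first pushes the
// count above the threshold. Returns 0 otherwise.
int rate_record(rate_table_t *t, uint32_t src, uint32_t now)
{
    assert(t != NULL);
    // Multiplicative hash; the top 10 bits select one of 1024 slots
    uint32_t h = (uint32_t)(src * 2654435761u) >> 22;
    rate_slot_t *s = &t->slots[h];

    // Start a fresh window if the slot is empty, belongs to another
    // source (simplification: collisions evict), or has expired.
    if (!s->used || s->src != src ||
        now - s->window_start >= t->window_secs) {
        s->used = 1;
        s->src = src;
        s->window_start = now;
        s->count = 0;
    }
    s->count++;
    return s->count == t->threshold + 1;
}
```

With a threshold of 3 and a 60-second window, the fourth attempt from one source inside the window returns 1, and the count starts over once the window expires.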

Neither is urgent; both are interesting projects for slow weeks.

A closing reflection

Writing code, as a learning exercise, is more valuable than reading it. I have known this in principle for years. The Snort exercise has made it concrete. There is something about the discipline of making the system do something that exposes assumptions that reading alone does not.

For anyone deploying Snort at any scale: write at least one custom preprocessor or output plugin. The exercise will pay back many times the cost. The Snort codebase is friendly enough that the bar to entry is low.

For anyone considering doing the same exercise on a different open-source security tool — Nessus, the various honeynet tools coming from the Honeynet Project, the netfilter helper modules — the same advice applies. Pick a tool you use seriously. Pick a small extension. Make it work. The understanding compounds.

More as 2000 develops. The next post is going to be about reading OpenSSH source, which I have been postponing all year and need to actually do.