Apache modules and what they expose

Apache 1.3, which is the version most of us run, is a modular web server. Almost every feature beyond "serve a static file" is implemented as a separate module that can be compiled in, compiled out, or loaded dynamically. The modular architecture is one of Apache's strongest features: it lets you customise heavily, and it lets the project develop new features without bloating the core.

It is also, as I have found while auditing my own configuration, an enormous attack surface. Each module is code; each module is reachable through some configuration; each module is a place where bugs can live.

I want to walk through what Apache's default module set actually does, and why a serious operator should compile or load only the modules they need.

What ships in a default Apache build

A standard Apache 1.3 build with --enable-module=most includes, among others:

  • mod_access — IP-based access control.
  • mod_auth — basic password authentication.
  • mod_auth_anon — anonymous user authentication for FTP-style access.
  • mod_auth_dbm — DBM-backed password authentication.
  • mod_autoindex — automatic directory listings.
  • mod_cgi — CGI script execution.
  • mod_dir — handling of trailing-slash redirects.
  • mod_env — environment variable manipulation.
  • mod_imap — server-side image map handling.
  • mod_include — server-side includes (SSI).
  • mod_info — /server-info introspection.
  • mod_log_config — logging.
  • mod_mime — content-type assignment.
  • mod_negotiation — content negotiation.
  • mod_setenvif — setting environment based on request properties.
  • mod_status — /server-status introspection.
  • mod_userdir — ~/public_html user directories.

This is a lot of modules. For most servers, most of them are not needed.

The riskiest defaults

In priority order, these are the defaults that I think most operators should review and probably remove.

mod_info and mod_status expose detailed information about the server. /server-info lists every module, every directive, every running configuration value. /server-status shows the current connections, including who is talking to what. Both are protected by Allow from directives by default, but those directives are easy to mis-configure, and the consequences of a leak are substantial. I disable both modules entirely on production hosts and turn them on briefly only when I am debugging.
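For the brief debugging windows when mod_status is loaded, a tightly scoped handler might look like the following. This is a minimal sketch: it assumes mod_access is also loaded and that debugging happens from the server itself, so the 127.0.0.1 address is an assumption rather than a requirement.

```apache
# Sketch only: expose /server-status to the local host and nobody else.
# Requires mod_status and mod_access to be loaded.
<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>
```

With Order deny,allow the Deny is evaluated first and the Allow carves out the single exception, so if the Allow line goes missing the location fails closed rather than open.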

mod_userdir lets each user serve content from ~user/public_html, which becomes accessible at /~user/. This is enormously convenient on shared hosts and a serious problem on dedicated servers. Every account on the system implicitly becomes a web publisher. The threat is that an attacker who compromises any local account can publish content. I disable this on any host where users do not need it, which is most.
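On a host where only a few named accounts should publish, the UserDir directive in Apache 1.3 can be restricted rather than removed outright. A sketch, with alice and bob as placeholder account names:

```apache
# Sketch only: turn per-user directories off for everyone,
# then re-enable them for an explicit list of accounts.
# "alice" and "bob" are hypothetical user names.
UserDir disabled
UserDir enabled alice bob

# The compiled-in default directory name, stated explicitly.
UserDir public_html
```

This only applies where mod_userdir has to stay loaded. If the module is absent, Apache does not recognise the directive and will refuse to start, so the lines should go when the module goes.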

mod_include enables server-side includes, including the <!--#exec cmd="..."--> syntax that lets HTML files run arbitrary shell commands at parse time. SSI was useful in 1996. In 1999, with proper CGI and templating systems available, it is a security vulnerability waiting for a misconfiguration. The classic mistake is enabling SSI on a directory that takes user-uploaded HTML: the uploaded HTML can then execute commands on the server. I leave mod_include out by default, and where SSI is genuinely needed I limit it to specific directories with Options +IncludesNOEXEC, which permits includes but disables the exec element.
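Where SSI has to stay, a narrowly scoped configuration might look like the sketch below. The directory path is hypothetical, and the block assumes mod_include and mod_mime are both loaded. Note the plus sign: the option is added to the directory, which turns on includes while leaving the exec element switched off.

```apache
# Sketch only: allow SSI in one directory, without the exec element.
# /var/www/html/ssi is a hypothetical path.
<Directory /var/www/html/ssi>
    Options +IncludesNOEXEC

    # Parse only files that opt in by extension, not every .html file.
    AddType text/html .shtml
    AddHandler server-parsed .shtml
</Directory>
```

Restricting parsing to .shtml keeps ordinary HTML files, including anything uploaded with an .html extension, out of the SSI parser.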

mod_imap handles old-style server-side image maps. Most images today are mapped client-side. The module is largely obsolete and is one more potential vector for exploitation. I have not loaded it on any new server in two years.

mod_autoindex generates directory listings when no index.html is present. This leaks the structure of the document tree to anyone who happens to land on a directory without an index. The right default is Options -Indexes everywhere, with explicit Options +Indexes on the rare directories that need it.
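As a sketch of that default, with placeholder paths:

```apache
# Sketch only: no automatic listings anywhere under the document root...
<Directory /var/www/html>
    Options -Indexes
</Directory>

# ...except one directory that is meant to be browsable.
# /var/www/html/pub is a hypothetical path.
<Directory /var/www/html/pub>
    Options +Indexes
</Directory>
```

The + and - prefixes adjust the options inherited from the enclosing directory instead of replacing them, which keeps the exception from accidentally switching other options on or off.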

The configuration discipline

The principle, as ever, is least privilege. Each module should be enabled only on the directories that need it. Each directive should be scoped as narrowly as possible.

A good httpd.conf for a server I run looks something like:

ServerType standalone
User www
Group www

LoadModule access_module     libexec/mod_access.so
LoadModule log_config_module libexec/mod_log_config.so
LoadModule mime_module       libexec/mod_mime.so
LoadModule dir_module        libexec/mod_dir.so
LoadModule alias_module      libexec/mod_alias.so
LoadModule cgi_module        libexec/mod_cgi.so
# (intentionally nothing else loaded)

<Directory />
    Options None
    AllowOverride None
    Order deny,allow
    Deny from all
</Directory>

<Directory /var/www/html>
    Options FollowSymLinks
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>

<Directory /var/www/cgi-bin>
    Options ExecCGI
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>

Note the <Directory /> block at the top. Apache evaluates <Directory> blocks from least specific to most specific, with later blocks overriding earlier ones. By denying everything at the root and then explicitly permitting specific subtrees, you ensure that any path that does not match a more specific block is refused.
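The listing loads mod_alias and mod_cgi without showing how URLs map onto the two permitted directories. One plausible wiring, which is an assumption and not part of the configuration shown, would be:

```apache
# Sketch only: assumed URL-to-filesystem mapping for the two permitted subtrees.
DocumentRoot /var/www/html
ScriptAlias /cgi-bin/ /var/www/cgi-bin/
```

Any request that resolves to a path outside these two trees falls back to the <Directory /> block and is refused.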

This is the same default-deny discipline that applies to firewalls. Apache's default is the opposite — most directories are accessible unless explicitly restricted. Inverting that requires a few minutes of configuration and is, I think, a clear improvement.

The benefit of compiling out modules

There is a difference between not loading a module and not building it.

Not loading a module — by removing the LoadModule directive — means the module's code is on disk but is not active in the running server. A future configuration change, a misconfiguration, or some clever exploitation could re-activate it.
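While the module's code is still on disk, one way to hedge against accidental re-activation is to leave a guarded deny in the configuration. This is a sketch that assumes mod_access stays loaded; it is a safety net, not a substitute for removing the code.

```apache
# Sketch only: if mod_status is ever loaded again, its URL stays closed
# until someone deliberately edits this block.
# Apache 1.3 names the module by its source file in <IfModule>.
<IfModule mod_status.c>
    <Location /server-status>
        Order deny,allow
        Deny from all
    </Location>
</IfModule>
```

When the module is not loaded, the whole block is skipped. A later, more permissive block for the same location would still override it, so this narrows the window for mistakes without closing it.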

Not building a module — by configuring Apache with --disable-module=foo at compile time — means the code is not on disk at all. Nothing can re-activate it. The attack surface is genuinely reduced.

For a serious server, building Apache yourself with a minimal module set is the right discipline. The compile takes ten minutes. The result is a smaller binary, a smaller attack surface, and a configuration that cannot be silently expanded by an automated tool that does not know what is and is not loaded.

What I am paying attention to

A few specific places where I expect future Apache vulnerabilities:

  • mod_rewrite, when used with regular expressions that incorporate untrusted input. The complexity of rewrite rules is high enough that small mistakes have surprising consequences.
  • mod_ssl, which is the third-party module that adds HTTPS. The crypto code is complex and the integration with Apache's request lifecycle is deep. I am running mod_ssl on the public-facing host but I am paying close attention to its advisory list.
  • The MIME-type handling, particularly around files where the type is determined by extension. The same file uploaded with different extensions can be treated differently, which has been the basis of past CGI exploits.

None of this is news. All of it is the kind of low-level configuration discipline that, applied consistently, makes the difference between a server that survives the next decade of probes and one that does not.

