Why running as root is dangerous, in concrete terms

When you start using a Unix-like system, the advice is the same from everyone: do not run things as root.

The person giving the advice almost never explains why with any specificity. So you nod, agree, and continue to log in as root because it is easier and because you have not yet had the experience that actually teaches the lesson.

This post is the explanation I wish someone had given me a year ago. With concrete examples. From a system I have actually broken.

The literal blast radius

Unix's permission model is not magic. It is a very simple set of rules: every process runs as a particular user; that user has read, write, and execute rights on a particular set of files; and root, the user with ID 0, bypasses those checks and has every right on every file.

When you run a program, it inherits the rights of the user that started it. This is the entire story. There are no further protections. If you run a buggy program as root, that program — including any input it processed badly — has root.
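To make the inheritance rule concrete, here is a minimal sketch in Python (my choice for illustration; nothing in this post was written in it). The helper name child_euid is made up for the example. It starts a child process and checks that the child reports the same effective user ID as its parent.

```python
import os
import subprocess
import sys


def child_euid() -> int:
    """Start a child Python process and return the effective uid it reports."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.geteuid())"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    parent = os.geteuid()
    child = child_euid()
    # The child did nothing to earn its uid; it simply inherited the parent's.
    print(f"parent euid={parent}, child euid={child}")
    if parent == 0:
        print("warning: everything started from this process runs as root")
```

Run it from an ordinary shell and then from a root shell: the two numbers always match, and in the root shell both are 0.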

A specific example I caused

Last weekend I was experimenting with a small CGI script in Perl. I was working in /usr/local/cgi-bin. I had a habit of testing scripts by running them at the shell, as me, then dropping them into place for Apache to serve as the www-data user.

On Sunday I made a typing mistake. I ran a script that took an argument and used it as the name of a file to delete; I had been using it to clean up its own output files. I had absent-mindedly logged in as root for an unrelated reason. And the argument I passed was wrong: empty, in fact, because I had quoted the variable incorrectly.

The script tried to remove a file with no name. Specifically, it ran rm -rf on a path that resolved, after my mistake, to /.

I noticed quickly. The shell got slow. Some commands started returning errors that did not make sense. I killed the script.

The damage was real. Several configuration files in /etc were gone. A handful of binaries in /usr/sbin were gone. The system was not bootable any more. I lost about two hours to recovery.

The same script run as me would have done far less damage. It would have hit a permissions error on every system file it tried to delete, and the worst it could have destroyed was the set of files my own account can write to.
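The empty-argument mistake described above is cheap to guard against. What follows is a sketch in Python, not a reconstruction of my Perl script; the names safe_target and cleanup are hypothetical, and it assumes the script is only ever meant to delete single files inside one known directory.

```python
import os
import tempfile


def safe_target(base_dir: str, name: str) -> str:
    """Return the absolute path of name inside base_dir, or raise ValueError."""
    if not name or not name.strip():
        raise ValueError("empty filename argument")
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, name))
    # The target must sit strictly inside base: not base itself, not a parent,
    # and not an unrelated absolute path such as "/".
    if target == base or os.path.commonpath([base, target]) != base:
        raise ValueError(f"{target!r} is not inside {base!r}")
    return target


def cleanup(base_dir: str, name: str) -> None:
    """Delete one file inside base_dir, and refuse to do even that as root."""
    if os.geteuid() == 0:
        raise PermissionError("refusing to run as root")
    os.remove(safe_target(base_dir, name))  # one file only, never recursive


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()  # throwaway directory for the demonstration
    for candidate in ["out.txt", "", "/", "../.."]:
        try:
            print(repr(candidate), "->", safe_target(demo_dir, candidate))
        except ValueError as err:
            print(repr(candidate), "-> refused:", err)
    os.rmdir(demo_dir)
```

Two choices matter here. The check happens on the resolved path, so an argument like ../.. cannot climb out of the directory. And the delete is os.remove on a single file rather than anything recursive, so even a check I got wrong could cost one file, not a tree.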

The deeper version of this lesson

My own carelessness is the small version of this problem. The big version is that programs you do not own — daemons, network services, anything that takes input from the outside — make this same mistake constantly, only the input is being chosen by an attacker rather than by your own typo.

This is why the Bugtraq posts I have been reading return again and again to the same construction: "a remote attacker can cause <daemon> to execute arbitrary code with the privileges of the running process." If the running process is root, the attacker owns the machine. If the running process is a heavily restricted user, the same vulnerability gives the attacker only what that user can reach, which in practice is far less.
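The usual defence is for a daemon to give up root as early as it can. Here is a minimal sketch of that step in Python, assuming an unprivileged account named nobody exists on the system; the function name drop_privileges is mine, and this is not a description of how any particular daemon implements it.

```python
import os
import pwd


def drop_privileges(username: str = "nobody") -> bool:
    """Switch from root to an unprivileged user. Return True if a switch happened."""
    if os.geteuid() != 0:
        return False  # already unprivileged, nothing to drop
    entry = pwd.getpwnam(username)  # raises KeyError if the account is missing
    os.setgroups([])         # discard root's supplementary groups
    os.setgid(entry.pw_gid)  # group first: after setuid we may no longer change it
    os.setuid(entry.pw_uid)  # from here on this process cannot get root back
    return True


if __name__ == "__main__":
    switched = drop_privileges()
    print(f"switched={switched}, now running as uid {os.getuid()}")
```

The order matters: the group change has to come before the user change, because once the process is no longer root it has lost the right to change its groups. A bug exploited after this call runs as nobody, not as root.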

What I have changed on my box

A few small things, the kind of things that should have been the default but were not.

Apache no longer runs as root. It runs as the nobody user, which has very few rights anywhere on the system.

The root account does not have a password I can type easily. I cannot log in as it from the console. I have to use su deliberately, which gives me a very small psychological speed bump.

I almost never log in as root directly any more. I use su - and exit the moment I am done. The shell prompt looks different (# instead of $), and I have grown to read that prompt as a small warning.

Next post: I have been reading up on TCP wrappers and want to write up what they actually do, because I had assumed they were a firewall and they are not.