John the Ripper: cracking my own password hashes

John the Ripper is a password cracker. It takes a file of password hashes and tries to recover the passwords by guessing — first against common dictionaries, then by mutating those, then by exhaustive search.

I ran John against my own machine's /etc/shadow last week. This is a small machine, with about a dozen accounts — me, two old test accounts, and a handful of friends I have given shells to over the years. I had the (unsubstantiated) belief that the passwords on this box were strong.

Within two hours, John had recovered three of the eleven passwords. By the next morning it had recovered six. The exercise was, in a word, humbling.

What John actually does

John the Ripper is, mechanically, a guessing engine that operates against a hash function rather than against a live login system. The advantage is speed — the hash function runs at compute speed, with no network round-trip — and the lack of any rate limiting.

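For each candidate password, John computes the hash, compares it against the hashes in the file, and moves on. Against a fast hash such as traditional DES-based crypt, modern hardware can do millions of these per second; deliberately slow hashes such as bcrypt cut that rate by orders of magnitude.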
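As a minimal sketch of that loop in Python: the toy_hash function below is a hypothetical stand-in (salted SHA-256), not the crypt(3) scheme a real /etc/shadow uses and not John's own code, and the usernames and salts are invented for the example.

```python
import hashlib

def toy_hash(password: str, salt: str) -> str:
    """Stand-in for crypt(3): salted SHA-256, hex encoded (illustration only)."""
    return hashlib.sha256((salt + password).encode("utf-8")).hexdigest()

def crack(targets, candidates):
    """targets maps username -> (salt, hash). Returns username -> recovered password."""
    found = {}
    for guess in candidates:
        for user, (salt, digest) in targets.items():
            # Hash the guess with this account's salt and compare offline.
            if user not in found and toy_hash(guess, salt) == digest:
                found[user] = guess
        if len(found) == len(targets):
            break  # nothing left to crack
    return found

# Hypothetical shadow-style entries, built here so the example is self-contained.
targets = {
    "alice": ("ab", toy_hash("friday1", "ab")),
    "bob": ("cd", toy_hash("correct horse battery staple", "cd")),
}
recovered = crack(targets, ["letmein", "password1", "friday1"])
print(recovered)  # {'alice': 'friday1'}
```

Nothing in the loop talks to a login service, which is why there is nothing to rate-limit: the only cost per guess is one hash computation per distinct salt.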

The candidates come from several sources:

Single-mode: derived from the username and the GECOS field. If your username is peter and your real name in /etc/passwd is Peter Bassill, John tries peter, Peter, Peter1, peterbassill, bassill, 1bassill, pb, and many other obvious variants.

Wordlist mode: tries every word in a dictionary file. The bundled English dictionary is about 20,000 words. Combined with a rule set (capitalise the first letter, add a digit at the end, replace o with 0, and so on), the effective number of candidates runs into the millions.

Incremental mode: exhaustive search through all possible passwords, in a smart order based on character frequency in real password collections. This is the slowest mode, but given enough time it will, in principle, find anything.
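The three sources above can be sketched as Python generators. These are deliberately toy versions: the function names and the handful of rules are assumptions made for illustration, and John's real rule engine and its frequency-ordered incremental search are far more elaborate.

```python
import itertools

def single_mode(username: str, gecos: str):
    """Toy single-crack mode: guesses built from the account's own details."""
    names = gecos.lower().split()
    parts = list(dict.fromkeys([username.lower()] + names))  # de-duplicated, ordered
    seeds = set(parts)
    # Joined pairs (peterbassill) and initials (pb).
    seeds.update(a + b for a, b in itertools.permutations(parts, 2))
    initials = "".join(name[0] for name in names)
    if initials:
        seeds.add(initials)
    for seed in sorted(seeds):
        yield seed
        yield seed.capitalize()
        for digit in "0123456789":
            yield seed + digit
            yield seed.capitalize() + digit
            yield digit + seed

def wordlist_mode(words):
    """Toy mangling rules: capitalise, replace o with 0, append one digit."""
    for word in words:
        bases = {word, word.capitalize(), word.replace("o", "0")}
        for base in sorted(bases):
            yield base
            for digit in "0123456789":
                yield base + digit

def incremental_mode(alphabet: str, max_len: int):
    """Toy exhaustive search: shortest first, alphabet listed most-likely-first."""
    for length in range(1, max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            yield "".join(combo)

single = set(single_mode("peter", "Peter Bassill"))
print({"peter", "Peter1", "peterbassill", "1bassill", "pb"} <= single)  # True

mangled = set(wordlist_mode(["password", "friday"]))
print("password1" in mangled, "Password1" in mangled, "friday1" in mangled)  # True True True

print(sum(1 for _ in incremental_mode("ea1", 3)))  # 3 + 9 + 27 = 39
```

Even with three crude rules, each dictionary word fans out into a few dozen candidates, which is how a modest wordlist turns into a very large guess space.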

What I learned about the passwords I had

The three passwords John found in the first two hours were:

friday1. A friend's password. Single-mode would have got it instantly; wordlist with simple rules certainly did.
Coral2000. Another friend's. The friend works at Gala Coral, hence the word; he had been told to add a number for "complexity". Wordlist with rules.
password1. Mine. From an old test account I had set up years ago and forgotten. Wordlist alone.

The additional three found by morning included one that was a person's spouse's name plus their birth year, and two that were single dictionary words with a digit appended.

None of these are sophisticated passwords. All of them looked, to the people who chose them, like considered choices. "It's not in the dictionary": yes it is. "It has numbers": they appear at the end. "It has uppercase": only at the beginning. The patterns are, from the perspective of a password cracker, extremely common.
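The shape described here (an optional capital, a run of lowercase letters, digits at the end) is easy to state as a pattern. The following is a rough sketch in Python; the regular expression and function name are assumptions for illustration, and matching the shape is only a warning sign, not a measure of strength.

```python
import re

# One leading letter (either case), then lowercase letters, then trailing digits.
COMMON_SHAPE = re.compile(r"[A-Za-z][a-z]+\d+")

def looks_like_word_plus_digits(password: str) -> bool:
    """True if the whole password has the 'word, then digits' shape."""
    return COMMON_SHAPE.fullmatch(password) is not None

for candidate in ["friday1", "Coral2000", "password1", "correct horse battery staple"]:
    print(candidate, looks_like_word_plus_digits(candidate))
```

All three recovered passwords match; the multi-word passphrase does not.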

What this exposes

Three things that I want to write down for myself.

Password complexity rules do not produce strong passwords. They produce passwords that appear to satisfy the rules while remaining trivially crackable. Adding a number to password produces password1. Capitalising it produces Password1. Both are in every cracker's standard rule set.

The password file's permissions are the only real defence. The shadow password system (/etc/shadow, which hides the hashes from non-root users) is the architecture that makes password security possible at all. If /etc/passwd still had hashes in it (as it did before shadow), John would not need to run as root, and any user on the system could crack their colleagues' passwords. This is, as a thought experiment, terrifying. Shadow passwords have been the standard for years; their existence is what keeps password-based authentication tractable.
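A quick way to check that property is to look at the "other" read bit on the file. The sketch below is a minimal Python version; the helper name is hypothetical, and it runs against a scratch file rather than the real /etc/shadow so it can be tried anywhere. On some distributions the shadow file is also readable by a dedicated group, which this check deliberately ignores.

```python
import os
import stat
import tempfile

def world_readable(path: str) -> bool:
    """True if users other than the owner and group may read the file."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)

# Demonstrate on a scratch file instead of touching system files.
with tempfile.NamedTemporaryFile() as scratch:
    os.chmod(scratch.name, 0o644)  # passwd-style: everyone may read
    open_mode = world_readable(scratch.name)
    os.chmod(scratch.name, 0o640)  # shadow-style: owner and group only
    closed_mode = world_readable(scratch.name)

print(open_mode, closed_mode)  # True False on a POSIX system
```

Calling world_readable("/etc/shadow") on a correctly configured system should return False; stat does not need read permission on the file itself, so the check works for an unprivileged user.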

Network exposure of password hashes is catastrophic. Any path by which a hash gets to an attacker — a database leak, a backup that ends up on the wrong machine, a dump from a compromised server — gives the attacker the same advantage I have when running John against my own file. They have unlimited time, no rate limiting, and millions of candidate passwords per second. Most password databases do not survive this attack.

What I have changed

A short list, applied to my own machines.

Removed the test accounts. The two old test accounts that were left over from past experiments are gone. They had weak passwords and contributed nothing.

Forced password changes for the friends whose passwords I cracked. I emailed them to say what I had done, why, and that they needed to choose new passwords. I gave them links to Diceware (Arnold Reinhold's word-list-based passphrase scheme) and an explanation of why short passwords with mutations are weaker than long passphrases of unmutated words. Two of them have done it; the third is grumbling.

Switched myself to a passphrase scheme. Six random words, typed with spaces, give me about 77 bits of entropy if they are drawn from a list the size of Diceware's 7,776 words. This is well above any practical cracker's reach. The only downside is typing length, which is not zero but is acceptable.
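The arithmetic behind that figure is short enough to check. This sketch assumes the words are chosen uniformly at random from the standard 7,776-word Diceware list (the post only says "a dictionary", so the list size is an assumption), and the rate of one billion guesses per second is a hypothetical round number, not a benchmark.

```python
import math

def passphrase_bits(list_size: int, words: int) -> float:
    """Entropy of `words` independent uniform picks from a list of `list_size`."""
    return words * math.log2(list_size)

def years_to_exhaust(bits: float, guesses_per_second: float) -> float:
    """Time to try every possibility; the average hit comes at half this."""
    seconds = 2 ** bits / guesses_per_second
    return seconds / (365.25 * 24 * 3600)

bits = passphrase_bits(7776, 6)
print(round(bits, 1))  # 77.5

# Hypothetical attacker speed: 1e9 guesses per second.
print(f"{years_to_exhaust(bits, 1e9):.1e} years")
```

With those assumptions the exhaustive search takes on the order of millions of years. The estimate only holds if the words really are random; a phrase chosen by a person has far less entropy than the word count suggests.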

Disabled password authentication for SSH — already done a while back but worth saying again. With key-based authentication, the password is essentially decorative. The path from "compromised hash" to "compromised account" requires the password to be useful for something, which, in a key-only environment, it is not.

A small experiment for any sysadmin reading this

Run John against your own password file. Not against your users' production passwords without their knowledge — that is a serious privacy violation in any organisation — but against a small set you control, or your own personal accounts.

The results will probably surprise you. They will, in particular, surprise you about how short the gap is between "this looks like a strong password" and "this is in a dictionary plus rules".

The exercise is the cheapest way I know to recalibrate one's intuition about password strength. It is also, as a piece of operational discipline, the kind of thing that should be on every sysadmin's quarterly list.

Mine now is.