HTTP authentication, done badly

I have been auditing web applications for friends fairly regularly over the past six months. The most consistent finding across those audits is that authentication is the layer most often broken.

This is not a new observation. The Bugtraq archives are full of authentication-bypass advisories going back years. What is striking is how consistent the mistakes are across different applications written by different teams. The same broken patterns recur.

This post is the catalogue: for each bad pattern, why it is bad and what the correct approach is. The worked examples are in pseudocode rather than any specific language, so the principles stay visible.

Pattern 1: storing passwords in plaintext

The bad pattern:

store_user(username, password):
    INSERT INTO users (username, password) VALUES (username, password)

verify_user(username, supplied_password):
    stored_password = SELECT password FROM users WHERE username = username
    return supplied_password == stored_password

The database contains plaintext passwords. Anyone who reads the database — through SQL injection, backup theft, ordinary administrative access, or any of the dozen other ways — gets every user's password.

The correct pattern: hash the password before storage; compare hashes.

store_user(username, password):
    salt = random_bytes(16)
    hash = hash_function(salt + password)
    INSERT INTO users (username, salt, hash) VALUES (username, salt, hash)

verify_user(username, supplied_password):
    salt, hash = SELECT salt, hash FROM users WHERE username = username
    return hash_function(salt + supplied_password) == hash

The salt prevents pre-computed hash tables ("rainbow tables") from being effective, because a table built for one salt is useless against a hash made with another. The hash function should be slow on purpose: crypt() with the bcrypt algorithm is the right choice in 2000. MD5, despite being widely used, is too fast to be safe for password storage.
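For readers who want something they can run, here is a minimal Python sketch of the salted pattern. bcrypt is not in the Python standard library, so this sketch swaps in PBKDF2-HMAC-SHA256 from hashlib as the deliberately slow function. The iteration count, the USERS dictionary standing in for the users table, and the slow_hash helper are all assumptions made for illustration.

```python
import hashlib
import hmac
import os

# Assumption: illustrative work factor; tune it for your own hardware.
ITERATIONS = 100_000

# Hypothetical stand-in for the users table: username -> (salt, hash).
USERS = {}


def slow_hash(salt: bytes, password: str) -> bytes:
    """Deliberately slow salted hash (PBKDF2 here, standing in for bcrypt)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def store_user(username: str, password: str) -> None:
    salt = os.urandom(16)  # fresh random salt for every user
    USERS[username] = (salt, slow_hash(salt, password))


def verify_user(username: str, supplied_password: str) -> bool:
    record = USERS.get(username)
    if record is None:
        return False
    salt, stored_hash = record
    # compare_digest does not stop at the first differing byte
    return hmac.compare_digest(slow_hash(salt, supplied_password), stored_hash)
```

The early return for unknown users is the timing leak discussed under Pattern 4; it is kept here only to keep the sketch short.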

Pattern 2: generating session IDs predictably

The bad pattern:

on_login(user):
    session_id = next_id_in_database
    write_cookie(session_id)

Session IDs are sequential integers. An attacker who legitimately logs in and gets session ID 12345 knows that 12344 was probably someone else's recent session. They guess at session IDs and try them; eventually they hit a valid one.

The correct pattern: session IDs should be cryptographically random and long enough to be unguessable.

on_login(user):
    session_id = random_bytes(32).hex()  # 256 bits of entropy
    record_session(session_id, user, current_time)
    write_cookie(session_id)

256 bits of entropy is far more than necessary; it is also essentially free. Pick a generous length and use a cryptographic-strength RNG, not the language's default random().
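In Python, one way to get such an ID is the standard secrets module. The sketch below assumes a hypothetical in-memory SESSIONS dictionary in place of a real session store, and it returns the ID instead of writing a cookie.

```python
import secrets
import time

# Hypothetical in-memory session table: session_id -> (user, login_time).
SESSIONS = {}


def on_login(user: str) -> str:
    # secrets draws from the operating system's CSPRNG, unlike random.random()
    session_id = secrets.token_hex(32)  # 32 random bytes -> 64 hex characters
    SESSIONS[session_id] = (user, time.time())
    return session_id  # the caller would write this into the cookie
```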

Pattern 3: client-controlled trust

The bad pattern:

on_request(request):
    if request.cookie.contains("role=admin"):
        # treat as admin

The client's cookie carries a role flag. The attacker modifies the cookie to claim role=admin and the server believes them.

This happens more often than it should. Variations include client-controlled user IDs, client-controlled access lists, and the classic "are you authenticated?" cookie that is simply set to true.

The correct pattern: trust comes from the server's session table, not from the client's cookie.

on_request(request):
    session_id = request.cookie["session_id"]
    session = lookup_session(session_id)
    if session is not None and session.user.role == "admin":
        # treat as admin

The cookie is a handle — it identifies which session to look up. The actual trust information is in the server's session record, which the client cannot modify directly.
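A small Python sketch of the same idea follows. The Session class, the SESSIONS dictionary, and the example session ID are hypothetical; the point is only that a role value sent by the client is never consulted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    username: str
    role: str


# Hypothetical server-side session table, filled in at login.
SESSIONS = {"3f9a-example-id": Session(username="alice", role="user")}


def is_admin(cookies: dict) -> bool:
    """Decide admin status from the server's record, never from cookie contents."""
    session: Optional[Session] = SESSIONS.get(cookies.get("session_id", ""))
    if session is None:
        return False  # unknown or missing session ID: no privileges
    return session.role == "admin"
```

A request carrying both a valid session_id and a forged role=admin cookie is still treated as an ordinary user, because only the server's record is read.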

Pattern 4: timing-based authentication leaks

The bad pattern:

verify_user(username, password):
    user = lookup(username)
    if user is None:
        return False
    return password == user.password

The response time is measurably different for "username does not exist" (one database query, no password comparison) and "username exists, password is wrong" (one database query, one comparison). An attacker can determine whether a username exists by timing the response.

The correct pattern: make the response time independent of which side the failure came from.

verify_user(username, password):
    user = lookup(username) or DUMMY_USER  # always perform comparison
    is_valid = constant_time_compare(
        hash_function(user.salt + password),
        user.hash
    )
    return user is not DUMMY_USER and is_valid

The constant_time_compare function takes the same time regardless of which byte differs. The dummy user ensures the comparison happens even when the username does not exist.
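The dummy-user approach can be sketched in Python roughly as follows. It assumes PBKDF2 from hashlib as the slow hash (an illustrative stand-in, not a recommendation over bcrypt), a hypothetical USERS dictionary, and hmac.compare_digest as the constant-time comparison.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumption: illustrative work factor


def slow_hash(salt: bytes, password: str) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


# Hypothetical user table: username -> (salt, hash).
_salt = os.urandom(16)
USERS = {"alice": (_salt, slow_hash(_salt, "s3cret"))}

# Dummy record: random salt and random hash that no password is expected to match.
DUMMY_USER = (os.urandom(16), os.urandom(32))


def verify_user(username: str, password: str) -> bool:
    record = USERS.get(username)
    found = record is not None
    salt, stored_hash = record if found else DUMMY_USER
    # The slow hash and the comparison run whether or not the user exists.
    candidate = slow_hash(salt, password)
    is_valid = hmac.compare_digest(candidate, stored_hash)
    return found and is_valid
```

Because the slow hash dominates the running time, both failure cases do nearly the same amount of work.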

This level of paranoia is appropriate for the authentication layer specifically. For other layers, timing leaks are usually not exploitable.

Pattern 5: insufficient session expiry

The bad pattern: sessions never expire. Once you log in, you remain logged in until you explicitly log out.

Why it is bad: an attacker who steals a session ID — through cookie theft, packet capture, shared computer — has unlimited time to use it. The window of vulnerability is the entire lifetime of the session.

The correct pattern: sessions expire on absolute time (e.g., 24 hours since login), idle time (e.g., 30 minutes since last activity), or both. Sensitive operations (password change, financial transaction) re-prompt for the password regardless of session state.

The specific timeouts are policy decisions. The principle is that the session has finite validity.
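As a sketch, both limits can be checked in one function. The Session class and the function name are hypothetical, and the two constants simply reuse the example values above; they are policy, not a recommendation.

```python
import time
from dataclasses import dataclass
from typing import Optional

ABSOLUTE_LIMIT = 24 * 60 * 60  # seconds since login (example policy from the text)
IDLE_LIMIT = 30 * 60           # seconds since last activity


@dataclass
class Session:
    user: str
    created_at: float
    last_seen: float


def is_session_valid(session: Session, now: Optional[float] = None) -> bool:
    """Return True and refresh last_seen if the session is within both limits."""
    now = time.time() if now is None else now
    if now - session.created_at > ABSOLUTE_LIMIT:
        return False  # too old, no matter how active
    if now - session.last_seen > IDLE_LIMIT:
        return False  # idle for too long
    session.last_seen = now  # activity resets only the idle clock
    return True
```

Passing now explicitly makes the policy easy to test without waiting for real time to pass.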

Pattern 6: putting credentials in URLs

The bad pattern: an authentication page that ends with a redirect to a URL containing the user's credentials, like ?token=abc123.

Why it is bad: URLs end up in:

  • Browser history.
  • Server access logs (often retained for months).
  • Referer headers when the user clicks an outbound link.
  • Bookmarks that get shared.
  • Screenshots and debug output.

A token in a URL is, structurally, public information.

The correct pattern: keep credentials in cookies or in HTTP headers (Authorization), never in URLs. If a token must travel to a third-party site, ensure it is single-use and short-lived.
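To make the difference concrete, here is a small Python sketch using urllib.request that builds both kinds of request without sending them. The endpoint, the token value, and the "Bearer" scheme are assumptions for illustration; use whatever scheme your server defines.

```python
import urllib.request

# Hypothetical endpoint and token, for illustration only.
URL = "https://oursite.example/api/profile"
TOKEN = "abc123"

# Leaky: the token becomes part of the URL and lands in logs and history.
leaky = urllib.request.Request(f"{URL}?token={TOKEN}")

# Better: the token travels in a header, which is not part of the URL.
# Assumption: the server accepts a "Bearer" scheme.
safer = urllib.request.Request(URL, headers={"Authorization": f"Bearer {TOKEN}"})
```

Only the first request exposes the token to anything that records URLs.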

Pattern 7: vulnerable password-reset workflows

The bad pattern: a password-reset feature that asks for the user's email, sends a link to that email, and accepts whatever password is set when the user clicks the link.

Why it is bad: there are many ways to get this wrong. A common one is the enumeration leak: the response says "reset email sent" for valid users and "user not found" for invalid ones. An attacker can determine which usernames are valid by trying them.

Another common bug: the reset link contains a token that is reusable. An attacker who somehow obtains the link (via mail interception, mailing-list archive scraping, etc.) can use it any time.

The correct pattern: the response is identical for valid and invalid usernames. Reset tokens are time-limited (15 minutes is plenty) and single-use: each is invalidated as soon as it is redeemed. Each token is unguessable, with at least 256 bits of entropy from a cryptographic RNG.
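A minimal Python sketch of these rules is below. The ACCOUNTS set, the RESET_TOKENS dictionary, the example address, and the response wording are all hypothetical, and the step that actually sends the email is left as a comment.

```python
import secrets
import time
from typing import Optional

TOKEN_LIFETIME = 15 * 60  # seconds; the 15 minutes suggested above

# Hypothetical stores: known accounts, and token -> (email, expiry time).
ACCOUNTS = {"alice@oursite.example"}
RESET_TOKENS = {}


def request_reset(email: str, now: Optional[float] = None) -> str:
    now = time.time() if now is None else now
    if email in ACCOUNTS:
        token = secrets.token_urlsafe(32)  # 32 random bytes = 256 bits
        RESET_TOKENS[token] = (email, now + TOKEN_LIFETIME)
        # A real system would email the reset link here.
    # Same response whether or not the account exists.
    return "If that address has an account, a reset link has been sent."


def redeem_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the account email if the token is valid, else None."""
    now = time.time() if now is None else now
    entry = RESET_TOKENS.pop(token, None)  # pop makes the token single-use
    if entry is None:
        return None
    email, expires_at = entry
    return email if now <= expires_at else None
```

Removing the token with pop before checking the expiry means even an expired token cannot be presented twice.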

Pattern 8: trusting referer / origin headers

The bad pattern:

on_admin_action(request):
    if request.headers["Referer"].starts_with("https://oursite.example/"):
        # treat as legitimate

The Referer header is supplied by the client, whether that is a browser or anything else. An attacker constructing a request from outside the site can set the Referer to whatever they want.

The correct pattern: do not use Referer for authorisation. Use a per-session anti-CSRF token, embedded in forms by the server and verified server-side on submission.
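A per-session token check might look like the following Python sketch. The CSRF_TOKENS dictionary and both function names are hypothetical; a real application would keep the token in its session record.

```python
import hmac
import secrets

# Hypothetical server-side store: session_id -> CSRF token.
CSRF_TOKENS = {}


def issue_csrf_token(session_id: str) -> str:
    """Create a token for this session; the server embeds it in each form."""
    token = secrets.token_hex(32)
    CSRF_TOKENS[session_id] = token
    return token


def check_csrf_token(session_id: str, submitted: str) -> bool:
    """Accept the submission only if it carries this session's token."""
    expected = CSRF_TOKENS.get(session_id)
    if expected is None:
        return False
    # Encode to bytes so compare_digest also accepts non-ASCII input.
    return hmac.compare_digest(expected.encode("utf-8"), submitted.encode("utf-8"))
```

A page on another site can make the victim's browser submit a form, but it cannot read the token out of the server's pages, so the forged submission fails the check.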

A summary discipline

The pattern through all of these is: trust comes from the server, not from the client. Anything the client controls — cookies it sets, headers it sends, URL parameters it constructs — can be modified by an attacker who is the client. The trust state has to live somewhere the client cannot touch.

For my own work: I am not currently writing applications, but the friends I help with audits get the same advice. The bad patterns are not new; they are the result of building authentication ad-hoc rather than using a well-tested library. The discipline of using a library — even a simple one — for authentication is more important than any specific implementation choice.

More on the year as it develops. The next post will be a small operational note on a tiny incident I had with my own infrastructure.

