1.2 billion credentials · Peter Bassill

The Hold Security disclosure on Tuesday, that a Russian group they have been tracking has accumulated approximately 1.2 billion username/password pairs from approximately 420,000 websites, has produced more debate about disclosure economics than about the underlying credentials. The credentials themselves are, on Hold's account, a mix of older breach material, new SQL-injection-driven extraction from smaller sites, and various incremental dump aggregations. The figure of 542 million unique email addresses is the operational headline. The structural question that has occupied the security-research community over the past five days is not whether the disclosure is real (it appears to be) but whether the way Hold has chosen to disclose it is operationally helpful or operationally exploitative.

The criticism of Hold's framing is substantive. The disclosure was made in a press release aligned with the New York Times' coverage; Hold simultaneously launched a commercial identification service ("we will tell you for $120 per year whether your accounts are in our archive") and a free, somewhat mocked, service that lets users enter their email addresses to be checked. The combination of "we have a billion credentials" and "we offer a paid service to identify your exposure" produced an uncomfortable resemblance to the security-vendor breach-marketing that has become increasingly common over the past three years. Brian Krebs's coverage has been the right primary source for the disclosure-economics debate. The technical question of whether Hold's archive is real and substantial is now mostly settled in favour of "yes": they have demonstrated samples to journalists, and the wider analyst commentary has been largely persuaded. The question that remains is what the disclosure actually achieves for users whose credentials are in the archive, and whether the commercial framing serves users or the vendor's revenue.

For the engagement work, the post-disclosure conversation has focused on the practical question of how to act on a disclosure of this kind without making the ground-truth situation worse. The honest answer is uncomfortable. For users whose email addresses are in the 542 million, the appropriate response is to assume that whatever password was used at whichever site has been compromised, and to rotate the password at that site and at any other site where the same or a similar password is in use. For organisations whose user bases overlap with the 542 million, the appropriate response is to monitor authentication patterns for credential-stuffing attacks (which I have been writing about since the LinkedIn dump in 2012) and to consider whether forced password resets are warranted. The LinkedIn-style detection content I wrote for the SOC two years ago has been the operational response template for the past week.
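
As a rough illustration of what monitoring authentication patterns for credential stuffing can look like, the sketch below flags source addresses that fail logins against many distinct usernames inside a short window, which separates stuffing from one user repeatedly mistyping a password. This is not the SOC detection content itself: the `LoginEvent` shape, the `flag_stuffing_sources` name, and both thresholds are assumptions chosen for illustration, and real stuffing traffic spread across many source addresses would need more than a per-address count.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class LoginEvent:
    # Hypothetical event shape; real authentication logs will differ.
    timestamp: datetime
    source_ip: str
    username: str
    success: bool


def flag_stuffing_sources(events, window=timedelta(minutes=10), min_distinct_users=20):
    """Return source IPs with failed logins for at least `min_distinct_users`
    distinct usernames inside any sliding `window` (both thresholds assumed)."""
    recent = defaultdict(deque)  # source_ip -> deque of (timestamp, username)
    flagged = set()
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.success:
            continue
        failures = recent[ev.source_ip]
        failures.append((ev.timestamp, ev.username))
        # Drop failures that have aged out of the window.
        while failures and ev.timestamp - failures[0][0] > window:
            failures.popleft()
        if len({user for _, user in failures}) >= min_distinct_users:
            flagged.add(ev.source_ip)
    return flagged
```

Counting distinct usernames rather than raw failures is the design choice that matters here: a single account failing thirty times looks like a forgotten password, while thirty accounts failing once each from the same address does not.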

The wider point, and this is the part that has been preoccupying me for longer than the Hold disclosure deserves, is that the credential-database-as-a-tradeable-commodity category has now reached operational maturity. There are, on the public and private accounts that have surfaced in the past three years, approximately 5 to 10 billion credentials in active circulation through underground markets, with substantial overlap and substantial duplication. The 1.2 billion Hold are reporting is a subset of this larger inventory. The marginal disclosure value of any single new credential aggregation is therefore lower than the headline number suggests; the structural problem is the inventory itself, not any single addition to it.
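
The overlap-and-duplication point can be made concrete with a small sketch: the marginal value of a new aggregation is the number of pairs it contains that are not already in the circulating inventory, not its headline size. The function name, the tuple representation, and the lower-casing of email addresses are all assumptions for illustration, and the data in the usage lines is a toy example rather than anything drawn from Hold's archive.

```python
def marginal_new_credentials(inventory, new_dump):
    """Return (fresh, unique_in_dump): how many distinct pairs in `new_dump`
    are absent from `inventory`, and how many distinct pairs it holds at all."""
    def norm(pair):
        email, password = pair
        # Simplistic normalisation (an assumption): trim and lower-case the email.
        return (email.strip().lower(), password)

    known = {norm(p) for p in inventory}
    dump = {norm(p) for p in new_dump}
    return len(dump - known), len(dump)


# Toy data: four headline rows, three distinct pairs, two genuinely new.
inventory = [("a@example.com", "pw1"), ("b@example.com", "pw2")]
new_dump = [
    ("A@example.com", "pw1"),   # already in the inventory after normalisation
    ("c@example.com", "pw3"),
    ("c@example.com", "pw3"),   # duplicate inside the dump itself
    ("b@example.com", "new"),   # known address, previously unseen password
]
fresh, unique_in_dump = marginal_new_credentials(inventory, new_dump)
```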

For the Hedgehog SOC, the credential-stuffing detection content has been continuously developed since the LinkedIn era, and the post-Hold updates have been incremental rather than structural. The clients whose authentication infrastructure we monitor have, on the past week's data, seen roughly a twenty percent uptick in the volume of credential-stuffing attempts. That is consistent with new tooling being shipped against the freshly public inventory, but it is not, on present evidence, the substantial spike that some commentary had predicted.

The wider piece I have been working on through July, about the operational economics of the underground credential market, is approximately three thousand words at draft stage and will land in late August or early September depending on what else surfaces. The structural argument is straightforward but is harder to land at the engagement-client level than I had initially thought. The argument is that "use unique passwords" advice is operationally insufficient at the institutional scale, that two-factor authentication is the structural answer, and that the cost of two-factor deployment for consumer-facing services has dropped to the point where it is now operationally indefensible not to deploy it. The clients are agreeing with the argument and not deploying. The institutional friction is mostly about user-experience cost. I do not have a clean answer for this beyond "the user-experience cost is lower than the breach cost, and the breach cost is being incurred regardless".

The next post is probably a continuation of the credential-economics piece I am drafting, or whatever falls out of the GameOver Zeus follow-on activity that several of my correspondents are seeing in client networks.