BSides Manchester yesterday. The talk — "What two and a half years of analyst-decision data taught us about SOC alert triage" — went, on the audience-feedback measure, well. Forty-five minutes, three case studies (the playbook-drift detection, the threat-intelligence-integration improvement, the analyst-staffing-forecast capability), and the kind of question-and-answer session afterwards that confirms an audience has engaged with the substantive material rather than just the slide titles.
The talk drew on the USENIX Security 2019 paper material that the lead engineer is presenting in Santa Clara next week. The BSides version was differently shaped — less academic-format, more practitioner-oriented, with operational examples and customer-side outcomes that the academic paper avoided for confidentiality reasons. The audience at BSides Manchester is, on the historical pattern, predominantly UK practitioners running their own SOCs or providing SOC services to UK customers, which is the right audience for the practitioner-oriented version of the material.
Three things from the question-and-answer that are worth recording.
First, the question that came up three times in different forms: how does the analyst-decision-data-driven approach handle the cold-start problem for customers who do not have substantial historical analyst-decision data? The honest answer is that the cold-start case is harder than the warm-start case (where the customer has an existing SOC with analyst-decision history) and that the bootstrap process for a new customer requires either (a) training on aggregated cross-customer data with subsequent customer-specific tuning, or (b) an extended initial period of shadow-mode operation while customer-specific decision data accumulates. Both approaches are operational in the EmilyAI commercial deployment; the cross-customer-aggregate-then-tune approach is the default and works adequately for customers whose alert-rule mix is broadly similar to the population we have aggregated training data for. The atypical-rule-mix cases (industrial-control monitoring, payment-fraud-specific rule sets) require longer shadow-mode periods.
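For readers who want the bootstrap shape rather than the abstraction, here is a minimal sketch of the aggregate-then-tune path with a shadow-mode gate. It is illustrative only: the real feature pipeline, model class, data, and the agreement threshold are not described in the talk or here, and scikit-learn's SGDClassifier simply stands in for whatever the production model actually is.

```python
# Illustrative sketch of cross-customer-aggregate-then-tune bootstrapping.
# All data, the model class, and the 0.9 threshold are invented for the example.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

# Stand-in for aggregated cross-customer analyst decisions:
# rows are alert feature vectors, labels are escalate (1) / close (0).
X_aggregate = rng.normal(size=(5000, 20))
y_aggregate = (X_aggregate[:, 0] + 0.5 * X_aggregate[:, 1] > 0).astype(int)

# (a) Base model trained on the cross-customer aggregate.
model = SGDClassifier(loss="log_loss", random_state=0)
model.fit(X_aggregate, y_aggregate)

# (b) Shadow mode: the model scores live alerts but analysts still decide;
# their decisions accumulate as customer-specific tuning data.
X_customer = rng.normal(size=(200, 20))
y_customer = (X_customer[:, 0] + 0.8 * X_customer[:, 2] > 0).astype(int)

# Measure agreement with the customer's analyst decisions before tuning...
shadow_agreement = (model.predict(X_customer) == y_customer).mean()

# ...then tune the base model incrementally on those decisions rather than
# retraining from cold.
model.partial_fit(X_customer, y_customer)

# Gate promotion out of shadow mode on agreement; 0.9 is a made-up number.
ready_for_advisory_use = shadow_agreement >= 0.9
print(f"shadow agreement {shadow_agreement:.2f}, promote: {ready_for_advisory_use}")
```

The atypical-rule-mix cases mentioned above are, in this framing, the ones where the shadow-mode agreement gate takes longer to clear.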
Second, the question about the ethics and the labour implications of automating analyst decisions. This is a question I have been asked at every venue where I have presented this work, and the answer has, over the past two years, become more substantive rather than more dismissive. The short version: the model assists rather than replaces analyst judgement, the assistance reduces analyst workload on lower-value triage rather than displacing analysts from higher-value work, and the customer organisations that have deployed the model report that, rather than reducing analyst headcount, they have been able to redirect analyst time to detection-engineering and incident-response work that was previously underserved. The longer version is more nuanced — the model is, structurally, a productivity-multiplier capability, and the customer-organisation decisions about whether the resulting productivity is converted into reduced headcount or expanded capability are organisational decisions that the technology does not make. I have been clear about this in customer engagements and in the public talks. The question is a fair one and deserves the substantive answer.
Third, the question about the model's handling of attacker adaptation. If an adversary knows that a customer's SOC is using ML-assisted triage, can the adversary craft attacks to evade the model's classification? The answer in 2019 is "in principle, yes; in operational practice, we have not yet seen evidence of model-aware adversarial behaviour against the EmilyAI deployments". The reasons are partly that the model's decisions are advisory rather than authoritative — a model-evading classification still results in analyst review, so the evasion does not produce silent compromise — and partly that the adversary's attack-development cost-benefit calculation does not, in the typical case, favour customisation against a specific customer's triage tooling. The adversarial-machine-learning research literature (Goodfellow et al. on adversarial examples, ICLR 2015, and the substantial subsequent work) does, however, indicate that this assumption may not hold indefinitely, and the engineering team is investing in adversarial-robustness work as a longer-term concern.
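To make the advisory-rather-than-authoritative point concrete, a toy sketch follows. The queue names, the confidence threshold, and the TriageRecommendation shape are all invented for illustration; the point it demonstrates is structural: whatever the model recommends, the alert lands in a queue that an analyst eventually works through, so a model-evading input degrades prioritisation rather than producing silent suppression.

```python
# Toy sketch of advisory routing: every branch returns an analyst-visible
# queue, so no model output can silently drop an alert. Names and the 0.8
# threshold are illustrative assumptions, not the production design.
from dataclasses import dataclass

@dataclass
class TriageRecommendation:
    alert_id: str
    suggested_action: str  # "escalate" or "close" -- advisory only
    confidence: float      # model score in [0, 1]

def route(rec: TriageRecommendation) -> str:
    """Return the analyst queue an alert lands in; nothing is discarded."""
    if rec.suggested_action == "escalate":
        return "priority_queue"       # analysts look at these first
    if rec.confidence < 0.8:          # made-up threshold
        return "standard_queue"       # uncertain "close" gets full review
    return "batch_review_queue"       # confident "close" is still reviewed

# An adversarially crafted alert that fools the model into a confident
# "close" recommendation still lands in an analyst-reviewed queue.
print(route(TriageRecommendation("evasive-alert-1", "close", 0.97)))
```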
The networking afterwards was useful. Two of the customer-organisation security leads I have known for some years were at the talk, and we had useful conversations about their own SOC tooling considerations. One of them is a current pen-testing-engagement customer and may be a vCISO conversation in 2020; the other runs a security team for an organisation we have not previously engaged with, and that conversation produced a useful introduction. The BSides networking quality continues to be substantially higher than the major-vendor-conference networking quality, and the cost-benefit calculation for these events continues to be favourable.
The travel back was uncomplicated. The personal-blog reader will have to put up with the fact that the BSides post is partly trip-report and partly substantive — that is the structural shape of the practitioner conference circuit and it will be reflected in the writing as it has been before.