AI in cyber: the long view from 2018

Start of a six-weekly series tracking how AI in cyber security is developing through 2024 and beyond — and how each development reads against EmilyAI, the SOC analyst I have been running in production at Hedgehog since 2018.

The AI hype cycle reached cyber security in 2023 with the same enthusiasm it brought to every other industry. Vendors who had never used the word AI in their marketing started to. Boards who had been content with annual cyber updates suddenly wanted AI-specific briefings. Procurement budgets were rebadged, and analysts' day jobs started being framed around a single question: will the AI take it?

I have been running an AI system in production cyber operations since 2018. Her name is EmilyAI, and she is a SOC analyst that handles the tier-two triage work for Hedgehog Security's managed detection and response customers. She is not a chatbot, not a generative model, not a wrapper around a hyperscaler API. She is a domain-specific classifier pipeline running on hardware we own, with deterministic 8-bit inference end-to-end, a hexagonal architecture that decouples her from any particular SIEM or case-management vendor, and a continuous learning loop that turns every analyst-closed case into a labelled training example.
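To make the hexagonal idea concrete, here is a minimal sketch of the ports-and-adapters pattern in Python. It is not EmilyAI's code: the class names, the canonical fields, and the raw vendor field names are all invented for illustration. The only point it makes is that the core sees one schema and never imports anything vendor-specific.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class CanonicalEvent:
    """Hypothetical vendor-neutral event; the field names are illustrative."""
    tenant_id: str
    source: str
    rule_name: str
    severity: int


class SiemPort(Protocol):
    """Inbound port: anything that turns a vendor payload into a CanonicalEvent."""

    def to_canonical(self, raw: dict) -> CanonicalEvent: ...


class SplunkLikeAdapter:
    """Illustrative adapter. The raw keys are invented, not Splunk's real schema."""

    def to_canonical(self, raw: dict) -> CanonicalEvent:
        return CanonicalEvent(
            raw["tenant"], "splunk-like", raw["search_name"], int(raw["urgency"])
        )


class SentinelLikeAdapter:
    """Illustrative adapter. The raw keys are invented, not Sentinel's real schema."""

    def to_canonical(self, raw: dict) -> CanonicalEvent:
        return CanonicalEvent(
            raw["TenantId"], "sentinel-like", raw["AlertName"], int(raw["Severity"])
        )


def triage_core(event: CanonicalEvent) -> str:
    """Stand-in for the analyst core: it only ever sees the canonical schema."""
    return "escalate" if event.severity >= 7 else "close"


def handle(adapter: SiemPort, raw: dict) -> str:
    """Swapping SIEM vendors means swapping the adapter; the core is untouched."""
    return triage_core(adapter.to_canonical(raw))
```

Adding a third SIEM in this sketch means writing one more adapter class; `triage_core` does not change.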

This series is what I see when I read the AI-in-cyber news cycle through that lens. Every six weeks for the next two years or so, I will write about what has happened in the wider AI-and-security world and how it reads against the architectural decisions we made in 2018 and have lived with since.

Why this is worth writing

Three reasons.

The first is that most of the public AI-in-cyber discourse is being written by people who either have not built a production AI system in the security domain, or who are running general-purpose LLM wrappers around someone else's API. Both are legitimate, neither is the only thing the field looks like, and the perspective from a system that has been making real triage decisions on real customer traffic for six years is, I think, useful to record.

The second is that the architectural decisions we made in 2018 (vendor agnosticism via a hexagonal pattern, determinism through quantisation discipline, continuous learning from analyst feedback, structural privacy through tenant-tagged everything) are decisions the rest of the field is now arriving at, in some cases urgently, after the LLM wave has exposed the cost of not having made them earlier. It is worth saying so without smugness, because the field is moving fast and the lessons that look obvious now were not obvious in 2018.

The third is that the comparative frame — here is what the field is doing, here is what we have been doing, here is the difference — is the most useful way I have found to think about new entrants. Some of them are doing things we have not. Some are doing things we already do. Some are doing things we deliberately decided not to. The series will be honest about all three.

What EmilyAI is, in a paragraph

For people who have not encountered her before: EmilyAI is the analyst core of a platform that reads from a customer's SIEM (Splunk, Sentinel, Elastic, QRadar, others) via a connector that translates the vendor's native event format into a single canonical internal schema; runs that schema through a three-stage funnel (deterministic suppression, CPU pre-triage on a quantised distilled model using Intel AMX, full GPU inference on NVIDIA Triton across two L40S accelerators with a champion-and-challenger model layout); produces a structured verdict; writes the verdict into the customer's case-management system via an outbound connector; and closes the loop by ingesting the analyst's eventual disposition as a labelled training example. The platform is deployable as UKCD-hosted SaaS, on a single Dell PowerEdge R760 dropped on-premises at the customer, or as managed bare-metal in a specialist data centre. Everything runs on hardware we own or lease, in jurisdictions we control, with cryptographically chained audit logs and mutual TLS everywhere.
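The shape of that three-stage funnel can be sketched in a few lines of Python. This is an illustration of the control flow only, under stated assumptions: the suppression rule, the thresholds, the `Verdict` fields, and the two model callables are all hypothetical stand-ins, and nothing here touches AMX, Triton, or the champion-and-challenger layout.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# A "model" here is just any callable mapping an event to a score in [0, 1].
Model = Callable[[dict], float]

SUPPRESSED_RULES = frozenset({"known-benign-scanner"})  # hypothetical rule name


@dataclass(frozen=True)
class Verdict:
    """Hypothetical structured verdict; the field names are illustrative."""
    stage: str
    label: str
    score: float


def suppress(event: dict) -> Optional[Verdict]:
    """Stage 1: deterministic suppression by exact rule match, no model involved."""
    if event["rule_name"] in SUPPRESSED_RULES:
        return Verdict("suppression", "benign", 0.0)
    return None


def pre_triage(event: dict, cheap_model: Model, low: float = 0.05) -> Optional[Verdict]:
    """Stage 2: a cheap model closes only the events it scores as clearly benign."""
    score = cheap_model(event)
    if score < low:
        return Verdict("pre-triage", "benign", score)
    return None


def full_inference(event: dict, full_model: Model, threshold: float = 0.5) -> Verdict:
    """Stage 3: the expensive model decides everything that is left."""
    score = full_model(event)
    label = "malicious" if score >= threshold else "benign"
    return Verdict("full-inference", label, score)


def run_funnel(event: dict, cheap_model: Model, full_model: Model) -> Verdict:
    """Each stage either returns a verdict or passes the event down the funnel."""
    verdict = suppress(event)
    if verdict is None:
        verdict = pre_triage(event, cheap_model)
    if verdict is None:
        verdict = full_inference(event, full_model)
    return verdict
```

The design point is cost ordering: each stage is more expensive than the one before it, so the funnel only spends the most expensive inference on events the cheaper stages could not close.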

The architecture is described in detail in our technical due diligence document, which I will reference periodically through this series rather than reprint each time.

What I will not do in this series

A few disclaimers, since the AI conversation tends to attract them.

This is not a sales pitch. I will mention EmilyAI by name throughout because she is the comparator that lets me make specific claims rather than vague ones; readers who want to engage commercially will find the contact route easily enough, but the posts themselves are not asking for that.

This is not an LLM-bashing series. Large language models are remarkable systems and they have real uses in security operations. Several of the systems I will discuss are LLM-based and some of them are doing useful work. The criticism I will offer of vendor framings will be specific to the framings, not to the underlying technology.

This is not a year-zero series. The AI field has been working on security applications for decades. EmilyAI is not the first production AI SOC analyst — there were rule-based and machine-learning systems in commercial use for a long time before her. The series will respect that lineage where it matters.

The schedule

Posts every six weeks. Some will be focused on a specific development — a vendor announcement, a regulatory shift, a research result. Some will be more reflective. Year-end posts in December will be retrospectives. The series will probably run for two years, after which I will reassess whether the field has stabilised enough to make the comparative frame less useful.

What comes next

The next post, in mid-February, is a piece on deterministic inference and why it is the property most easily lost in the LLM era. In the meantime, when you next evaluate an AI security product, ask the vendor whether the same input produces the same output every time. The answer will tell you a great deal.
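If you would rather test that claim than take the vendor's word for it, the check is simple to sketch. The harness below is a minimal illustration, assuming you can call the system repeatedly with an identical payload and get a JSON-serialisable result back; `int_score` and `sampled_score` are toy stand-ins I have made up for a deterministic and a non-deterministic system, not real products.

```python
import hashlib
import json
import random


def output_digest(output: dict) -> str:
    """Hash a canonical JSON form so that any difference in output changes the digest."""
    canonical = json.dumps(output, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_repeatable(infer, payload, runs: int = 20) -> bool:
    """Call `infer` on the same payload `runs` times and compare the digests."""
    digests = {output_digest(infer(payload)) for _ in range(runs)}
    return len(digests) == 1


def int_score(features: list) -> dict:
    """Toy deterministic scorer: fixed integer weights, integer arithmetic only."""
    weights = [3, -2, 5, 1]  # hypothetical weights
    return {"score": sum(w * x for w, x in zip(weights, features))}


def sampled_score(features: list) -> dict:
    """Toy non-deterministic scorer: adds unseeded random noise on every call."""
    return {"score": sum(features) + random.random()}


if __name__ == "__main__":
    print("integer scorer repeatable:", is_repeatable(int_score, [1, 2, 3, 4]))
    print("sampled scorer repeatable:", is_repeatable(sampled_score, [1, 2, 3, 4]))
```

A harness like this only proves repeatability for the inputs and the environment you tried. It is a smoke test, not a guarantee, but a system that fails it has answered the question.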

See you in six weeks.