The hexagonal lesson: vendor agnosticism as structure

Post 4 of the AI series. Most security AI products are anchored to one vendor's platform. EmilyAI was built in 2018 with a hexagonal architecture that decouples the analyst from the SIEM matrix. Six years on, the choice is paying back in a way I did not anticipate.

_Post 4 of the AI in cyber series._

A specific architectural decision we made in 2018 has been paying back, in 2024, in ways I did not anticipate at the time. The decision is the hexagonal pattern — ports and adapters, in Alistair Cockburn's original formulation — applied to the security AI platform. The thesis at the time was that we should never have to rebuild the analyst because a customer used a different SIEM. The benefit in 2024 is much wider than that.

This post is about why architectural agnosticism in security AI is a structural property worth designing for, not an implementation convention. The comparison with the current vendor landscape is more pointed here than in earlier posts.

What the hexagonal pattern actually is

Briefly, for readers not in the architectural-pattern world.

A hexagonal architecture separates the core of a system — the business logic, the decision-making — from the adapters that translate between the core and the outside world. The core knows only one schema, one set of operations, one set of internal types. Adapters on the inbound side translate the outside world into that internal schema; adapters on the outbound side translate the core's output back into whatever the outside world expects. The core does not know which adapter is on the other end. The adapters do not know each other.

The pattern's appeal in security AI is structural. A SIEM-agnostic analyst can read from Splunk, Sentinel, Elastic, QRadar, Sumo Logic, or LogScale because there is one adapter per SIEM that translates each vendor's event format into a single canonical internal schema. The analyst sees only the canonical schema. The analyst does not know which SIEM produced any given event. The connector matrix becomes the product surface area; the analyst's own development is decoupled from it.
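A minimal sketch of that shape, to make it concrete. This is not EmilyAI's code: the `CanonicalEvent` fields, the two vendor payload layouts, and the `siem_a` / `siem_b` names are all invented for illustration. The point is only that translation happens at the edge and the core function takes one type.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Mapping


@dataclass(frozen=True)
class CanonicalEvent:
    """Hypothetical canonical schema: the only event shape the core sees."""
    tenant_id: str
    timestamp: str  # ISO 8601 string
    host: str
    action: str


# One inbound adapter per SIEM. The vendor field names are invented.
def from_siem_a(raw: Mapping[str, str], tenant_id: str) -> CanonicalEvent:
    return CanonicalEvent(tenant_id, raw["ts"], raw["hostname"], raw["event_type"])


def from_siem_b(raw: Mapping[str, str], tenant_id: str) -> CanonicalEvent:
    return CanonicalEvent(tenant_id, raw["eventTime"], raw["device"], raw["activity"])


Adapter = Callable[[Mapping[str, str], str], CanonicalEvent]
ADAPTERS: Dict[str, Adapter] = {"siem_a": from_siem_a, "siem_b": from_siem_b}


def analyse(event: CanonicalEvent) -> str:
    """Stand-in for the analyst core. It takes canonical events only."""
    return "suspicious" if event.action == "login_failure" else "benign"


def ingest(siem: str, raw: Mapping[str, str], tenant_id: str) -> str:
    """Translate at the edge, then hand the canonical event to the core."""
    return analyse(ADAPTERS[siem](raw, tenant_id))


raw_a = {"ts": "2024-05-01T10:00:00Z", "hostname": "web-1", "event_type": "login_failure"}
raw_b = {"eventTime": "2024-05-01T10:00:00Z", "device": "web-1", "activity": "login_failure"}
print(ingest("siem_a", raw_a, "tenant-1"), ingest("siem_b", raw_b, "tenant-1"))
```

Supporting a third SIEM in this sketch means writing one more function and adding one entry to `ADAPTERS`; `analyse` is not touched.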

EmilyAI's hexagonal application

In EmilyAI's case, there are three rings of adapters.

The inbound ring has one connector per supported SIEM. Each connector reads events from its source, translates them into the canonical schema described in our technical due diligence document, and pushes them onto a Redis Stream that the analyst core consumes. Adding a new SIEM is a connector-development job, not an analyst-development job.

The outbound ring has one connector per supported case-management system (ServiceNow, Jira, Cyderes, in-house tools). Each takes the analyst's structured verdict and writes it into the target system, capturing the case identifier the target system assigns so that subsequent activity can be appended rather than duplicated.
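The append-rather-than-duplicate behaviour can be sketched as follows. Everything here is an assumption made for illustration: the `Verdict` shape, the `incident_key` field, the `CaseSystem` port, and the in-memory target that stands in for a real case-management API.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass(frozen=True)
class Verdict:
    incident_key: str  # hypothetical stable key for one incident
    summary: str


class CaseSystem(Protocol):
    """Outbound port: what the connector needs from any target system."""
    def create_case(self, summary: str) -> str: ...
    def append_note(self, case_id: str, note: str) -> None: ...


class InMemoryCaseSystem:
    """Stand-in target; a real adapter would call the vendor's API here."""
    def __init__(self) -> None:
        self.cases: Dict[str, List[str]] = {}

    def create_case(self, summary: str) -> str:
        case_id = f"CASE-{len(self.cases) + 1}"  # the target assigns the id
        self.cases[case_id] = [summary]
        return case_id

    def append_note(self, case_id: str, note: str) -> None:
        self.cases[case_id].append(note)


class OutboundConnector:
    def __init__(self, target: CaseSystem) -> None:
        self._target = target
        self._case_ids: Dict[str, str] = {}  # incident_key -> target case id

    def publish(self, verdict: Verdict) -> str:
        case_id = self._case_ids.get(verdict.incident_key)
        if case_id is None:
            # First verdict for this incident: open a case, remember its id.
            case_id = self._target.create_case(verdict.summary)
            self._case_ids[verdict.incident_key] = case_id
        else:
            # Later activity is appended to the existing case.
            self._target.append_note(case_id, verdict.summary)
        return case_id


target = InMemoryCaseSystem()
connector = OutboundConnector(target)
first = connector.publish(Verdict("inc-42", "Initial triage: likely phishing"))
second = connector.publish(Verdict("inc-42", "Follow-up: mailbox rule found"))
```

In this sketch the id mapping lives in process memory; a production connector would need to persist it so a restart does not reopen cases.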

The interaction ring — added in schema version 1.2 — covers Slack, Microsoft Teams, Discord, Matrix, voice, SMS, and email. These were originally framed as notification surfaces in the outbound ring but were recognised as bidirectional conversational surfaces and promoted to their own ring with dedicated orchestration.

The analyst core sees the canonical schema and nothing else. The customer-visible product is the connector matrix; the analyst's own roadmap is decoupled from it.

Why this is paying back in 2024

When the architecture was designed, the practical benefit was "we can sell to customers with different SIEMs". That benefit has been real. The benefits that have emerged since are, in retrospect, more interesting.

The schema is the documentation. Because the canonical internal schema is the single point of coupling, the schema is also the most useful piece of internal documentation. New engineers learn the system by learning the schema. New connectors are scoped by what the schema requires. Cross-team conversations converge on the schema. The schema versioning discipline (v1.0, 1.1 cross-tenant, 1.2 interaction-and-hunting) gives a clean handle on backward compatibility.

Replay is free. Because the canonical schema is what is persisted in MySQL and Redis Streams, replaying historical events through a new model version is straightforward. We can re-run last week's traffic through a candidate analyst model without going back to the upstream SIEMs. This makes model rollouts dramatically less risky than they would otherwise be.
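As a rough sketch of what a replay comparison looks like, assuming stored events are plain dictionaries and a model is any function from event to verdict string. Both toy models and the `failed_logins` field are invented; real storage access and real models are out of scope here.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Event = Dict[str, str]
Model = Callable[[Event], str]


def replay(
    stored_events: Iterable[Event], current: Model, candidate: Model
) -> List[Tuple[Event, str, str]]:
    """Run both models over stored canonical events; return disagreements."""
    differences = []
    for event in stored_events:
        old_verdict, new_verdict = current(event), candidate(event)
        if old_verdict != new_verdict:
            differences.append((event, old_verdict, new_verdict))
    return differences


# Toy models, purely illustrative.
def current_model(event: Event) -> str:
    return "escalate" if int(event["failed_logins"]) > 10 else "ignore"


def candidate_model(event: Event) -> str:
    return "escalate" if int(event["failed_logins"]) > 5 else "ignore"


last_week = [
    {"id": "e1", "failed_logins": "3"},
    {"id": "e2", "failed_logins": "8"},
    {"id": "e3", "failed_logins": "20"},
]
changed = replay(last_week, current_model, candidate_model)
```

Only the events where the two models disagree come back, which is the set a reviewer needs to look at before a rollout.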

A customer's SIEM churn is a connector swap, not a catastrophe. Customers do change SIEMs (the migration from on-prem Splunk to Sentinel was a meaningful pattern through 2022-23), and when they do, we do not have to rebuild anything. Their previous SIEM connector is replaced with a new one. The analyst is unchanged. Their historical case data is unchanged. The customer's switching cost is in their own ecosystem, not in ours.

The cross-tenant intelligence model is clean. Adding the cross-tenant fleet intelligence (schema v1.1) was a clean extension to the canonical schema rather than a re-engineering of the analyst. The privacy controls that govern cross-tenant data flow are structurally enforced through the schema itself — every record has a tenant identifier, every operation is tenant-filtered, raw events never cross the boundary.
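One way to picture the three properties in that last sentence, as a sketch under stated assumptions. The `Record` fields, the `TenantStore` class, and the choice of "distinct tenants per indicator" as the cross-tenant aggregate are all hypothetical; the actual controls are not described in this post.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Record:
    tenant_id: str   # every record carries a tenant identifier
    indicator: str   # e.g. a file hash or domain
    raw_event: str   # must never leave the tenant boundary


class TenantStore:
    def __init__(self) -> None:
        self._records: List[Record] = []

    def add(self, record: Record) -> None:
        self._records.append(record)

    def query(self, tenant_id: str) -> List[Record]:
        """Every read is tenant-filtered; there is no unfiltered read."""
        return [r for r in self._records if r.tenant_id == tenant_id]

    def fleet_indicator_counts(self) -> Dict[str, int]:
        """Cross-tenant view: distinct tenants per indicator.

        The result holds indicators and counts only, with no raw events
        and no tenant identifiers.
        """
        seen = {(r.tenant_id, r.indicator) for r in self._records}
        return dict(Counter(indicator for _, indicator in seen))


store = TenantStore()
store.add(Record("tenant-a", "evil.example", "raw log line A1"))
store.add(Record("tenant-a", "evil.example", "raw log line A2"))
store.add(Record("tenant-b", "evil.example", "raw log line B1"))
store.add(Record("tenant-b", "other.example", "raw log line B2"))
fleet_view = store.fleet_indicator_counts()
```

The design choice worth noticing is that the store exposes no method that returns records without a tenant filter, so a cross-tenant leak of raw events would require changing the interface, not just forgetting a `WHERE` clause.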

What the wider market did instead

Most of the current wave of AI security tooling does not have this structural property. The pattern is, broadly:

The platform vendor's AI is anchored to the platform vendor's own data and runtime. Microsoft Security Copilot lives inside the Microsoft security platform. CrowdStrike Charlotte lives inside CrowdStrike. The advantage is integration depth; the cost is portability — you do not take Charlotte with you when you leave CrowdStrike.

The independent AI security vendors often connect to multiple platforms, but the integration is per-platform engineering rather than a clean adapter pattern. The architecture is "we have built connectors" rather than "the connector matrix is the product". The distinction matters because the first decays under entropy and the second is durable.

The MSSP-built AI tools tend to be the most vendor-locked of all, because they are built around the specific tooling stack of the MSSP's own delivery.

Why this matters for customers

Three implications.

The platform you choose now constrains your AI choices for the next decade. If you adopt a platform copilot, you have implicitly agreed that AI in your SOC is platform-anchored. Migrating away involves losing the AI investment alongside the platform investment.

Vendor agnosticism is, for an enterprise that runs multiple security platforms, a structural requirement. Most enterprises do run multiple platforms — SIEM, EDR, identity, email, cloud workload protection, network detection. An AI security product that works on one but not the others solves only a slice of the problem.

Architectural agnosticism in the vendor is not a marketing claim; it is a structural property the vendor can demonstrate. Ask to see the schema. Ask how a new SIEM connector gets added. Ask whether the analyst depends on any vendor-specific logic. The answers will reveal whether the agnosticism is structural or aspirational.

What is next

In six weeks: open-source models and the on-prem option. The Llama 3 release in April has changed what is operationally available to security teams that do not want to send their data to a hyperscaler. I will cover the economics, the practical limitations, and which of EmilyAI's capabilities such a model does and does not cover.