The single-tin posture: why we still ship on a Dell

Post 15 of the AI in cyber series. A single Dell PowerEdge R760, racked at the customer site, running the whole platform: analyst, inference, persistence, audit. This is the deployment shape the hyperscaler default would have us abandon, and why we have not.

I have referred throughout this series to EmilyAI's single-tin deployment posture — a single Dell PowerEdge R760 that runs the entire platform, racked either at our co-location facility (UKCD-hosted SaaS) or at the customer's own site (on-premises deployment). The hyperscaler default would have us abandon this in favour of cloud-native multi-region multi-availability-zone deployment with all the trimmings. We have not, and the reasons are interesting enough to deserve their own post.

This is the post for the procurement architect who is being told that cloud-native AI security is the only sensible deployment shape in 2025, and who wants to know what the other side of the argument looks like.

What the single-tin posture is

The hardware is a Dell PowerEdge R760, a two-socket Intel Xeon Scalable server in a 2U form factor, equipped with two NVIDIA L40S accelerators, 1TB of memory, and a fast NVMe storage tier. The exact specification has evolved across the past few years; the principle has not.

On that one machine, the entire EmilyAI platform runs: the inbound and outbound connectors, the streaming pipeline (Redis), the persistent storage (MySQL plus the raw event archive on local NVMe), the inference tier (two L40S GPUs, with champion and challenger models loaded simultaneously), the management interface, the audit chain, the secrets management. The host is Ubuntu 24.04 LTS, hardened. The application tier runs as rootless Podman containers.

Three deployment topologies use this same hardware specification — UKCD-hosted SaaS at our co-location facility, on-premises at the customer site, and managed bare-metal at a specialist provider (Equinix Metal and similar). The configuration management material that deploys the platform is the same in all three topologies; only the location and the connectivity assumptions differ.

Why we have not gone cloud-native

Three reasons, in the order of how often they come up in customer conversations.

Data sovereignty. A significant fraction of UK enterprise security buyers in 2025 cannot, by regulation or by contract, place their security telemetry in a US-headquartered hyperscaler's infrastructure. The Microsoft Azure UK South region is not the same thing as UK sovereign: the parent company is US-domiciled, the CLOUD Act applies, and the residency claim is defensive rather than absolute. For customers who need a guarantee that this data does not leave UK jurisdiction and is not accessible to a US legal process, the hyperscaler is structurally the wrong answer. The single-tin on-premises deployment, on hardware they own, in a facility they control, is structurally the right one.

Cost predictability. Hyperscaler AI inference is metered. Hardware AI inference is paid up-front and runs to depreciation. For a customer who processes high alert volumes — which is most managed-detection-and-response customers — the up-front-hardware option is meaningfully cheaper at scale, and the cost is predictable rather than usage-dependent. We have done the analysis several times with customers; the hyperscaler is the cheaper option below approximately 10,000 alerts per day and the more expensive option above approximately 50,000. Most of our customer base is above the second threshold.
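The shape of that comparison can be sketched as a simple break-even calculation. This is a minimal sketch, not our pricing model: every figure in it (hardware cost, depreciation period, running cost, metered cost per alert) is a made-up placeholder, and the function names are hypothetical. It only shows why a fixed daily cost and a per-alert cost must cross somewhere.

```python
def metered_daily_cost(alerts_per_day: float, cost_per_alert: float) -> float:
    """Usage-based cost: scales linearly with alert volume."""
    return alerts_per_day * cost_per_alert


def owned_daily_cost(hardware_cost: float, depreciation_years: float,
                     annual_running_cost: float) -> float:
    """Fixed cost: straight-line depreciation plus hosting, power and support."""
    annual_cost = hardware_cost / depreciation_years + annual_running_cost
    return annual_cost / 365


def break_even_alerts_per_day(hardware_cost: float, depreciation_years: float,
                              annual_running_cost: float,
                              cost_per_alert: float) -> float:
    """Alert volume at which metered and owned daily costs are equal."""
    fixed = owned_daily_cost(hardware_cost, depreciation_years,
                             annual_running_cost)
    return fixed / cost_per_alert


if __name__ == "__main__":
    # Placeholder inputs for illustration only; not real prices.
    inputs = dict(hardware_cost=90_000, depreciation_years=5,
                  annual_running_cost=18_500)
    per_alert = 0.004
    print("owned, per day:", owned_daily_cost(**inputs))
    print("break-even alerts/day:",
          round(break_even_alerts_per_day(**inputs, cost_per_alert=per_alert)))
    for volume in (10_000, 50_000):
        print(volume, "alerts/day, metered:",
              round(metered_daily_cost(volume, per_alert), 2))
```

Below the break-even volume the metered option costs less per day; above it, the owned hardware does, and its cost no longer moves with volume.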

Operational simplicity. A single piece of hardware, with a deterministic configuration, in a known location, is materially easier to reason about than a distributed multi-region cloud deployment. The blast radius of a fault is bounded by the machine. The recovery path is bounded by the machine. The audit trail is bounded by the machine. For a regulated security workload, the audit and recovery properties of a known-good single physical box are, in my view, structurally superior to those of a distributed deployment.

What we give up

Three things, fairly stated.

Scale-out elasticity. The single-tin posture cannot, by construction, handle sudden 10x bursts in event volume. The hardware is sized for a customer's known steady-state and known peaks. Bursts above that have to be queued: the streaming bus (Redis) has hours of buffer, and the inference tier processes the backlog over the following hour. For most customers most of the time this is fine. For customers with extremely peaky workloads, it would not be.
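The queue-and-drain behaviour is easy to reason about with a little arithmetic. The sketch below is illustrative only: the event rates, burst length and capacity are hypothetical numbers, not measurements from a real deployment, and the function names are my own.

```python
def backlog_after_burst(steady_rate: float, burst_multiplier: float,
                        burst_seconds: float, capacity_rate: float) -> float:
    """Events left queued when a burst ends.

    Rates are in events per second. During the burst, events arrive at
    steady_rate * burst_multiplier and are processed at capacity_rate.
    """
    arrival_rate = steady_rate * burst_multiplier
    return max(0.0, (arrival_rate - capacity_rate) * burst_seconds)


def drain_seconds(backlog: float, steady_rate: float,
                  capacity_rate: float) -> float:
    """Time to clear the backlog once arrivals are back at the steady rate."""
    headroom = capacity_rate - steady_rate
    if headroom <= 0:
        raise ValueError("no spare capacity: the backlog never drains")
    return backlog / headroom


if __name__ == "__main__":
    # Hypothetical sizing: 100 events/s steady state, capacity for 250
    # events/s, and a 10x burst lasting ten minutes.
    queued = backlog_after_burst(steady_rate=100, burst_multiplier=10,
                                 burst_seconds=600, capacity_rate=250)
    minutes = drain_seconds(queued, steady_rate=100, capacity_rate=250) / 60
    print(f"queued events: {queued:.0f}, minutes to drain: {minutes:.0f}")
```

The drain time depends on the headroom between capacity and steady-state load, which is why sizing for known peaks rather than the average matters.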

Multi-region failover. We do not, by default, run a hot standby in a second region. Our recovery model is that the configuration management material redeploys onto replacement hardware within four hours, with the persistent state restored from continuous off-host backup. For customers whose recovery time objective (RTO) is less than four hours, the single-tin posture is the wrong fit; for the bulk of our customer base, four hours is well within the acceptable range.

The platform integrations that come for free in a hyperscaler. A platform deployed in AWS, GCP, or Azure inherits the provider's identity, secret management, key management, and observability stack without extra engineering effort. We run our own equivalents: HashiCorp Vault for secrets, our own audit chain, our own observability stack. The wheel-reinvention argument is real. The counterargument is that the wheel we have reinvented is one we control and can audit; the platform-provided wheel is one we depend on the platform to operate honestly.

What we have kept that the cloud-native shape would lose

Three properties that I think matter more in 2025 than they did when we made the original design decision.

Predictable inference performance. A dedicated GPU we own has known latency characteristics. A shared GPU in a hyperscaler can suffer noisy-neighbour effects from a different customer's workload, which we can neither see nor control. For deterministic INT8 inference with strict latency budgets, our own hardware is the structurally cleaner answer.

Audit-grade physical control. The hash-chained audit log lives on the machine and is continuously shipped off-host to a separate secured location. The machine itself is in a facility whose physical access is documented and audited. The hyperscaler equivalent inherits the hyperscaler's physical access controls, which are robust but which put the hyperscaler's own staff in scope. For some customer threat models, this matters.
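For readers who have not met the term, a hash-chained log is one in which each entry's hash covers the previous entry's hash, so that altering any earlier entry invalidates everything after it. The following is a minimal generic sketch of the technique using SHA-256. It is not EmilyAI's audit chain implementation; the entry layout, the genesis value and the function names are assumptions for illustration.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for "no previous entry"


def _entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical event encoding."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(chain: list, event: dict) -> dict:
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    entry = {"prev": prev_hash, "event": event,
             "hash": _entry_hash(prev_hash, event)}
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    log = []
    append_entry(log, {"actor": "analyst", "action": "login"})
    append_entry(log, {"actor": "analyst", "action": "close_alert"})
    print(verify_chain(log))           # chain is intact
    log[0]["event"]["action"] = "edited"
    print(verify_chain(log))           # tampering is detected
```

Shipping the latest hash off-host is what makes the check meaningful: someone with access to the machine could rewrite the whole local chain, but could not make it agree with the copy held elsewhere.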

The customer's ability to bring it in-house. A customer who wants to move their EmilyAI deployment from UKCD-hosted to on-premises does so by re-racking the same hardware specification at their site and pointing the configuration management at it. The transition is a project but it is bounded. A customer migrating a hyperscaler-native security platform to on-premises is rebuilding the platform.

Where the single-tin posture is structurally wrong

Two specific cases worth flagging.

The single-tin posture is wrong for very large customers whose alert volume cannot be served by a single R760. For these (single multinational customers with hundreds of thousands of devices and tens of millions of daily events) we deploy multiple R760s in a horizontally scaled topology with a partitioning model. The platform supports this. The shape is several single-tin instances, each handling a partition of the customer's workload, rather than a single fundamentally distributed cloud-native deployment. This is, again, deliberate.
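One common way to build such a partitioning model is to route each event by a stable hash of some key, so that everything for a given device always lands on the same instance. The sketch below assumes the device identifier as the partition key and plain hash-modulo assignment. Both are illustrative assumptions, not a description of how the platform actually partitions.

```python
import hashlib


def partition_for(device_id: str, instance_count: int) -> int:
    """Map a device to one of instance_count single-tin instances.

    Uses SHA-256 rather than Python's built-in hash(), which is salted
    per process and so would not give a stable assignment across restarts.
    """
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % instance_count


if __name__ == "__main__":
    # Hypothetical device names, spread over three instances.
    for device in ("laptop-0001", "laptop-0002", "fw-edge-lon-01"):
        print(device, "-> instance", partition_for(device, 3))
```

The trade-off with hash-modulo assignment is that changing the instance count moves most keys to a different instance; a scheme that has to grow without reshuffling would need something like consistent hashing or an explicit assignment table.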

It is also wrong for very small customers, where the R760 is over-specified. A smaller hardware variant exists for those customers; the architecture is the same. We do not offer a hyperscaler-shared-tenant variant and have no current plans to.

The wider point

The hyperscaler default has become so unquestioned in current enterprise architecture conversations that the single-tin posture sometimes needs explaining. The argument for it is not nostalgic. It is structural — for data sovereignty, for cost predictability, for operational simplicity, for audit-grade physical control. For the customer profile that values these properties, the hyperscaler is the structurally wrong answer and the single piece of hardware on the customer's floor is the structurally right one.

This is unfashionable. It is correct.

What is next

In five weeks (slightly off-cadence to align with the autumn regulatory cycle): the Bank of England / FCA / HM Treasury joint statement on frontier AI in financial services, which is expected later this autumn, and what it suggests about how the financial regulator views AI in cyber security specifically.