The default offensive engagement is a red team. Once a year, perhaps. Six weeks of clever people doing clever things, followed by a debrief, a slide deck, and a return to business as usual. It is better than nothing, and it is what most organisations can budget for. But it is not where the durable defensive lift comes from.
Where the durable lift comes from
The durable lift comes from making the dialogue between the offensive and defensive teams continuous, with the same people, on the same estate, against the same agreed scenarios, and from organising the work so that the findings drive a small, accountable backlog of changes that the defensive function executes between engagements. That is what purple means when it stops being a marketing word.
I have seen organisations get a year's worth of remediation effort out of a single quarterly purple cycle, where the previous year's annual red team produced a heroic-looking report and almost no concrete change. The difference is not the talent of the teams. The difference is the loop.
The loop, in concrete terms
A purple cycle, as I run it, has four phases. The first is scenario selection, which is short: usually a half-day workshop where the offensive lead and the defensive lead pick three or four named scenarios from the threat profile, agree the success criteria, and agree the rules of engagement. The scenarios are written in plain English, for example: "We will simulate an externally originated phishing campaign that delivers a credential-theft payload to the finance team, and we will attempt to use the resulting credentials to access the consolidated reporting system." That is the scenario. Everything else is implementation detail.
The second phase is execution. The offensive team runs the scenario at a realistic operational tempo. The defensive team is on its normal duty roster, with no special heightened alert. Crucially, the offensive team narrates its actions in close to real time, into a shared timeline that only the two leads can see while the scenario is running. That timeline is the artefact the next two phases hang on.
The third phase is the joint debrief, which I prefer to run within a week of execution while the actions are still fresh. Both teams sit in the same room and walk the timeline. For each step, the question is: was this seen, was it actioned, and if not why not? The answers go into a shared spreadsheet — a flat one, no fancy tooling — with a column for required change and a column for owner. By the end of the debrief, you have a remediation backlog that both sides agreed to.
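As a rough illustration of how flat that spreadsheet can be, here is a minimal Python sketch. The three questions and the required-change and owner columns come from the debrief described above; the names `DebriefRow`, `build_backlog`, and `to_csv`, and the sample rows, are hypothetical and only there to make the shape concrete.

```python
import csv
import io
from dataclasses import dataclass, astuple, fields


@dataclass
class DebriefRow:
    """One timeline step as walked in the joint debrief (hypothetical layout)."""
    step: str                  # what the offensive team did
    seen: bool                 # was this seen?
    actioned: bool             # was it actioned?
    why_not: str = ""          # if not, why not?
    required_change: str = ""  # the change both sides agreed on
    owner: str = ""            # who is accountable for it


def build_backlog(rows):
    """Keep only the steps that were missed or seen but not actioned."""
    return [r for r in rows if not (r.seen and r.actioned)]


def to_csv(rows):
    """Render rows as a flat CSV string, header first."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([f.name for f in fields(DebriefRow)])
    for r in rows:
        writer.writerow(astuple(r))
    return buf.getvalue()


# Illustrative rows only; not taken from a real engagement.
timeline = [
    DebriefRow("Phishing email delivered to finance team", seen=True, actioned=True),
    DebriefRow("Stolen credentials used against reporting system", seen=True, actioned=False,
               why_not="Alert fired but was not triaged",
               required_change="Route this alert to the on-call queue",
               owner="SOC lead"),
    DebriefRow("Report export from reporting system", seen=False, actioned=False,
               why_not="No logging on export events",
               required_change="Enable and forward export audit logs",
               owner="Detection engineer"),
]

backlog = build_backlog(timeline)
print(to_csv(backlog))
```

The point of keeping it this plain is that the output opens in any spreadsheet tool and nothing about the process depends on the script.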
The fourth phase is remediation tracking, which is the unsexy bit and the bit that most organisations skip. Each item from the spreadsheet has an owner, a date, and a verification step (often a re-run of that step in the next cycle). If an item has not been actioned by the next cycle, the next cycle starts by asking why, in front of both teams. The social pressure of that question is, in my experience, more effective than any project-management tooling.
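The check at the start of the next cycle can be sketched in a few lines of Python. The owner, date, and verification fields mirror the ones described above; `BacklogItem`, `carried_over`, and the sample items and dates are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class BacklogItem:
    """One remediation item from the debrief (hypothetical layout)."""
    change: str
    owner: str
    due: date
    verification: str  # often a re-run of the step in the next cycle
    done: bool = False


def carried_over(items):
    """Items still open at the start of the next cycle, oldest due date first."""
    return sorted((i for i in items if not i.done), key=lambda i: i.due)


# Illustrative items only; dates and owners are made up.
items = [
    BacklogItem("Route credential-abuse alert to on-call queue", "SOC lead",
                date(2024, 3, 1), "Re-run step 2 next cycle", done=True),
    BacklogItem("Forward export audit logs to the SIEM", "Platform owner",
                date(2024, 4, 15), "Re-run step 3 next cycle"),
    BacklogItem("Tune phishing-report triage playbook", "IR lead",
                date(2024, 3, 20), "Tabletop walk-through"),
]

# The agenda for the opening of the next cycle: one question per open item.
for item in carried_over(items):
    print(f"{item.owner}: why is '{item.change}' (due {item.due}) still open?")
```

The script only produces the list of questions. The accountability comes from asking them in front of both teams.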
Why the cultural framing matters
The hardest bit of running a purple programme is not the technique. It is the culture. Red teamers are very used to keeping their tradecraft close to their chest, because surprise is a value in their craft. Blue teamers are very used to being judged on their detections, because metrics are a value in theirs. A purple loop asks both sides to give that up, in exchange for a shared improvement curve.
The signal that you have got the culture right is when the offensive team starts contributing to the detection backlog without prompting, and the defensive team starts asking the offensive team to retest specific scenarios after specific changes have shipped. When that two-way traffic is present, the function is purple. When it is not, you are still running parallel reds with a more cordial debrief.
Resourcing it
Most organisations cannot afford a permanent in-house red team. That is fine. A purple cadence is perfectly executable with an external offensive partner, provided the partner is willing to be a participant in the loop rather than a vendor of reports. Vetting that willingness during the procurement process is critical. The signal I look for is whether the partner can articulate, before the work starts, what successful remediation would look like for a given scenario. If the answer is fluffy, the partnership will be fluffy. If the answer is sharp, you have found someone who will be useful for the long term.
A final, practical observation
The single highest-value purple scenario I have ever run, repeatedly, is this: test the detection of credential abuse from an unusual source IP. It is unglamorous. It is not on any threat-actor brochure. And in roughly half the engagements I have used it in, it surfaces a meaningful gap on the first run. Sometimes the most valuable scenarios are not the ones that look impressive on the cover slide; they are the ones that test the controls you have most quietly come to take for granted.
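To make the control under test concrete, here is a minimal Python sketch of one way such a detection could work. It assumes a baseline of the /24 networks each user has previously logged in from, and flags logins from outside that baseline. This is an illustrative assumption, not a description of any particular product's logic; the function names are hypothetical and the addresses come from the reserved documentation ranges.

```python
import ipaddress
from collections import defaultdict


def network_of(ip, prefix=24):
    """Collapse an IPv4 address to its enclosing network (a /24 by default)."""
    return ipaddress.ip_network(f"{ip}/{prefix}", strict=False)


def build_baseline(history):
    """Map each user to the set of networks seen in past successful logins."""
    baseline = defaultdict(set)
    for user, ip in history:
        baseline[user].add(network_of(ip))
    return baseline


def unusual_logins(baseline, new_logins):
    """Return logins whose source network is not in that user's baseline.

    Users with no baseline at all are flagged too, which is a deliberate
    simplification in this sketch.
    """
    return [(user, ip) for user, ip in new_logins
            if network_of(ip) not in baseline.get(user, set())]


# Illustrative data only.
history = [
    ("alice", "192.0.2.10"),
    ("alice", "192.0.2.77"),
    ("bob", "198.51.100.5"),
]
new_logins = [
    ("alice", "192.0.2.200"),  # same /24 as before: not flagged
    ("alice", "203.0.113.9"),  # new network for alice: flagged
    ("bob", "198.51.100.5"),   # seen before: not flagged
    ("carol", "192.0.2.1"),    # no baseline for carol: flagged
]

flagged = unusual_logins(build_baseline(history), new_logins)
print(flagged)
```

In a purple cycle, the offensive team would log in with valid credentials from a source outside the baseline, and the debrief would ask whether the equivalent alert in the real estate was seen and actioned.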
Run it boringly, run it often, and let the findings compound.