ChatGPT

OpenAI's ChatGPT release on 30 November is the kind of capability inflection that forces a strategic conversation in every adjacent field. The chat interface to a substantially improved GPT-3.5-class language model has, in two weeks of public use, given more people direct exposure to current language-model capability than several years of preceding work managed. The conversation about what language models can and cannot do has shifted, both inside the security industry and outside it.

The capability picture first. By the public and research-community evaluations of the past two weeks, the model is substantially better at instruction-following, coherent multi-turn conversation, code-related tasks, and what is loosely called "reasoning" than the publicly accessible language models of 2021 and earlier. It is not omniscient: it hallucinates confidently and in recognisable patterns, and its outputs do not survive careful reading unchecked. It is, however, good enough to be useful in many domains where review-and-correct is the appropriate workflow.

The security-relevant implications are several, and the security-research community has been working through them over the past two weeks.

First, the offensive use of language models for social-engineering content. The structural concern is that the marginal cost of producing convincing phishing and pretext content is now substantially lower than it was. The aggregate effect on the volume and quality of social-engineering attacks is uncertain in scale but directionally bad for defenders. The customer-portfolio briefing material has incorporated this dimension over the past two weeks.

Second, the defensive use of language models in SOC and detection-engineering work. The opportunity is substantive: analyst-assistant functions, natural-language querying of SOC data, detection-rule generation from operational descriptions, and a range of other detection-engineering tasks. The EmilyAI engineering team has been thinking through how language-model capability should fit into the existing alert-triage architecture; the early conversations point towards additive integration that preserves the existing model's confidence and explainability properties rather than wholesale replacement.
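To make the detection-rule-generation idea concrete, here is a minimal, purely illustrative sketch (not EmilyAI code, and deliberately omitting the actual model API call): a prompt-construction helper that wraps an analyst's plain-English description for a model to draft a Sigma rule, plus cheap sanity checks on the draft. The template, function names, and required fields are my own assumptions for illustration.

```python
# Hypothetical sketch: prompt construction for LLM-assisted detection-rule
# drafting. The template and helpers are invented for illustration; the
# model call itself (OpenAI API or otherwise) is left out on purpose.

SIGMA_PROMPT_TEMPLATE = """You are a detection engineer.
Write a Sigma rule (YAML) for the following behaviour.
Output only the rule, with title, logsource, detection, and level fields.

Behaviour description:
{description}
"""

def build_sigma_prompt(description: str) -> str:
    """Wrap an analyst's plain-English description in a rule-generation prompt."""
    return SIGMA_PROMPT_TEMPLATE.format(description=description.strip())

def validate_draft_rule(rule_yaml: str) -> list[str]:
    """Cheap structural checks before a human reviews the model's draft.

    A generated rule is a starting point, never a deployable artefact:
    every draft still goes through detection-engineering review.
    """
    problems = []
    for required in ("title:", "logsource:", "detection:", "level:"):
        if required not in rule_yaml:
            problems.append(f"missing field {required!r}")
    return problems
```

The point of the shape is the additive-integration posture described above: the model produces a draft, and deterministic validation plus human review sit between the draft and anything operational.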

Third, the broader software-development environment. Code-completion tooling built on language models (GitHub Copilot most prominently) is now substantially mature and in production use across many development organisations. The security implications of model-suggested code (subtle vulnerabilities propagated from training-data patterns, licensing and attribution questions, the aggregate effect on developer-skill development) are not yet well understood and will be a substantive research-and-policy conversation through 2023 and beyond.
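The vulnerability-propagation concern is easiest to see with a classic example. Neither snippet below is drawn from any real suggestion corpus; it simply shows the kind of pattern that is abundant in training data and plausible as a completion, next to the parameterised form that closes the hole.

```python
# Illustrative only: a subtle flaw of the kind a completion model can
# propagate from its training data, and the standard fix.
import sqlite3

def find_user_unsafe(conn: sqlite3.Connection, name: str):
    # String-built SQL, a pattern common in public code: `name` is injectable.
    query = f"SELECT id FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()

def find_user_safe(conn: sqlite3.Connection, name: str):
    # Parameterised query: the driver binds `name` as data, not SQL.
    return conn.execute("SELECT id FROM users WHERE name = ?", (name,)).fetchall()
```

Both functions look equally plausible in an editor, which is exactly the problem: the reviewer's careful-reading-with-correction workflow is the control, and it only works if the reviewer knows what to look for.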

Fourth, the regulatory environment. The EU regulatory framework emerging from the AI Act's progress is going to apply to language-model deployments that fall within the risk categorisations it defines, and customer-organisation programme work will need to incorporate AI-system regulatory considerations into the broader compliance landscape through 2023-2024. Customer conversations about AI regulation were, until recently, fairly abstract; ChatGPT has made them concrete.

A personal strategic note. I have spent more of the past two weeks thinking about the language-model implications for the EmilyAI product roadmap than the year-end plan envisaged. The opportunity to integrate language-model capability into the existing product surface is real, and it is the principal Q1 2023 strategic question: not a wholesale architectural replacement, but specific capability additions for natural-language interaction with SOC data, for explaining model classifications in accessible language, and for assistant-style support of analyst workflows. The team will work through the appropriate integration over Q1, with the v3.2 release as the operational target.

I will write more on this through 2023. The language-model environment is going to be a substantial strategic theme.
