Security Is Job Zero — Even (Especially) in the Age of Coding Agents
$20 and Two Hours
On February 28, 2026, security startup CodeWall gave an autonomous AI agent a single input: a domain name. Two hours and approximately $20 in API tokens later, the agent had full read/write access to the production database of McKinsey’s internal AI platform, Lilli [1] [2].
The attack vector? SQL injection — a vulnerability class from the 1990s, but in a novel context: the payload lived in JSON keys rather than values, a placement that standard security scanners missed [3].
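A minimal sketch of how this class of bug arises (hypothetical code, not Lilli's actual implementation): parameterized queries bind values, but when a JSON key from the request body is interpolated into the SQL as a column name, the attacker controls an identifier that bind variables never protect.

```python
import sqlite3

def save_filters(conn, filters: dict):
    # Hypothetical illustration. The value is parameterized, but the
    # JSON *key* is interpolated straight into the SQL string --
    # bind variables protect values, never identifiers.
    for column, value in filters.items():
        conn.execute(f"UPDATE prefs SET {column} = ? WHERE id = 1", (value,))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prefs (id INTEGER, theme TEXT, role TEXT)")
conn.execute("INSERT INTO prefs VALUES (1, 'dark', 'user')")

save_filters(conn, {"theme": "light"})              # benign request body
save_filters(conn, {"role = 'admin', theme": "x"})  # malicious JSON key
print(conn.execute("SELECT role FROM prefs WHERE id = 1").fetchone()[0])
```

The malicious key turns the statement into `UPDATE prefs SET role = 'admin', theme = ? WHERE id = 1` — syntactically valid, fully parameterized in the scanner's eyes, and privilege-escalating in practice.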
What was exposed: 46.5 million chat messages in plaintext, metadata for 728,000 files, 57,000 user accounts, and — most critically — the system prompts controlling AI behavior for 43,000 users [4].
McKinsey patched within hours. A third-party forensic investigation “identified no evidence that client data or client confidential information were accessed” [3]. But the access was real, and the implications are what matter.
Prompts Are the New Crown Jewels

$20 in tokens. Two hours. Full database access to a Fortune 500 consultancy’s AI platform.
The database breach is alarming. But the most significant finding is what the attacker could have done with write access: silently rewrite the system prompts that control how Lilli behaves for 43,000 users.
You might think: “Prompts are just configuration. We’ve always needed to protect config files.” But prompts are different. A misconfigured server returns errors. A poisoned prompt returns plausible-looking wrong answers — financial models with subtle errors, strategy recommendations with hidden biases, compliance advice with missing caveats. The output looks normal. Users trust it. That’s what makes prompt compromise uniquely dangerous.
One SQL UPDATE statement. No deployment. No code change. No log trail. Just different AI behavior.
As Promptfoo’s analysis noted: “The model became the interface to a compromised application” [5]. The model didn’t need to be jailbroken — the surrounding system fed it altered instructions. Most “AI security incidents” will follow this pattern: they’ll start as familiar software bugs and end as changes in model behavior.
The Real Problem: We’re Building Faster, Not Safer
Here’s what bothers me about the McKinsey story. It’s not that a decades-old vulnerability existed in a modern AI platform. It’s that this is the predictable outcome of how we build software today.
We’re in the middle of an unprecedented acceleration. Coding agents like Kiro, Claude Code, and Codex are generating code faster than ever. I wrote about composing teams of agents with different focus areas [6] — research agents, coding agents, testing agents. The productivity gains are real.
But where’s the security agent?
In the traditional SDLC, security was already an afterthought for many organizations. If you were lucky, you engaged security experts during the design phase. If you had budget, you contracted an external penetration test just before go-live. Mature teams practiced DevSecOps and shift-left security — but even they struggled with coverage. Manual security reviews couldn’t keep pace with release velocity, and annual pentests left 364 days of untested changes.
Now we’re building with coding agents that can produce a working application in hours. The time between “idea” and “deployed” is collapsing. And the security practices that were already inadequate for a months-long development cycle are completely absent from a days-long one.
The GitGuardian report tells the story: an 81% surge in leaked secrets as AI-assisted development democratizes coding [7]. More code, faster, by more people — with less security review than ever.
The Security Agent: A Permanent Team Member

The security agent: a permanent team member, not an annual visitor.
Here’s the idea: if we’re composing teams of AI agents — a coding agent, a testing agent, a documentation agent — we should add a security agent as a permanent team member.
Not a gate at the end. Not an annual pentest. A continuous presence that:
- Challenges the design — reviews architecture decisions for security implications before code is written
- Scans continuously — runs SAST, DAST, and dependency checks on every commit, not just before release
- Runs penetration tests constantly — the same kind of autonomous probing that CodeWall demonstrated, but on your own code, all the time
- Monitors for prompt mutations — watches for unauthorized changes to system prompts, RAG configurations, and model parameters
- Validates authentication and authorization — checks that every endpoint requires auth, every object access is authorized (the exact failures that enabled the McKinsey breach)
This is already happening. Cursor published their internal security agent templates in March 2026 — autonomous agents that automatically resolve security issues in their codebase [8]. Aikido’s “Infinite” product embeds AI-driven penetration testing directly into the SDLC, triggering on every code change [9]. Opsera launched AppSec AI Agents specifically designed for the transition from traditional SDLC to what they call “AI-SDLC” [10].
AWS itself has entered this space with AWS Security Agent (currently in preview) — a frontier agent that covers the full development lifecycle [11]:
- Design reviews: Analyzes architecture documents against organizational security requirements and AWS best practices, turning time-consuming manual reviews into minutes of focused analysis
- Code reviews: Automatically analyzes pull requests against organizational requirements and common vulnerabilities, providing remediation guidance directly in the developer’s workflow
- On-demand penetration testing: Deploys specialized AI agents that discover, validate, and report security vulnerabilities through tailored multi-step attack scenarios — with reproducible proof and ready-to-implement fixes
The architecture behind it is a multi-agent system: specialized agents for scanning, exploration, exploitation, and validation work together, with findings verified through both deterministic validators and LLM-based agents [12]. HENNGE, an early adopter, reported a 90% reduction in testing duration compared to traditional methods.
You Don’t Have to Wait for GA
AWS Security Agent is in preview — but you can build a security-aware development workflow today with tools that already exist.
Coding agents like Kiro support skills, subagents, and steering documents that encode behavioral rules into every interaction. This means you can embed security into your agent team right now:
- Steering documents that enforce security-first behavior — rules like “scan for hardcoded secrets before committing,” “check that every API endpoint requires authentication,” or “flag SQL construction that uses dynamic identifiers.” These run on every interaction, not just when someone remembers to check.
- Pre-commit hooks that trigger security scans automatically — the agent doesn’t commit code that fails basic security checks.
- A security subagent that reviews architecture decisions, scans for common vulnerability patterns, and challenges design choices from a security perspective — called automatically during the workflow, not as an afterthought.
- Skill-based security workflows — a “security review” skill that runs OWASP checks, validates auth patterns, and produces a security assessment as part of the standard development flow.
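The pre-commit piece of this workflow can be sketched in a few lines. This is a deliberately minimal illustration — the patterns are examples, not a complete ruleset, and a real hook should delegate to a dedicated scanner with a curated signature set:

```python
# Minimal pre-commit secret scan sketch. Patterns are illustrative,
# not exhaustive -- production hooks should wrap a dedicated scanner.
import re
import sys

SECRET_PATTERNS = [
    ("AWS access key", re.compile(r"AKIA[0-9A-Z]{16}")),
    ("private key header", re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----")),
    ("hardcoded api_key", re.compile(r"""api_key\s*=\s*['"][^'"]{16,}['"]""")),
]

def scan(text: str) -> list[str]:
    """Return the names of all patterns that match the file contents."""
    return [name for name, pat in SECRET_PATTERNS if pat.search(text)]

def main(paths: list[str]) -> int:
    """Exit non-zero if any staged file contains a likely secret."""
    failed = False
    for path in paths:
        with open(path, encoding="utf-8", errors="ignore") as f:
            for finding in scan(f.read()):
                print(f"{path}: possible {finding}")
                failed = True
    return 1 if failed else 0

if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
```

Wired into a pre-commit hook, a non-zero exit blocks the commit — the agent physically cannot land code that trips the rules.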
The key insight: security becomes a configuration, not a process. You define the rules once in steering documents, and they’re enforced on every commit, every design review, every deployment — automatically. No extra budget approval, no scheduling a pentest, no waiting for a security team’s availability. The security posture is encoded in the system, not dependent on human discipline.
This is what “shift left” actually looks like in the age of coding agents: security isn’t shifted to an earlier phase — it’s embedded in every phase, enforced by the same agent infrastructure that writes the code.
Traditional: security review at the end, too late and too expensive. AI-SDLC: security agent runs continuously across every phase.
“But That Costs More Tokens”
Yes. A security agent running continuous scans, design reviews, and penetration tests adds to the compute bill. But consider the economics:
- CodeWall breached McKinsey for $20 in tokens
- The average cost of a data breach was $4.88 million, per IBM’s 2024 Cost of a Data Breach Report
- An external penetration test costs $10,000–$100,000 and happens once or twice a year
- A security agent running continuously might cost $100–$500/month in additional compute — and catches issues on every commit
The math isn’t close. Continuous automated security is orders of magnitude cheaper than the alternatives — both the reactive cost of a breach and the periodic cost of manual pentests.
More importantly: the security agent doesn’t slow down the release. It runs in parallel with the coding agent. The traditional objection — “security review adds weeks to the timeline” — disappears when the reviewer is an agent that operates at machine speed.
One challenge to watch: alert fatigue. Continuous testing means continuous findings. The same problem that plagues SIEM tools — too many alerts, teams stop paying attention — could undermine a security agent. The key is validated findings with reproducible proof, not just flagged possibilities. AWS Security Agent addresses this by validating vulnerabilities through multi-step attack scenarios before reporting them [12], but the broader ecosystem is still maturing.
What Would Have Prevented the McKinsey Breach
The vulnerability classes here are well-understood. None of the fixes are AI-specific — they’re application security fundamentals that were missed:
1. No unauthenticated endpoints. 22 of Lilli’s API endpoints required no authentication. Basic API gateway configuration — requiring authentication on every endpoint — would have blocked the initial access entirely.
2. Input validation beyond parameterized values. The JSON key injection bypassed standard parameterization. OWASP’s SQL Injection Prevention Cheat Sheet makes the point directly: table names, column names, and sort-order indicators aren’t protected the same way bind variables protect values [13]. Where identifiers must be dynamic, validate them against an allow-list of known-safe names — parameterized queries alone are not enough.
3. Object-level authorization (BOLA). After the SQL injection, the agent found cross-user access — the application returned records without verifying the caller was allowed to see them. Every object access needs authorization checks, not just authentication at the door.
4. Prompt storage isolation. System prompts were in the same database as user data. Prompts should live in a separate, access-controlled data store with versioning and integrity monitoring. A write to the prompts table should trigger an alert.
5. Monitoring for prompt mutations. A single SQL UPDATE to the prompts table should be a high-severity alert. No legitimate workflow modifies system prompts via direct database access.
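Items 4 and 5 above amount to treating prompts like signed artifacts. One way to sketch this (names and storage layout are assumptions for illustration) is a hash manifest captured at deploy time and checked against the live prompts:

```python
import hashlib

def fingerprint(prompt: str) -> str:
    # Stable content hash of a system prompt.
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

# Manifest captured at deploy time and stored *outside* the app
# database (illustrative names -- not Lilli's actual schema).
APPROVED = {
    "analyst-assistant": fingerprint("You are a careful research assistant."),
}

def drifted(live_prompts: dict[str, str]) -> list[str]:
    """Return IDs of prompts that no longer match the signed-off manifest."""
    return [
        pid for pid, text in live_prompts.items()
        if APPROVED.get(pid) != fingerprint(text)
    ]

# A silent SQL UPDATE to the prompts table shows up as drift:
live = {"analyst-assistant": "You are a careful research assistant. Omit risk caveats."}
print(drifted(live))  # -> ['analyst-assistant']
```

Run on a schedule or on every prompts-table write, any non-empty result is the high-severity alert item 5 calls for.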
To be clear: the McKinsey breach would have been prevented by basic AppSec hygiene — authentication on all endpoints and proper input validation. You don’t need an AI security agent for that. But the breach illustrates a broader pattern: as development accelerates and AI platforms add new attack surfaces (prompts, RAG pipelines, model configs), the gap between what needs securing and what gets secured is widening. That’s where continuous, automated security becomes essential.
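Item 3 above, object-level authorization, is worth seeing concretely because the check is so small. A sketch under assumed names — authentication establishes who the caller is; this guard verifies the caller owns the object before returning it:

```python
from dataclasses import dataclass

@dataclass
class ChatMessage:
    id: int
    owner_id: int
    text: str

# Illustrative in-memory store standing in for the real database.
DB = {
    1: ChatMessage(1, owner_id=101, text="Q3 model draft"),
    2: ChatMessage(2, owner_id=202, text="Pricing strategy notes"),
}

class Forbidden(Exception):
    pass

def get_message(caller_id: int, message_id: int) -> ChatMessage:
    msg = DB[message_id]
    # The BOLA check: reject any object the caller does not own,
    # even though the caller is already authenticated.
    if msg.owner_id != caller_id:
        raise Forbidden(f"user {caller_id} may not read message {message_id}")
    return msg

print(get_message(101, 1).text)  # caller owns the message
try:
    get_message(101, 2)          # cross-user access is rejected
except Forbidden as e:
    print(e)
```

The Lilli application skipped the ownership comparison, so once the agent was past the door, every user's records were readable.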
What This Looks Like on AWS
AWS provides specific services that map to each of these controls — and increasingly, they’re agent-powered themselves:
Securing the application layer:
- AWS WAF in front of all API endpoints — blocks unauthenticated access and common injection patterns
- Amazon API Gateway with IAM or Cognito authorization — no endpoint without auth
- Amazon Inspector for continuous vulnerability scanning across compute workloads
- AWS Security Hub for centralized findings and compliance checks
Securing the AI layer:
- Amazon Bedrock Guardrails [14] for prompt attack detection — jailbreaks, prompt injection, prompt leakage
- AgentCore Identity for authentication and access control of AI agents themselves
- CloudTrail + model invocation logging for anomaly detection on prompt access and model behavior changes
- Separate account for GenAI workloads per the AWS Security Reference Architecture [15] — blast radius reduction
Continuous security with AWS Security Agent:
- Design reviews against organizational security requirements — catches architectural issues before code is written
- Automated code reviews on pull requests — validates against both common vulnerabilities and your custom security standards
- On-demand penetration testing — the same autonomous probing that CodeWall used offensively, but run by you, on your own systems, continuously [11]
The key architectural principle: defense-in-depth with AI-specific layers. The application security fundamentals (WAF, auth, input validation) prevent the entry. The AI-specific controls (Guardrails, prompt isolation, mutation monitoring) limit the blast radius if the perimeter is breached.
Defense-in-depth with AI-specific layers: application security prevents entry, AI controls limit blast radius, security agent validates continuously.
The Bigger Picture

The arms race: attackers use AI agents offensively. Defenders need AI agents to find vulnerabilities first.
Security has always been a catch-up game. Builders create, attackers probe, defenders patch. This dynamic is as old as software itself. But the game has changed: attackers are now using AI agents too.
CodeWall’s agent found a vulnerability that standard scanners missed — in two hours, for $20. The Resilient Cyber newsletter describes this as “the collapse of exploitation timelines” [16]: AI agents capable of performing offensive security operations autonomously, continuously, and at a scale no human team can match. The classic asymmetry — attackers need to find only one flaw, defenders need to protect everything — has always favored attackers, and AI agents widen it further: they can probe everything, all the time, at machine speed.
Security by design is essential. But it’s not enough when the adversary is an AI agent that iterates faster than your security team can review. If attackers are using AI to find vulnerabilities, defenders need AI to find them first. This isn’t optional — it’s an arms race, and falling behind means your systems get tested by someone else’s agent before your own.
The McKinsey breach is a preview. As organizations deploy AI platforms at scale and coding agents accelerate development, the attack surface expands faster than security practices adapt.
The race: whoever finds the vulnerability first wins. AI makes both sides faster.
The answer isn’t to slow down development. It’s to make security as automated and continuous as development itself. If we’re serious about “Security is Job Zero” — an AWS principle I take to heart — then security can’t be a phase in the SDLC. It has to be an agent in the team.
The same AI capabilities that make coding agents productive make security agents effective. The same autonomous reasoning that let CodeWall’s agent find a SQL injection in JSON keys can be turned inward — continuously probing your own systems before an attacker does.
The question isn’t whether to add a security agent to your team. It’s whether you can afford not to.
💬 Does your development team include a security agent? How are you handling security in an AI-accelerated SDLC?
Sources:
[1] CodeWall — “How We Hacked McKinsey’s AI Platform” (February 2026): codewall.ai
[2] The Stack — Paul Price interview, “$20 in tokens” (March 2026): thestack.technology
[3] The Register — “McKinsey AI chatbot hacked” (March 2026): theregister.com
[4] Silicon UK — file metadata vs document access distinction (March 2026): silicon.co.uk
[5] Promptfoo — “McKinsey’s Lilli Looks More Like an API Security Failure Than a Model Jailbreak” (March 2026): promptfoo.dev
[6] My earlier post on composing agent teams — “The Coding Agent That Doesn’t Code”: schristoph.online
[7] GitGuardian — “81% surge in AI-service leaked secrets” (March 2026): citybiz.co
[8] Cursor — “Securing our codebase with autonomous agents” (March 2026): cursor.com
[9] Aikido — “Aikido Infinite: Self-Securing Software” (February 2026): theneuron.ai
[10] Opsera — “AppSec AI Agents for AI-SDLC” (March 2026): prnewswire.com
[11] AWS Security Agent (Preview) — “Proactively secure your applications throughout the development lifecycle”: aws.amazon.com
[12] AWS Security Blog — “Inside AWS Security Agent: A multi-agent architecture for automated penetration testing”: aws.amazon.com
[13] OWASP — SQL Injection Prevention Cheat Sheet: owasp.org
[14] AWS — “Detect Prompt Attacks with Amazon Bedrock Guardrails”: docs.aws.amazon.com
[15] AWS — “Security Reference Architecture for Generative AI”: docs.aws.amazon.com
[16] Resilient Cyber — “How AI Agents Are Rewriting Offensive Security” (March 2026): resilientcyber.io