A read-only AI security reviewer
The previous post laid out what subagents are. This one is a walkthrough of one I actually use, day to day, on every change that touches anything sensitive: the security-review subagent.
It’s a single Markdown file. Twenty lines of frontmatter and prompt, plus a folder of rule documents it consults. The whole apparatus fits in a small wrapper repo that I add to the workspace alongside whatever project I’m working on. Once it’s there, asking the parent agent “review this for security” spawns the subagent, which reads the relevant rules, compares the implementation against them, and returns a severity-ranked report with citations.
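Concretely, the wrapper repo looks something like this. The layout is mine, not a required convention — in particular, where the agent definition file lives (`agents/` here) depends on your runtime:
standards-review/
├── agents/
│   └── org-standards-security-review.md
└── standards/
    └── rules/
        ├── csrf-protection.md
        ├── parameterized-db-queries.md
        └── ...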
Here’s what’s inside, why it works, and the choices that distinguish a useful security reviewer from a noisy one.
The definition file
The whole subagent lives in a single file:
---
name: org-standards-security-review
description: >
  Security and privacy review against the org standards in
  standards/rules/. Use when implementing or reviewing auth,
  APIs, data handling, uploads, crypto, sessions, or infra.
  Use proactively for sensitive features.
model: inherit
readonly: true
---
You review work against the org standards in `standards/rules/`.
## When invoked
1. From the parent prompt, infer scope (features, files, or APIs).
2. Identify which topics under `standards/rules/` apply.
   Examples: CSRF and auth tokens for state-changing HTTP;
   parameterized DB queries for SQL; hardcoded-secrets and
   sensitive-data-* for credentials and logs;
   file-upload-validation for uploads.
3. Read the relevant `standards/rules/*.md` files and compare
   the implementation to the written rules. Cite filenames in
   findings.
4. Report by severity (Critical / High / Medium / Low). Each
   finding must reference a rule file.

Do not invent stricter requirements than the written rules.
If the rules are silent or ambiguous, say so explicitly.
That’s it. Twenty-something lines. Let me walk through what each part is doing.
The frontmatter
name: org-standards-security-review
The name is what the parent agent uses to invoke the subagent. It also matters for discovery — when the parent is deciding whether to spawn a subagent for a task, it scans names looking for a match against the user’s intent. “Review this for security” matches security-review cleanly.
description: >
  Security and privacy review against the org standards in
  standards/rules/. Use when implementing or reviewing auth,
  APIs, data handling, uploads, crypto, sessions, or infra.
  Use proactively for sensitive features.
This is the part most people underweight. The description does two jobs:
- It tells the parent agent when to reach for this subagent. The phrases “Use when implementing or reviewing auth, APIs, data handling, uploads, crypto, sessions, or infra” are not decorative. They’re the trigger conditions. If the parent is editing an auth handler, the description matches and the subagent becomes a candidate.
- It tells the user what to expect. When I’m reading my own subagent list six months from now, the description has to remind me what this thing actually does, so I don’t end up with three security-flavored subagents that all kind of do the same thing.
The phrase “Use proactively for sensitive features” is doing more work than it appears to. It tells the parent agent it doesn’t need to wait for me to ask. If the parent is generating code that touches credentials, it should consider spawning the subagent on its own.
model: inherit
Inherit from the parent. The default. Other options exist — you can pin the subagent to a smaller cheaper model for routine review, or to a larger model for deeper analysis — but inherit keeps the surface simple and makes the cost predictable.
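Pinning is a one-line frontmatter change. For instance (the alias is a placeholder — use whatever model identifiers your runtime accepts):
model: small-fast-model   # placeholder alias; this file uses `inherit`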
readonly: true
This is the most important line in the file.
A read-only subagent cannot write files. It can read, search, navigate, summarize — but the result is purely a report. That changes everything about my mental model. I don’t have to review what the subagent is about to do before it does it; I just review what it found. The set of tasks I’ll happily delegate to a read-only subagent is much larger than the set I’d delegate to a writable one. For review work specifically, read-only is the right default.
It also has a security implication for the security reviewer itself: a read-only review subagent cannot accidentally apply a “fix” it thinks is correct. The fix is my decision. The subagent’s job is to surface the finding.
The body — what the subagent actually does
The system prompt is short on purpose. Four numbered steps, plus a guardrail.
Step one: infer scope. The parent’s prompt is the input. The subagent has to figure out what files, features, or APIs are being reviewed. Sometimes this is obvious from the prompt; sometimes the subagent has to ask itself “the user said ‘review the auth changes’ — which files are those?” and find them. Either way, the first action is scoping.
Step two: identify which rules apply. This is where the subagent earns its keep. The rule directory has dozens of files, each covering one topic — CSRF, auth tokens, parameterized DB queries, secrets handling, file uploads, dependency scanning, and so on. The parent agent doesn’t want to load all of those. The subagent — having scoped the change — picks the relevant ones.
The prompt gives examples to anchor this: “CSRF and auth tokens for state-changing HTTP; parameterized DB queries for SQL; hardcoded-secrets and sensitive-data-* for credentials and logs.” These are seeds. The subagent extrapolates from them based on what it sees in the change.
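If it helps to see the shape of that mapping, here it is as a deterministic sketch. This is purely illustrative — the real subagent selects rules by reading the change and reasoning, not by keyword lookup, and the keywords below are my assumptions:
# Illustrative only: approximates rule selection as keyword -> rule files.
# The actual subagent reasons over the diff; this is just the mapping's shape.
RULE_HINTS = {
    "session": ["secure-session-management.md", "auth-tokens-expiration.md"],
    "sql": ["parameterized-db-queries.md"],
    "upload": ["file-upload-validation.md", "validate-all-inputs.md"],
    "secret": ["hardcoded-secrets.md", "sensitive-data-in-logs.md"],
}

def candidate_rules(changed_text: str) -> set[str]:
    """Return the rule files whose trigger keyword appears in the change."""
    lowered = changed_text.lower()
    return {
        rule
        for keyword, rules in RULE_HINTS.items()
        if keyword in lowered
        for rule in rules
    }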
Step three: read the rules and compare. The subagent opens each relevant rule file and compares the implementation. The crucial detail is “Cite filenames in findings.” Every finding the subagent reports has to point at a specific rule it’s citing. That citation is what makes the output verifiable. I can open the rule the subagent referenced, read it myself, and decide whether the finding is correct.
Step four: report by severity. Critical, High, Medium, Low. The format is fixed. Each finding includes the severity, the file in the codebase where the issue is, the rule file being cited, and a short explanation.
The guardrail at the bottom — “Do not invent stricter requirements than the written rules. If the rules are silent or ambiguous, say so explicitly” — is the line that prevents the subagent from going off the rails. Without it, the subagent will sometimes produce findings based on what it generally believes about security, not on what’s actually in the rule files. That’s worse than useless: it undermines the citations and inflates the report with claims I can’t trace back to anything.
The rule directory the subagent reads
The subagent’s behavior is only as good as the rules it consults. The directory looks roughly like this:
standards/rules/
├── auth-tokens-expiration.md
├── csrf-protection.md
├── csp-headers.md
├── dependency-vulnerability-scanning.md
├── error-handling-with-no-silent-failures.md
├── file-upload-validation.md
├── hardcoded-secrets.md
├── input-sanitization.md
├── least-privilege-principle.md
├── parameterized-db-queries.md
├── password-requirements.md
├── predictable-ids.md
├── rate-limiting.md
├── secure-session-management.md
├── security-events-logging.md
├── security-misconfiguration.md
├── sensitive-data-in-client-side.md
├── sensitive-data-in-errors.md
├── sensitive-data-in-logs.md
├── sensitive-data-in-querystring.md
├── ssrf-protection.md
└── validate-all-inputs.md
Each file covers one topic, in plain Markdown, with concrete dos and don’ts. They aren’t novels. Each file is one or two pages: a short statement of the policy, the reasoning, examples of compliant and non-compliant code, common pitfalls.
The subagent loves this format because it can pick up only what it needs. A change to a SQL query loads parameterized-db-queries.md and skips the rest. A file-upload feature loads file-upload-validation.md and validate-all-inputs.md. The subagent’s context stays small, focused, and citable.
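To make that concrete, here is an invented sketch of what a rule file in this format looks like — not the real parameterized-db-queries.md, just its shape:
# Parameterized DB queries

All SQL must use parameterized queries or an ORM's bound parameters.
Never build SQL by string concatenation or interpolation, even from
values you believe are trusted.

**Compliant:** cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
**Non-compliant:** cursor.execute(f"SELECT * FROM users WHERE id = {user_id}")

**Common pitfall:** parameterizing the values but assembling column or
table names from user input.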
(I’ll write more about how the rule directory itself works in tomorrow’s post — including how it’s a git submodule shared across projects, and why making it portable matters.)
What the output actually looks like
A typical run, on a hypothetical small auth change, comes back looking like this:
## Security review — auth.py changes
**Scope:** apps/api/routes/auth.py, apps/api/services/sessions.py
**Rules consulted:** auth-tokens-expiration.md, secure-session-management.md,
hardcoded-secrets.md, security-events-logging.md
## Critical — none
## High
1. **apps/api/services/sessions.py:42** — Session token does not
expire. The current implementation issues tokens with no `exp`
claim. See `auth-tokens-expiration.md` §"All tokens must have
a maximum lifetime." Recommend adding an explicit `exp` based
on the policy in that rule.
## Medium
2. **apps/api/routes/auth.py:18** — Failed login attempts are not
logged. See `security-events-logging.md` §"Authentication
events." Recommend adding a structured log entry on failure,
without the credential value.
## Low
3. **apps/api/routes/auth.py** — Module-level constant
`DEFAULT_REDIRECT` is defined but never used. Not a security
issue; cleanup opportunity. Not cited from any rule.
## Notes
- The rules are silent on session token storage location
(cookie vs localStorage). The implementation uses an
HTTP-only cookie, which aligns with general best practice
but is not explicitly required by the consulted rules.
Notice what this output is doing.
Each finding cites a specific rule file. No finding is “from general best practice” — those are explicitly called out as notes, not findings. That separation is what keeps the report trustworthy.
Severity is honest. A “low” finding is genuinely low. The temptation, with any review tool, is to inflate findings to look thorough. The guardrail in the prompt — “Do not invent stricter requirements than the written rules” — pushes back against that.
The “Notes” section captures ambiguity. Where the rules are silent, the subagent says so. It doesn’t pretend the silence is a finding. It just notes the gap. I read those notes and decide whether to update the rules themselves.
The output is short. No introduction, no recap of the change, no encouragement, no closing pleasantry. Severity headers, numbered findings, citations, notes. Everything else got cut.
What this catches that I wouldn’t
Three categories of finding consistently come back from this subagent that I’d have missed otherwise.
The thing I forgot to do. A token without an expiration. A log call that includes a credential. A query parameter that holds a session ID. Each of these is individually obvious, and I still forget at least one of them in any non-trivial change. The subagent doesn’t get tired.
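A token without an expiration was finding 1 in the report above, and once surfaced, the fix is small. A sketch, assuming the sessions service issues JWTs via PyJWT — the names and the eight-hour lifetime are hypothetical, not from the rules:
import datetime
import jwt  # PyJWT

MAX_TOKEN_LIFETIME = datetime.timedelta(hours=8)  # hypothetical policy value

def issue_session_token(user_id: str, signing_key: str) -> str:
    """Issue a session token with an explicit expiration claim."""
    now = datetime.datetime.now(datetime.timezone.utc)
    payload = {
        "sub": user_id,
        "iat": now,
        "exp": now + MAX_TOKEN_LIFETIME,  # the claim the finding flagged as missing
    }
    return jwt.encode(payload, signing_key, algorithm="HS256")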
The thing the rules say but I don’t remember. Some rules have specific numeric requirements — minimum password length, maximum token lifetime, required headers. I cannot keep all of those in my head. The subagent reads the rule, sees the number, and tells me when the implementation is off.
The thing nobody noticed before. Sometimes a finding the subagent surfaces points at a real gap in the existing code, not the change. Those are the most valuable findings, because the human review process had been letting them through. The subagent has no incentive to politely overlook them.
What it doesn’t catch
It’s not magic.
Logic-level vulnerabilities. A correctly implemented auth flow that has a flawed business-logic premise — “anyone with the URL can upgrade their plan” — won’t show up. The rules don’t cover business logic; the subagent won’t invent rules. Logic review is still on humans.
Design-level issues. “You shouldn’t be storing this in a database at all” is a design conversation. The subagent will check that the storage you chose follows the rules; it won’t tell you the design is wrong.
Things the rules don’t cover. The subagent’s only superpower is faithfully consulting the rules. If the rules are missing a category, the subagent will be silent on that category. The remedy isn’t to ask the subagent to be smarter; it’s to write the missing rule.
Why this works as a subagent specifically
I want to be explicit about why this is a subagent and not a rule, a skill, or a function call.
It can’t be a rule. Rules attach to file edits and steer generation. This subagent reviews finished work; it isn’t generating code. The trigger is wrong.
It could be a skill, but a skill runs in the parent’s context. Reading thirty rule files plus the change under review would saturate the parent’s window. By the time the review was done, the parent would be useless for the rest of the session.
It could be a function call to an external service, but writing one is a much bigger commitment than a Markdown file. And the rules already exist as Markdown documents that the subagent reads natively. Wrapping them in a service is adding a layer for no benefit.
The subagent is the right shape because the work is isolated, read-heavy, specialized, and produces a small structured return. Each of those properties points at the same conclusion: spawn a clean instance with its own context, give it a focused prompt, get a report back.
That’s the whole pattern. The instance isn’t “smarter than the parent.” It’s the same model in a more focused configuration. The configuration is what does the work.