paulund
#ai #claude #agents #sub-agents

Sub-Agents in AI Coding Assistants

Sub-agents are a core architectural pattern in modern AI coding tools such as Claude Code. Rather than handling every part of a task in a single conversation turn, the system breaks work down and delegates each piece to a specialised agent. Each sub-agent has its own focus, its own set of tools, and, in some cases, its own choice of underlying model.

How Sub-Agents Work

When you give Claude Code a high-level instruction -- say, "add a new dashboard page" -- it does not try to do everything at once. Instead, it orchestrates a series of sub-agents, each responsible for a distinct phase of the work:

  1. Exploration -- An Explore agent reads your codebase to understand the existing structure, conventions, and relevant files. It answers questions like "what patterns do the current pages follow?" without modifying anything.
  2. Planning -- A Plan agent takes the information gathered by the Explore agent and designs a concrete implementation approach. It produces a step-by-step plan that you can review before any code is written.
  3. Implementation -- Once you approve the plan, one or more agents carry out the actual work: creating files, writing code, running commands, and executing tests.
  4. Validation -- A Test Runner or Build Validator agent checks that the implementation actually works, running your test suite and flagging any failures.

Each of these agents runs autonomously within its own scope. You do not need to manually hand off work between them -- the orchestrating system manages that.
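The four phases above can be sketched as a simple orchestration loop that feeds each agent's result forward as context for the next. This is an illustrative sketch, not Claude Code's actual implementation: the agent names and the `run` interface are hypothetical.

```python
# Illustrative sketch of sub-agent orchestration -- not Claude Code's
# real implementation. Agent names and interfaces are hypothetical.

class SubAgent:
    def __init__(self, name, handler):
        self.name = name
        self.handler = handler  # callable: context dict -> result

    def run(self, context):
        return self.handler(context)

def orchestrate(task, agents):
    """Run each phase in order, feeding results forward as context."""
    context = {"task": task}
    for agent in agents:
        context[agent.name] = agent.run(context)
    return context

pipeline = [
    SubAgent("explore", lambda ctx: f"found files relevant to: {ctx['task']}"),
    SubAgent("plan", lambda ctx: f"plan based on: {ctx['explore']}"),
    SubAgent("implement", lambda ctx: f"code following: {ctx['plan']}"),
    SubAgent("validate", lambda ctx: "tests passed"),
]

result = orchestrate("add a new dashboard page", pipeline)
print(result["validate"])  # tests passed
```

The key property is that no phase needs to know about the others; the orchestrator owns the hand-offs.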

Common Sub-Agent Types

The following agent types appear frequently in Claude Code workflows:

  • Bash Agent -- Executes shell commands and performs terminal operations such as running builds, installing dependencies, or querying the file system.
  • Explore Agent -- Searches and analyses codebases to understand structure, find relevant files, and gather context. It does not modify anything.
  • Research Agent -- Gathers information from external sources, documentation, APIs, or the web to inform decisions.
  • Plan Agent -- Designs implementation strategies and breaks complex work into concrete, ordered steps.
  • Test Runner -- Executes tests and reports results, flagging failures for the developer or the orchestrating agent to address.
  • Build Validator -- Compiles code and validates that the build succeeds before changes are considered complete.
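One way to think about these roles is as tool allow-lists: each agent type only gets the tools its job requires, which is what makes an Explore agent read-only by construction. The mapping below is a hypothetical sketch; the tool names are illustrative, not Claude Code's actual tool identifiers.

```python
# Hypothetical per-agent-type tool allow-lists. Tool names are
# illustrative, not Claude Code's actual tool identifiers.
AGENT_TOOLS = {
    "bash": {"run_command"},
    "explore": {"read_file", "search"},          # read-only by design
    "research": {"web_fetch", "read_file"},
    "plan": {"read_file"},
    "test_runner": {"run_command", "read_file"},
    "build_validator": {"run_command"},
}

def can_use(agent_type: str, tool: str) -> bool:
    """Return True only if the tool is on the agent's allow-list."""
    return tool in AGENT_TOOLS.get(agent_type, set())

# An Explore agent cannot execute anything:
print(can_use("explore", "run_command"))  # False
print(can_use("bash", "run_command"))     # True
```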

When Are Sub-Agents Useful?

Sub-agents become particularly valuable when the task at hand is too large or too complex for a single pass. A few common scenarios:

  • Multi-file features -- When a new feature touches several files across different layers of the application, sub-agents can explore, plan, and implement each piece in the right order.
  • Research-heavy work -- When you need to understand how something works before you can build it, an Explore or Research agent can gather that context first, saving you time.
  • Code review -- A review workflow can dispatch one agent to fetch the pull request, another to run static analysis, and a third to reason about code quality -- each with an appropriate model for its job.
  • Long-running tasks -- Tasks that would take many manual steps (such as setting up a new service or refactoring a module) benefit from being broken into smaller, autonomous chunks.

Model Selection Within Sub-Agents

One of the advantages of the sub-agent model is that each agent can use a different Claude model depending on the complexity of its task. An Explore agent doing a straightforward file search might use Haiku for speed, whilst a Plan agent reasoning about architecture might use Opus for deeper analysis. This keeps workflows fast and cost-effective without sacrificing quality where it matters.
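In configuration terms, this amounts to a per-agent model mapping with a sensible fallback. The tier names below (Haiku, Sonnet, Opus) follow the paragraph above, but the mapping itself is an illustrative sketch, not a Claude Code default.

```python
# Sketch of per-agent model selection. The specific assignments are
# illustrative, not a Claude Code default configuration.
MODEL_FOR_AGENT = {
    "explore": "haiku",    # fast, cheap: file searches and lookups
    "bash": "haiku",
    "plan": "opus",        # deeper reasoning about architecture
    "implement": "sonnet",
    "test_runner": "haiku",
}

def pick_model(agent_type: str, default: str = "sonnet") -> str:
    """Choose a model tier for an agent, falling back to a default."""
    return MODEL_FOR_AGENT.get(agent_type, default)

print(pick_model("explore"))  # haiku
print(pick_model("review"))   # sonnet (fallback)
```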

Sub-Agents vs. Single-Turn Prompts

It is worth understanding the distinction between a single-turn prompt and a sub-agent workflow:

| Approach | Best for | Limitation |
| --- | --- | --- |
| Single-turn prompt | Small, well-defined tasks | Struggles with tasks that require exploration or multi-step reasoning |
| Sub-agent workflow | Complex, multi-phase tasks | Slightly more overhead to set up, but handles large tasks far more reliably |

For most non-trivial development work in 2026, sub-agent workflows are the default. They give you better visibility into what the AI is doing at each stage, make it easier to course-correct, and produce more consistent results than asking a single model to do everything in one go.

Creating Custom Agents for Your Projects

You can define custom agents in your .claude/agents/ directory to automate workflows specific to your project. Each agent is a YAML-frontmatter markdown file that describes:

  • Purpose: What this agent does and when to use it
  • Capabilities: Which tools and skills it has access to
  • Workflow: Step-by-step process it follows
  • Output: What it delivers (files saved, commands run, etc.)
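A minimal agent file might look like the sketch below. The frontmatter fields shown (`name`, `description`, `tools`, `model`) follow Claude Code's documented sub-agent format, but the specific agent and its values are invented for illustration.

```markdown
---
name: code-reviewer
description: Reviews changed PHP files for type safety and docblock issues. Use after implementation, before tests are written.
tools: Read, Grep, Glob
model: sonnet
---

You are a focused code reviewer.

## Purpose
Review the provided files and report issues only -- do not fix them.

## Workflow
1. Read each changed file.
2. Check type safety, docblocks, and single responsibility.
3. Return a numbered list of issues, or "No issues found."

## Output
A numbered issue list in the conversation -- no files are modified.
```

Note that the `tools` line doubles as an allow-list: this agent can read and search but never write or execute.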

Agent Design Principles

Keep agents focused -- Each agent should handle one primary responsibility. An agent for code review should do code review, not also run tests and push to main.

Reference skills, don't duplicate -- If your agent needs writing guidelines, version control practices, or testing conventions, store those in skills and reference them. This keeps agents lean and makes guidelines reusable.

Define clear input and output -- Specify exactly what the agent reads (file paths, arguments, environment state) and what it produces (files created, state changes, side effects). This makes agents composable and debuggable.

Use appropriate models -- Lightweight agents doing straightforward tasks (file searches, command execution) can use Haiku for speed. Complex reasoning tasks (architecture design, code review) benefit from Opus or Sonnet.

Provide examples in descriptions -- Include concrete examples of when and how to invoke the agent. This helps both humans and orchestration systems use it correctly.

Writing Effective Agent Prompts

The difference between an agent that works reliably and one that produces inconsistent results is almost always the prompt. Not the model, not the tools, not the infrastructure -- the clarity and structure of the instructions.

A reliable agent prompt needs four things. A fifth -- persona -- is occasionally useful but no longer essential with current models.

1. Context

What does the agent have access to? What does it know about the environment it's operating in?

This is the most important element. Modern models infer a lot from good context -- the appropriate reasoning style, the level of strictness, even what kind of expert perspective to take. Be specific about your project, conventions, tools, and constraints. Generic context produces generic output.

**Context:**
- You are working in a Laravel 12 application using PHP 8.3
- Testing framework: Pest 3. All tests use snake_case methods.
- Code style: PSR-12 enforced via Laravel Pint
- Static analysis: PHPStan level 5 via Larastan
- Existing services follow a single-action invokable class pattern

2. Task

One specific thing to do. Not two things. Not "do X and also Y." One task.

This is the element people most often get wrong. Combining tasks in a single agent prompt degrades quality across all of them. If you need both implementation and tests, ask for implementation first, review it, then ask for tests. Or use separate agents for each.

**Task:** Review the provided service class for the following and report issues only -- do not suggest fixes:
- Type safety and PHPStan compliance at level 5
- Missing or incorrect docblocks
- Violations of the single responsibility principle
- Potential null reference issues

3. Output format

Exactly what you want back and in what form. If the agent can return its output in multiple ways -- prose, JSON, markdown, a list -- specify which one. If you need a specific structure, define it.

Unclear output format instructions produce inconsistent output, which makes downstream processing harder.

**Output format:**
Return a numbered list of issues found. For each issue:
- Line number or method name where the issue occurs
- Issue category (type safety / docblock / SRP / null reference)
- Brief description of the specific problem

If no issues are found, return: "No issues found."
Do not include suggestions, explanations, or context -- issues only.

4. Constraints

What should the agent not do? What's out of scope? What should it refuse or flag rather than handle?

Constraints prevent scope creep and catch the common failure modes where an agent helpfully goes beyond its brief in ways that create problems.

**Constraints:**
- Do not modify any files -- report only
- Do not suggest architectural changes outside the reviewed class
- If you encounter code you cannot evaluate (e.g., a macro or facade not in context), flag it rather than guessing
- Do not add inline comments to the code

5. Persona (optional)

In 2026, current models infer the appropriate expert perspective from context and task alone. Telling a model "review this for PHPStan level 5 compliance" already implies what kind of reviewer it should be. You don't need to spell it out separately.

Where persona still helps is when you want to deliberately restrict the model's angle -- a security reviewer who ignores UX concerns, a junior-friendly writer who avoids jargon. In those cases, a short persona line is worth adding. Otherwise, skip it.

**Persona:** You are a strict security reviewer. You do not comment on code style,
performance, or architecture unless they have a direct security impact.

Common Prompt Mistakes

Too many tasks in one prompt -- The most frequent mistake. "Review this code, fix any issues, write tests, and update the documentation" sounds efficient. In practice, it produces mediocre output on all four tasks. A specialist doing one thing well beats a generalist doing four things adequately.

Underspecified context -- The model doesn't know your codebase. It doesn't know your conventions, your tech stack version, your team's preferences, or your project's constraints unless you tell it. A prompt that says "review this code" will produce generic advice. A prompt that says "review this code against our Pest 3 testing conventions, PHPStan level 5, and the single-action controller pattern we use throughout" will produce specific, actionable feedback. Context is where most of the quality difference comes from.

No output format -- If you don't specify what you want back, you get whatever the model defaults to. Sometimes that's useful. Often it's a flowing prose explanation when you wanted a list, or a list when you wanted prose. Specify the format.

Missing constraints -- Without constraints, agents tend to be helpful in ways you didn't ask for. They fix things you only wanted reported, suggest architectural changes when you wanted a narrow review, or add explanations when you wanted raw output. Constraints are cheap to write and save a lot of cleanup.

When Prompts Aren't Enough

Well-structured prompts solve the clarity problem. They don't solve every problem.

If your agent consistently fails on a specific type of task despite clear prompts, the issue might be a model capability limitation -- some tasks require reasoning or knowledge the model doesn't reliably have. In that case, consider breaking the task down further or providing reference examples (few-shot prompting).
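Few-shot prompting here simply means embedding a handful of worked examples in the prompt itself so the model can pattern-match rather than reason from scratch. A hypothetical sketch, following the bold-label prompt style used above:

```markdown
**Task:** Classify each commit message as `feat`, `fix`, or `chore`.

**Examples:**
- "Add password reset endpoint" -> feat
- "Correct off-by-one in pagination" -> fix
- "Bump dependencies" -> chore

**Input:** "Handle null user in session middleware"
```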

If the agent produces inconsistent results across runs, the prompt is probably under-constrained -- there are multiple valid interpretations of the instructions, and the model picks differently each time. Tighten the output format and constraints.

If the agent works well in isolation but poorly in a multi-agent pipeline, the handoff definitions need work. The output format of one agent needs to match the input expectations of the next.
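One way to make those handoffs reliable is a shared schema that the producing agent validates before emitting and the consuming agent re-validates before acting. The field names below are hypothetical; the point is that both sides agree on one structure.

```python
# Sketch of a shared handoff schema between a review agent (producer)
# and a fix agent (consumer). Field names are hypothetical.
import json

ISSUE_FIELDS = {"location", "category", "description"}

def emit_issues(issues):
    """Producer side: serialise issues, refusing malformed entries."""
    for issue in issues:
        if set(issue) != ISSUE_FIELDS:
            raise ValueError(f"bad issue shape: {sorted(issue)}")
    return json.dumps(issues)

def consume_issues(payload):
    """Consumer side: parse and re-validate before acting."""
    issues = json.loads(payload)
    for issue in issues:
        missing = ISSUE_FIELDS - set(issue)
        if missing:
            raise ValueError(f"missing fields: {sorted(missing)}")
    return issues

payload = emit_issues([
    {"location": "UserService::find", "category": "null reference",
     "description": "Possible null return not handled"},
])
print(len(consume_issues(payload)))  # 1
```

Validating at both ends means a drifting prompt on either side fails loudly at the boundary instead of silently corrupting the pipeline.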

Example: Content Creation Agents

A practical example from Paulund: three agents for managing social content.

LinkedIn Content Drafter focuses on one responsibility: turning raw ideas into LinkedIn posts. It:

  • Reads ideas from content/ideas/
  • References two skills: LinkedIn style guide and professional brand guidelines
  • Saves output to content/drafts/ with a -linkedin.md suffix
  • Uses a clean workflow: read idea -> determine format -> write content -> save

X Content Drafter does the same for Twitter/X:

  • Reads from content/ideas/
  • References the X style guide
  • Saves to content/drafts/ with an -x.md suffix

Tech Article Writer creates blog articles:

  • Reads from content/ideas/
  • References blog style guide and technical writing principles
  • Saves to content/drafts/ with a -blog.md suffix

Each agent is focused, references reusable guidance (skills), and produces well-defined outputs. This makes them easy to invoke, understand, and maintain.