
You're building a multi-agent system in Claude Code, and you've realized something crucial: not every agent needs the same brainpower. Some tasks need Opus-level reasoning; others run faster and cheaper on Haiku. Some agents should wield every tool in the toolkit; others need safety guardrails. This is where agent frontmatter becomes your secret weapon.
Agent frontmatter—the YAML metadata sitting at the top of every .claude/agents/ file—is where you define the personality, capabilities, and constraints of each subagent. Get it right, and you unlock cost-effective, focused, reliable automation. Get it wrong, and you're bleeding tokens and introducing safety vulnerabilities. This article shows you exactly how to get it right. By the end, you'll understand how to configure agents that are fast, cheap, and reliable.
Table of Contents
- Why This Matters: The Hidden Cost of Poor Agent Configuration
- What Is Agent Frontmatter?
- Frontmatter Fields: Complete Reference
- `name` (Required)
- `description` (Required)
- `model` (Required)
- `tools` (Required)
- The System Prompt Body: Where Behavior Lives
- Advanced Patterns: Multi-Agent Orchestration
- Real-World Deployment: Configuration Checklist
- Monitoring and Iteration
## Why This Matters: The Hidden Cost of Poor Agent Configuration
Agent configuration seems like a detail—just fill in some YAML fields and move on. But this detail compounds into major cost and reliability issues that affect every operation of your agent system.
An agent running on Opus when Haiku would suffice costs many times more per token (at the rates quoted later in this article, roughly 19x more on input). Suppose the overspend works out to $0.10 per request: multiply that by 1,000 requests per day, and you're spending $100/day on unnecessary compute. Over a month, that's $3,000 in wasted spend; over a year, $36,000. The opportunity cost is staggering: better configuration choices could have kept that money for other projects.
Beyond the pure token economics, there's a reliability dimension that most teams miss. An agent with overly broad tool access can cause damage. An agent with permission to execute bash, write files, and access the web is one misunderstanding away from deleting files or leaking secrets. An agent with unclear instructions produces inconsistent results. You get output that's sometimes useful, sometimes wrong, requiring human review every time.
The psychological impact on your team matters too. When agents produce unpredictable results, teams lose confidence. "The agent broke something last time. Let me manually check its output." Suddenly, automation adds friction instead of removing it. The agent was supposed to save time but now requires supervision. Trust erodes. People stop using the agent. The entire automation investment becomes a sunk cost.
Beyond cost and reliability, there's the velocity impact. Agents configured correctly accelerate your workflow. They become trusted partners. You spawn them, collect their output, and move forward. Agents configured poorly slow you down. They require babysitting. Code review becomes paranoid: "Did the agent do something weird?" You end up spending more time managing agents than the time they saved.
The investment in getting frontmatter right—understanding model tradeoffs, defining tool boundaries, writing clear instructions—pays dividends every time that agent runs. For agents running hundreds or thousands of times, this investment compounds into massive returns. A $200 investment in configuring an agent correctly could save $10,000 annually in token costs and support overhead.
## What Is Agent Frontmatter?
Every agent in Claude Code starts with structured metadata. Think of it as the agent's job description, skill set, and operating license all rolled into one. Here's the basic anatomy:
```yaml
---
name: "Agent Display Name"
description: "What this agent does and when to use it"
model: "haiku"
tools: ["Read", "Write", "Bash"]
---
System prompt body starts here. This is where you define the agent's behavior, reasoning style, and specialty...
```
The frontmatter is everything between the triple dashes (---). The body is the system prompt: the narrative instructions that shape how the agent thinks and acts. Both matter. Both require thoughtful design.
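To make the anatomy concrete, here's a minimal sketch of how such a file splits into frontmatter and body. This is a hypothetical helper for illustration, not how Claude Code actually parses agent files, and it handles only the simple `key: "value"` and `key: [...]` lines shown above (a real parser would use a YAML library):

```python
def split_agent_file(text: str) -> tuple[dict, str]:
    """Split an agent file into (frontmatter dict, system prompt body).

    Sketch only: handles flat `key: "value"` and `key: [...]` lines,
    not full YAML.
    """
    _, fm, body = text.split("---", 2)  # frontmatter sits between the first two ---
    meta = {}
    for line in fm.strip().splitlines():
        key, _, value = line.partition(":")
        value = value.strip()
        if value.startswith("["):  # e.g. tools: ["Read", "Write"]
            items = value.strip("[]").split(",")
            meta[key.strip()] = [i.strip().strip('"') for i in items if i.strip()]
        else:
            meta[key.strip()] = value.strip('"')
    return meta, body.strip()

meta, body = split_agent_file('''---
name: "Agent Display Name"
model: "haiku"
tools: ["Read", "Write", "Bash"]
---
System prompt body starts here.''')
print(meta["model"])  # haiku
print(meta["tools"])  # ['Read', 'Write', 'Bash']
```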
This article is your exhaustive reference for getting frontmatter right. We'll cover every field, every valid value, cost-performance tradeoffs, safety patterns, and real-world examples. By the end, you'll build agents that are fast, focused, and reliable.
## Frontmatter Fields: Complete Reference
Let's walk through every field you can set, what it does, and when you need it. Understanding each field deeply will help you make good configuration decisions.
### `name` (Required)
Type: String
Purpose: Display name for the agent (used in logs, CLIs, team dashboards)
Rules:
- Must be unique across your `.claude/agents/` directory
- Keep it concise but descriptive (20-50 chars ideal)
- Use title case
- No special characters except hyphens
Example:
```yaml
name: "Code Analyzer"
name: "Documentation Generator"
name: "Test Engineer"
```
This field is what you see when you ask Claude Code to list available agents. It's also what appears in logs when the agent runs. Make it clear enough that a human can glance at a log and know which agent was involved. When you have fifty agents running across a complex pipeline, clear naming becomes critical for operational understanding. It's the difference between seeing "Agent-12-b73f" in a log and "Python Code Analyzer" in a log.
Pitfall: Don't use generic names like "Agent 1" or "Helper." Your future self will thank you when reading logs. Also avoid names that are too specific to implementation details—"Regex Pattern Matcher v2" is worse than "Email Validator" because the implementation detail (regex) might change but the task (email validation) doesn't. Names should reflect purpose, not implementation.
Real-world scenario: In a large system with 50+ agents, clear naming becomes critical. When you're debugging a production issue at 2 AM and the logs show "Agent XYZ ran for 45 seconds," you need to know immediately what XYZ does. "Python Code Analyzer" tells you everything. "Processor-B" tells you nothing and forces you to hunt through documentation.
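The naming rules above are easy to enforce mechanically. Here's a sketch of such a check (a hypothetical helper, not part of Claude Code; the title-case test is a loose heuristic so names like "JSON Validator" still pass):

```python
import re

def check_agent_name(name: str, existing: set[str]) -> list[str]:
    """Return a list of problems with a proposed agent name (empty = OK)."""
    problems = []
    if name in existing:
        problems.append("must be unique across .claude/agents/")
    if len(name) > 50:
        problems.append("keep it concise (20-50 chars ideal)")
    if not re.fullmatch(r"[A-Za-z0-9 -]+", name):
        problems.append("no special characters except hyphens")
    # Loose title-case heuristic: every alphabetic word starts uppercase
    if not all(w[0].isupper() for w in name.split() if w[0].isalpha()):
        problems.append("use title case")
    return problems

print(check_agent_name("Python Code Analyzer", {"Test Engineer"}))  # []
print(check_agent_name("agent_1!", {"Test Engineer"}))  # two problems flagged
```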
### `description` (Required)
Type: String
Purpose: Human-readable explanation of what the agent does, when to use it, and its specialty
Rules:
- 1-3 sentences maximum
- Answer: What problem does it solve? When should it run?
- Be specific about scope (e.g., "write Python unit tests" vs. "write code")
- Include any prerequisites or assumptions
Example:
```yaml
description: "Analyzes Python codebases for performance bottlenecks and suggests optimizations. Requires access to source files and build logs. Best for repos with 50K+ LOC."
description: "Generates Markdown documentation from TypeScript/JavaScript JSDoc comments. Filters for public API only. Fast, focused, zero customization."
description: "Validates JSON schema compliance and reports violations. Stateless. Safe to run in parallel on large datasets."
```
Good descriptions help you (and your team) choose the right agent for a task. A vague description leads to agent misuse. When you're orchestrating multiple agents, you need to know at a glance which one is suited for the job. A description should answer: "Can this agent do what I need?" in one glance.
Pitfall: Don't include the agent's reasoning process in the description—just what it does and why you'd use it. Don't say "This agent thinks carefully about edge cases and uses deep reasoning." Say "This agent identifies performance bottlenecks in Python code by analyzing algorithm complexity and I/O patterns." Be concrete about the work, not the thinking.
Gotcha: If your description is wrong or misleading, people will use the agent for the wrong tasks. I've seen teams waste hours because they thought an agent could do something, the description suggested it could, but the actual implementation couldn't. Be precise. Test your descriptions with people who haven't seen the agent before.
### `model` (Required)
Type: String (enum)
Valid Values: "haiku" | "sonnet" | "opus"
Purpose: Which Claude model runs this agent
This is critical. Your choice here directly impacts:
- Cost (Haiku roughly a quarter the price of Sonnet; Opus about 5x Sonnet)
- Speed (Haiku fastest, Opus slowest)
- Reasoning depth (Haiku good, Sonnet very good, Opus excellent)
- Context window (all equal at 200K; reasoning capability differs)
The model choice is the single most important configuration decision you'll make. It drives both cost and quality. Choose wrong and you're either overpaying for capability you don't need or underpaying and getting poor results.
Model Selection Matrix:
| Task | Recommended | Why |
|---|---|---|
| Simple parsing, formatting, regex | Haiku | Fast, cheap, sufficient |
| Code generation, bug fixes, analysis | Sonnet | Good balance of capability and cost |
| Complex reasoning, multi-step problems | Opus | Reasoning depth matters |
| Fact-checking, validation | Haiku | Pattern matching, no creativity needed |
| Content creation, writing | Sonnet or Opus | Quality matters more than speed |
| Summarization, extraction | Haiku | High volume, low cognitive load |
| Architectural decisions, design review | Opus | Complex tradeoffs require deep thinking |
| Parallel validation tasks | Haiku | Cost efficiency for batch operations |
Real-World Example:
You're building a system with five agents. Let's think through model selection:
- Code Formatter → Haiku (deterministic, rule-based, no reasoning needed)
- Bug Detector → Sonnet (needs reasoning but not rare edge cases)
- System Architect → Opus (complex design decisions, multiple tradeoffs)
- Test Generator → Sonnet (balancing coverage and readability, moderate complexity)
- Documentation Scraper → Haiku (just extracting and formatting, no creativity)
By matching models to task complexity, you cut costs substantially (often 40% or more) while maintaining quality. Haiku handles two of the five agents, Sonnet two, and Opus just one. Your average cost per agent stays low because the expensive model runs only where deep reasoning pays for itself.
Cost Math:
As of early 2026:
- Haiku: ~$0.80 per million input tokens, ~$4 per million output tokens
- Sonnet: ~$3 per million input tokens, ~$15 per million output tokens
- Opus: ~$15 per million input tokens, ~$75 per million output tokens
For a task processing 1M tokens of input, choosing Haiku saves you ~$2.20 vs. Sonnet. Multiply that across 100 tasks per day, and you're looking at real dollars—maybe $200-300/day that stays in your budget. Over a month, that's $6,000-9,000 in compute you didn't need to spend. Scale that across an organization running hundreds of agents daily, and you're talking about hundreds of thousands of dollars annually. The cost difference compounds.
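The arithmetic above is easy to reproduce. A quick sketch using the illustrative input rates from this section (these rates are approximate and change over time):

```python
# Illustrative per-million-token input rates from the text (USD)
INPUT_RATE = {"haiku": 0.80, "sonnet": 3.00, "opus": 15.00}

def input_cost(model: str, tokens: int) -> float:
    """Input-token cost in USD for a given model."""
    return INPUT_RATE[model] * tokens / 1_000_000

# One task processing 1M input tokens: Haiku vs. Sonnet
saving_per_task = input_cost("sonnet", 1_000_000) - input_cost("haiku", 1_000_000)
print(round(saving_per_task, 2))        # 2.2  -> ~$2.20 saved per task
print(round(saving_per_task * 100, 2))  # 220.0 -> ~$220/day at 100 tasks/day
```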
The hidden cost of wrong choices: If you default to Opus for everything, you might spend $5,000/month on agent processing that could cost $1,500 with better selection. If you cheap out on Sonnet and use Haiku for code generation, you might get lower-quality output that needs rework, negating any savings. The sweet spot requires deliberate matching of model to task.
Pitfall: Don't default to Opus for everything. It's tempting ("maximum quality!"), but you're paying for reasoning depth you don't always need. Start with Haiku, bump up to Sonnet if it fails validation, and reserve Opus for genuinely complex problems. Also, don't assume "more expensive = better." Haiku is perfectly fine at what it does. The question is whether your task needs what Sonnet or Opus provides.
Troubleshooting guide:
- If your Haiku agent is hallucinating answers it doesn't know, it might be too weak for the task. Try Sonnet.
- If your Sonnet agent is slow and you have a high-volume workflow, try Haiku with clearer constraints.
- If your Opus agent is taking 30+ seconds per task, reconsider whether you really need that reasoning depth.
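The "start cheap, escalate on failure" advice can be expressed as a simple ladder. Here's a sketch, where `run_agent` and `passes_validation` are hypothetical stand-ins for however your system invokes an agent and checks its output:

```python
MODEL_LADDER = ["haiku", "sonnet", "opus"]  # cheapest first

def run_with_escalation(task, run_agent, passes_validation):
    """Try the cheapest model first; escalate only when validation fails.

    `run_agent(task, model)` and `passes_validation(output)` are
    hypothetical hooks you supply for your own system.
    """
    for model in MODEL_LADDER:
        output = run_agent(task, model)
        if passes_validation(output):
            return model, output
    raise RuntimeError("all models failed validation for task")

# Toy demo: pretend only Sonnet-or-better produces valid output
model, out = run_with_escalation(
    "summarize",
    run_agent=lambda task, m: f"{m}:{task}",
    passes_validation=lambda o: not o.startswith("haiku"),
)
print(model)  # sonnet
```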
### `tools` (Required)
Type: Array of strings
Purpose: Restrict which tools the agent can access
This is your safety and focus lever. By limiting tool access, you:
- Reduce hallucination risk (agent can't invent capabilities it doesn't have)
- Enforce focus (agent stays in its lane)
- Improve safety (no accidental writes, deletions, or external calls)
- Speed up execution (less overhead, faster decisions)
Available Tools:
Here's the full roster. Include only what the agent needs:
```yaml
tools:
  - "Read"       # Read files locally
  - "Write"      # Write files locally
  - "Edit"       # Edit files (targeted replacements)
  - "Bash"       # Execute bash commands
  - "Glob"       # Fast file pattern matching
  - "Grep"       # Content search with regex
  - "Agent"      # Spawn child agents (orchestration)
  - "WebSearch"  # Search the web, ground in current info
  - "WebFetch"   # Fetch and analyze URLs
  - "Skill"      # Invoke saved skills/commands
```
Tool Access Matrix (By Agent Type):
| Agent Type | Typical Tools | Rationale |
|---|---|---|
| Code Analyzer | Read, Glob, Grep, Bash | Reads code, searches patterns, runs tests |
| Code Generator | Read, Write, Edit, Bash | Writes new files, modifies existing, validates |
| Documentation | Read, Write, WebFetch | Reads source, writes docs, fetches references |
| Validator | Read, Glob, Grep, Bash | Inspects, searches, runs validation scripts |
| Researcher | WebSearch, WebFetch | Needs web access only |
| Orchestrator | Agent | Spawns child agents only |
Safety Pattern: Deny-by-Default
Start restrictive. Add tools only when the agent clearly needs them:
```yaml
# ❌ WRONG: Too permissive
tools: ["Read", "Write", "Edit", "Bash", "Glob", "Grep", "Agent", "WebSearch", "WebFetch"]

# ✅ CORRECT: Only what's needed
tools: ["Read", "Glob", "Grep", "Bash"]
```
This agent reads files, searches for patterns, and runs bash. It can't write, modify, or access the web. Safe. Focused. The agent will be faster because it has fewer decisions to make about what tools to use. It will be cheaper because unused tools don't get invoked. It will be safer because it can't do things outside its intended scope.
Example: Build a Specialized Validator
```yaml
name: "Schema Validator"
description: "Validates JSON/YAML files against schemas. Reports violations. Read-only."
model: "haiku"
tools: ["Read", "Glob", "Grep"]
```
This agent can't write, can't run bash, can't access the web. It just reads and searches. Perfect for a validator: you're guaranteed no side effects. If the validator runs against the wrong files by mistake, the worst that happens is it reports on files you didn't intend. It can't delete them, corrupt them, or leak them.
The "Write" vs. "Edit" distinction:
- `Write` creates new files or completely overwrites existing ones. Powerful but dangerous.
- `Edit` makes targeted replacements in existing files. Safer because it requires matching exact content.
If your agent just needs to modify a few lines in existing files, use Edit and deny Write. If it needs to create new files from scratch, include both. The distinction matters for safety.
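The safety property of Edit-style modification, that it only fires when the exact target content exists, can be sketched in a few lines. This is an illustration of the semantics, not Claude Code's implementation:

```python
def targeted_edit(text: str, old: str, new: str) -> str:
    """Edit-style replacement: refuse to act unless `old` matches exactly once.

    Illustrates why Edit is safer than Write: it cannot clobber a file
    whose contents aren't what the agent expected.
    """
    count = text.count(old)
    if count == 0:
        raise ValueError("target content not found; refusing to edit")
    if count > 1:
        raise ValueError("target content is ambiguous; refusing to edit")
    return text.replace(old, new, 1)

source = "timeout = 30\nretries = 3\n"
print(targeted_edit(source, "timeout = 30", "timeout = 60"))
# timeout = 60
# retries = 3
```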
Pitfall: Including Bash without strict guardrails is dangerous. If an agent can execute arbitrary shell, it can delete files, corrupt databases, or leak secrets. Always pair Bash with a safety system prompt that forbids destructive commands. Add PreToolUse hooks that block dangerous patterns. Never give an agent shell access and expect it to be safe on its own.
Dangerous combinations to scrutinize:
- `Bash` + `Write` + unrestricted system prompt = can do anything
- `WebFetch` + `Agent` = can fetch data and pass it to other agents unsupervised
- `Edit` + any write tool + loose constraints = can modify critical files
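These combinations can be flagged automatically before an agent ships. A sketch of such a lint, where the risky pairs mirror the list above (adjust the policy table for your own system):

```python
# Tool pairs this article calls out as worth scrutinizing
RISKY_COMBOS = [
    ({"Bash", "Write"}, "can run arbitrary shell AND write files"),
    ({"WebFetch", "Agent"}, "can fetch data and pass it to child agents"),
]

def lint_tools(tools: list[str]) -> list[str]:
    """Return warnings for risky tool combinations in an agent config."""
    granted = set(tools)
    return [reason for combo, reason in RISKY_COMBOS if combo <= granted]

print(lint_tools(["Read", "Glob", "Grep", "Bash"]))  # []
print(lint_tools(["Read", "Write", "Bash", "WebFetch"]))
# ['can run arbitrary shell AND write files']
```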
## The System Prompt Body: Where Behavior Lives
After the frontmatter comes the body—your system prompt. This is where you define the agent's personality, reasoning style, constraints, and specialty logic. The frontmatter defines what the agent is allowed to do. The system prompt defines what the agent should do.
Structure:
```markdown
---
name: "My Agent"
description: "Does X, used for Y"
model: "haiku"
tools: ["Read", "Bash"]
---
You are a specialized [domain] agent focused on [specific task].

## Your Constraints
- Never [action]
- Always [requirement]
- When unsure, [behavior]

## Your Approach
1. [First step]
2. [Second step]
3. [Validation step]

## Examples
[Show expected input/output]

When you encounter [scenario], [action].
```
System Prompt Best Practices:
1. Start with role clarity: "You are a Python unit test generator focused on pytest. You prioritize readability and coverage over cleverness." The agent needs to know its identity and values.
2. Define constraints early: List what the agent must NOT do. Haiku benefits from explicit guardrails. Be specific: "Never use eval(). Never import entire modules with *. Never write tests that depend on external services." Vague constraints don't work; be explicit about what's off-limits.
3. Show examples: Concrete input/output examples anchor behavior. Don't describe, show. Show a function signature, show what good tests look like for that function, explain why they're good.
4. Explain the "why": "Output should include helpful assertion messages because tests fail in CI logs, not IDEs. When a test fails in CI, the developer is flying blind without the assertion message." Understanding the reasoning helps the agent make good decisions in edge cases.
5. Call out edge cases: "If the function has no return value, skip assertions and focus on side effects. If the function is async, wrap it in asyncio.run() in tests." Edge cases are where agents struggle. Being explicit helps.
Here's a real example:
````markdown
---
name: "Python Test Generator"
description: "Generates pytest unit tests from Python source. Focuses on readability and edge cases."
model: "haiku"
tools: ["Read", "Write", "Bash"]
---
You are a pytest expert generating comprehensive unit tests for Python functions.

## Your Core Principles
- Test behavior, not implementation
- Use descriptive test names that read like documentation (e.g., test_parse_int_with_leading_zeros_returns_int)
- Include edge cases: None values, empty collections, negative numbers, zero, maximum values
- Prioritize readability over brevity
- Never use pytest.mark.skip or xfail; if a test is bad, don't write it
- Never import random or use time-dependent tests
- Assume pytest is available and uses standard assert syntax

## Your Process
1. Read the source file completely
2. Identify all public functions (not prefixed with _)
3. For each function:
   a. Extract signature and docstring
   b. Identify normal cases, edge cases, error cases
   c. Write 3-5 tests per function
4. Validate syntax with `pytest --collect-only` before returning

## Example Input
```python
def parse_int(value: str) -> int:
    """Parse string to int. Raises ValueError if invalid."""
    return int(value)
```

## Example Output
```python
def test_parse_int_with_valid_positive_integer():
    assert parse_int("42") == 42

def test_parse_int_with_negative_integer():
    assert parse_int("-10") == -10

def test_parse_int_with_leading_zeros():
    assert parse_int("007") == 7

def test_parse_int_with_invalid_string_raises_error():
    with pytest.raises(ValueError):
        parse_int("abc")

def test_parse_int_with_floating_point_string_raises_error():
    with pytest.raises(ValueError):
        parse_int("3.14")
```

## When Writing Tests
- If the function handles errors, test both success and error paths
- If the function has defaults, test with and without defaults
- If the function touches files/databases, mock them using unittest.mock.patch
- If the test requires a fixture, skip it and note why in a comment: # TODO: requires fixture setup
- Document non-obvious test logic with comments
````
This system prompt is explicit. The agent knows exactly what to do, why, and where to stop. Compare this to a vague prompt like "write good tests"—night and day difference in output quality.
## Advanced Patterns: Multi-Agent Orchestration
When you have multiple agents working together, frontmatter coordination becomes essential. Each agent needs clear boundaries about what it's responsible for.
```yaml
# .claude/agents/orchestrator.yaml
---
name: "Code Quality Orchestrator"
description: "Coordinates analysis and fixing of code quality issues across a Python project. Dispatches specialized agents."
model: "sonnet"
tools: ["Agent", "Read", "Glob"]
---
You are a code quality orchestrator. Your job is to coordinate multiple specialized agents:
- The Python Linter Agent: checks style and basic issues
- The Type Checker Agent: validates type annotations
- The Test Generator Agent: generates missing tests
- The Documenter Agent: generates docstrings
When given a repository:
1. Use Glob to identify all Python files
2. Dispatch the Linter Agent to check all files
3. Based on results, dispatch appropriate agents
4. Coordinate their outputs into a comprehensive report
```
This orchestrator agent knows what other agents exist and when to call them. It doesn't try to do everything itself. The specialization makes each agent fast and reliable.
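In code form, the orchestrator's dispatch logic amounts to fanning work out to named specialists and merging their reports. Here's a toy sketch with stubbed-out child agents; real dispatch goes through the Agent tool, and `specialists` is a hypothetical stand-in for that mechanism:

```python
def orchestrate(files: list[str], specialists: dict) -> dict:
    """Run each specialist over the file list and merge their findings.

    `specialists` maps agent name -> callable(files) -> list of findings.
    In a real system each callable would spawn a child agent.
    """
    report = {}
    for name, run in specialists.items():
        report[name] = run(files)
    return report

# Toy demo with lambda stand-ins for child agents
report = orchestrate(
    ["app.py", "util.py"],
    {
        "Linter": lambda fs: [f"{f}: ok" for f in fs],
        "Type Checker": lambda fs: [],
    },
)
print(report["Linter"])  # ['app.py: ok', 'util.py: ok']
```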
## Real-World Deployment: Configuration Checklist
Before deploying an agent to production, verify:
```yaml
Pre-Deployment Checklist:
- [ ] Agent name is unique and descriptive
- [ ] Description answers "when should I use this?"
- [ ] Model choice justified (document your reasoning)
- [ ] Tool list is minimal (deny-by-default applied)
- [ ] System prompt has role clarity
- [ ] System prompt lists constraints explicitly
- [ ] Examples are concrete and realistic
- [ ] Tested with 5+ diverse inputs
- [ ] Output quality meets expectations
- [ ] Failure modes documented
- [ ] Cost estimate calculated
- [ ] Version number assigned
```
Following this checklist prevents configuration mistakes that cause problems later. The investment of 30 minutes verifying configuration saves hours of debugging production issues.
## Monitoring and Iteration
After deployment, track agent performance:
```yaml
Metrics to Monitor:
- Success rate (% of runs that complete without error)
- Average execution time (seconds per run)
- Cost per run (input tokens + output tokens * cost rate)
- Quality score (subjective: 1-5 rating)
- Error rate by error type (timeouts, validation failures, etc.)
```
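The cost-per-run metric above is just a weighted token count. A sketch using the illustrative rates quoted earlier in this article (approximate, and subject to change):

```python
# Illustrative USD rates per million tokens, from the pricing section above
RATES = {
    "haiku":  {"input": 0.80, "output": 4.00},
    "sonnet": {"input": 3.00, "output": 15.00},
    "opus":   {"input": 15.00, "output": 75.00},
}

def cost_per_run(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of a single agent run."""
    r = RATES[model]
    return (input_tokens * r["input"] + output_tokens * r["output"]) / 1_000_000

# A Haiku run reading 20K tokens and emitting 2K tokens
print(round(cost_per_run("haiku", 20_000, 2_000), 4))  # 0.024
```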
When metrics drop, investigate the frontmatter. Did something change? Is the model still appropriate? Are the tools still correct?