
You've got 100 files to lint, 50 test suites to run, or 1000 API endpoints to document. Do you send Claude Code to grind through them sequentially? That's one approach—and it'll work. But it'll also take forever, waste your time, and leave your local resources underutilized.
What if instead you spawned 10 agents in parallel, each handling a slice of the work? They finish their pieces simultaneously, you aggregate the results, deduplicate overlaps, and deliver a comprehensive report in a fraction of the time. That's an agent swarm—a powerful pattern in Claude Code that transforms bottleneck tasks into massively parallel operations.
This article is for advanced Claude Code users who are ready to move beyond single-agent workflows. We'll explore what swarms are, why they matter, how to architect them, and how to implement them with real code. By the end, you'll understand fan-out/fan-in patterns, partitioning strategies, resource management, and how to avoid the pitfalls that trip up newcomers.
Let's build something fast.
Table of Contents
- What Is an Agent Swarm?
- The Fan-Out/Fan-In Pattern
- Why Parallel Processing Matters
- Problem 1: Sequential Processing Is Slow
- Problem 2: Single Agents Miss Context
- Problem 3: API Rate Limits Require Careful Concurrency
- Problem 4: Resource Isolation
- Partitioning Strategies
- Strategy 1: Partition by File
- Strategy 2: Partition by Module/Directory
- Strategy 3: Partition by Concern
- Strategy 4: Partition by Test Suite
- Strategy 5: Partition by API Endpoint
- Implementing a Swarm: Full Example
- Step 1: Define Your Work Partitions
- Step 2: Create an Agent Task Command
- Step 3: Dispatch Agents in Parallel
- Step 4: Managing Concurrency Limits
- Deduplication and Result Aggregation
- Resource Management and Best Practices
- Pitfall 1: Spawning Too Many Agents at Once
- Pitfall 2: Not Tracking Agent Status
- Pitfall 3: Over-Partitioning
- Pitfall 4: Ignoring Memory/Disk
- Pitfall 5: No Failure Recovery
- Monitoring and Observability
- Advanced: Scaling to Hundreds of Agents
- Summary
- Real-World Case Studies
- Case Study 1: Parallel Code Review for 10,000 Files
- Case Study 2: Scaling Test Execution
- Case Study 3: Documentation Generation at Scale
- The Economics of Swarms
- Measuring Swarm Performance
- Common Mistakes and How to Avoid Them
- Mistake 1: Using Swarms Too Early
- Mistake 2: Over-Partitioning
- Mistake 3: Assuming All Partitions Finish Simultaneously
- Mistake 4: Forgetting to Handle Partial Failure
- Mistake 5: Creating Tight Coupling Between Agents
- Next Steps
- The Cultural Shift: From Sequential Thinking to Parallel Thinking
- The Sustainability Question: When to Use Swarms
- The Long Game: Building Expertise
- Final Thoughts: Building a Swarm-Based Culture
What Is an Agent Swarm?
An agent swarm is a coordinated group of independent agents working on partitioned pieces of the same task in parallel. Think of it like a construction crew: instead of one person building an entire house, you have electricians, plumbers, framers, and painters all working on different sections at the same time.
In Claude Code terms:
- Swarm = multiple agents executing concurrently
- Partition = divide work into independent units (files, modules, test suites, API endpoints)
- Fan-out = dispatch agents to handle partitions
- Fan-in = collect and aggregate results
- Deduplication = merge overlapping findings (two agents finding the same issue)
- Concurrency limit = cap simultaneous agents to avoid API rate limiting
The core benefit is velocity. If linting 100 files sequentially takes 100 seconds, running 10 agents in parallel (each handling 10 files) takes roughly 10 seconds. The speedup is near-linear up to resource and API limits.
But swarms aren't just about speed. They're also about parallelizing cognitive work—letting multiple specialized agents think independently on different concerns, then synthesizing their findings. A swarm of agents can catch issues, inconsistencies, and opportunities that a single agent might miss. One agent might see a performance issue, another a security vulnerability, and a third a maintainability problem—all in the same code. Together, they're more thorough.
The real-world impact is profound: what would have taken an expert 8 hours of sequential analysis now takes 45 minutes with a swarm of agents running in parallel. That's not just efficiency—it's a fundamental shift in how you think about large-scale code analysis and automation tasks.
Swarms also improve consistency. When you run a single agent against a massive codebase, it can lose context or become fatigued (in a metaphorical sense). Multiple specialized agents, each handling a focused partition, maintain consistent quality throughout. Each agent has fresh context for its partition, applies the same standards consistently, and doesn't degrade in performance as the workload increases.
The Fan-Out/Fan-In Pattern
The core pattern of agent swarms is fan-out/fan-in. It's beautifully simple:
- Fan-out: Partition work into N chunks. Spawn N agents, each handling one chunk.
- Parallel execution: All agents work simultaneously.
- Fan-in: Collect results from all agents. Aggregate, deduplicate, synthesize.
Here's the conceptual flow:
Task (1 unit of work)
↓
Partition (divide into N chunks)
↓
Fan-out (spawn N agents)
├→ Agent 1 (chunk 1) → Result 1
├→ Agent 2 (chunk 2) → Result 2
├→ Agent 3 (chunk 3) → Result 3
└→ Agent N (chunk N) → Result N
↓
Fan-in (collect & aggregate)
↓
Unified Report
In practice, this pattern is implemented using background task dispatch and result collection. Claude Code's run_in_background parameter lets you kick off agents without waiting for them to finish. You dispatch them all, then collect results when they're ready.
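Stripped to its essentials, the dispatch-then-collect flow fits in a few lines of plain bash. This is a sketch, not Claude Code itself: the per-chunk `echo` is a placeholder for real agent work.

```shell
#!/bin/bash
# Minimal fan-out/fan-in sketch: dispatch N background jobs, then block on all
# of them before aggregating. The per-chunk work here is a placeholder.
mkdir -p results
for chunk in 1 2 3; do
  (
    # fan-out: each background subshell plays the role of one agent
    echo "processed chunk $chunk" > "results/chunk_$chunk.out"
  ) &
done
wait  # fan-in: everything past this line sees all agents finished
cat results/chunk_*.out > results/report.txt
```

The two structural pieces are the trailing `&` (fan-out) and the single `wait` (fan-in); everything else in this article is refinement of those two lines.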
The mathematical elegance of this pattern becomes clear when you think about latency. If you have 100 items to process and each takes 10 seconds sequentially, that's 1000 seconds total. With 10 agents in parallel, you're down to 100 seconds. With 20 agents, you're at 50 seconds. The relationship is linear until you hit API rate limits or local resource constraints. This scalability is why swarms are so powerful for high-volume tasks.
The fan-out/fan-in pattern also solves a critical problem: failure isolation. When you run a single agent against a massive workload, one failure means starting over from scratch. With swarms, if 9 agents finish and 1 fails, you've already captured 90% of your results. You retry just the failed partition, not the entire job. This resilience is especially valuable in production systems where re-running a full analysis might be expensive.
Furthermore, the cognitive benefits of this pattern are underrated. When Agent 1 analyzes files 1-10, it can build up specialized context about that subset. It learns the patterns, the quirks, the edge cases specific to those files. When Agent 2 analyzes files 11-20, it develops independent expertise about its subset. Later, during aggregation, you get two perspectives—two different specialized views of the same problem space. This diversity of perspective catches issues that a single agent would miss.
The pattern also naturally supports intelligent prioritization. You might run fast, cheap agents (Haiku) on straightforward partitions and reserve expensive agents (Opus) for complex analyses. The swarm architecture doesn't care—each agent runs independently. You can even adjust partition sizes based on expected complexity. Small, fast partitions for simple files; larger, more complex partitions for intricate code.
Why Parallel Processing Matters
Before we dive into implementation, let's talk about why you'd want swarms in the first place.
Problem 1: Sequential Processing Is Slow
If you're running a linter against 100 files, and each file takes 1 second to lint, sequential processing takes 100 seconds. Parallel processing with 10 agents cuts that to 10 seconds. That's a 10x speedup just from parallelization.
In real-world workflows—analyzing codebases, running test suites, documenting APIs, validating configurations—sequential processing becomes a bottleneck. Swarms unblock it. And the speedup is nearly linear up to your API quota and local resource limits. Double your agents, roughly halve your time (until you hit rate limits).
But the value isn't just about raw speed. It's about unblocking your team. A developer waiting 90 seconds for code analysis feedback loses context, switches tasks, and struggles to resume. Give them 9 seconds and they stay focused. That's a qualitative difference in developer experience that translates to productivity.
Problem 2: Single Agents Miss Context
A single agent processing all 100 files might miss subtle inconsistencies or patterns because it's overwhelmed by scale. Multiple agents working on smaller chunks can spot issues more clearly, then share findings during aggregation. This is cognitive parallel processing—each agent can focus more deeply on its partition, catching issues a context-overloaded agent would miss.
Consider a code review scenario. One agent reviewing 10 files can provide detailed, contextual feedback because it has focused attention. That same agent reviewing 100 files might provide surface-level feedback because it's cognitively overloaded. By partitioning into 10 agents of 10 files each, you get 10 sets of focused, high-quality analysis instead of 1 set of diluted analysis. The aggregate result is more thorough.
Problem 3: API Rate Limits Require Careful Concurrency
Claude Code uses the Anthropic API, which has rate limits. If you spawn 100 agents at once, you'll hit rate limits fast. Concurrency caps let you control parallelism—spawn 10 agents, wait for results, spawn 10 more—without burning through quota or triggering backoff. This is critical for sustainable swarms.
Rate limiting is a feature, not a bug. It forces you to architect thoughtfully. If you blindly spawn 1000 agents, you're not parallelizing—you're thrashing. By capping concurrency and executing in batches, you maintain throughput without triggering API throttling. Your swarm stays efficient even under load.
Problem 4: Resource Isolation
When agents work in isolation (each on a partition), they don't interfere with each other. One agent's memory explosion doesn't crash siblings. One agent's bug doesn't poison the entire run. Isolation is a feature—it makes swarms resilient.
This resilience is crucial for production systems. If you run a single agent against a critical codebase and it crashes halfway through, you've lost an hour's work. If you run 10 agents and one crashes, you've lost 10% of the work and can retry just that partition. That's the difference between "we need to restart this whole thing" and "we'll retry this one chunk."
Partitioning Strategies
The success of a swarm depends entirely on how you partition work. Bad partitioning = wasted parallelism. Good partitioning = real speedup.
Here are the main strategies:
Strategy 1: Partition by File
Perfect for linting, formatting, validation—any task where work per file is independent.
When to use: Analyzing codebases, linting, formatting, static analysis.
Example: 100 JavaScript files → spawn 10 agents, each linting 10 files.
# Partition by file
files=$(find . -name "*.js" | sort)
total_files=$(echo "$files" | wc -l)
chunk_size=$(( (total_files + 9) / 10 )) # ceiling division: 10 agents, indexes stay 0-9 even with a remainder
partition_index=0
for file in $files; do
agent_index=$((partition_index / chunk_size))
echo "$file" >> "partition_$agent_index.txt"
((partition_index++))
done
# Dispatch 10 agents, each handling one partition
for i in {0..9}; do
/dispatch lint-agent-files < partition_$i.txt &
done
wait
Strategy 2: Partition by Module/Directory
Useful when your codebase is organized into modules and each module can be analyzed independently.
When to use: Multi-module projects, microservices, plugin architectures.
Example: 5 modules → spawn 5 agents, each analyzing one module.
# Find all module directories
modules=$(find ./modules -maxdepth 1 -type d | grep -v "^\./modules$")
# Dispatch one agent per module
for module in $modules; do
/dispatch analyze-module "$module" &
done
wait
Strategy 3: Partition by Concern
Abstract partitioning where you divide work by what agents analyze, not where it lives.
When to use: Composite analysis (security, performance, style), multi-dimensional validation.
Example: Code review → spawn agents for security, performance, style, maintainability.
# Partition by concern, not by file
/dispatch security-reviewer ./src &
/dispatch performance-analyzer ./src &
/dispatch style-checker ./src &
/dispatch maintainability-auditor ./src &
wait
Strategy 4: Partition by Test Suite
For parallel testing, split test suites across agents.
When to use: Slow test runs, large test suites, CI/CD pipelines.
Example: 100 test files → spawn 5 agents, each running 20 tests.
# Find all test files and partition them
test_files=$(find . -name "*.test.js" -o -name "*.spec.js" | sort)
total_tests=$(echo "$test_files" | wc -l)
chunk_size=$(( (total_tests + 4) / 5 )) # ceiling division keeps agent_index in 0-4
partition_index=0
for test_file in $test_files; do
agent_index=$((partition_index / chunk_size))
echo "$test_file" >> "test_partition_$agent_index.txt"
((partition_index++))
done
# Dispatch 5 test agents in parallel
for i in {0..4}; do
/dispatch test-agent < test_partition_$i.txt &
done
wait
Strategy 5: Partition by API Endpoint
For API documentation, testing, or validation tasks.
When to use: API exploration, endpoint validation, integration testing.
Example: 50 API endpoints → spawn 5 agents, each testing 10 endpoints.
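A sketch of this partitioning, assuming an `endpoints.txt` with one endpoint path per line (generated here as a stand-in) and GNU split's `-n l/K` mode, which makes K chunks without splitting mid-line:

```shell
#!/bin/bash
# Partition endpoints across agents (sketch). The endpoint list is a stand-in;
# GNU split's -n l/5 makes 5 line-aligned chunks with numeric suffixes (-d).
seq 1 50 | sed 's|^|/api/v1/resource/|' > endpoints.txt
split -d -n l/5 endpoints.txt endpoint_partition_

# Each partition would then be handed to one agent, e.g.:
#   for p in endpoint_partition_*; do /dispatch api-doc-agent < "$p" & done; wait
ls endpoint_partition_*
```

Using `split` avoids the hand-rolled chunking loops shown earlier when all you need is "N roughly equal pieces."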
Implementing a Swarm: Full Example
Let's build a complete working example: parallel code analysis. We'll spawn multiple agents to analyze different files in a codebase, each looking for different issues (bugs, style violations, performance anti-patterns), then aggregate the results.
Step 1: Define Your Work Partitions
First, identify what you're analyzing and how to partition it:
#!/bin/bash
# partition-files.sh - Split files into chunks for parallel analysis
SOURCE_DIR="${1:-.}"
NUM_AGENTS="${2:-10}"
TEMP_DIR="/tmp/claude-swarm-partitions"
# Create temp directory
mkdir -p "$TEMP_DIR"
rm -f "$TEMP_DIR"/*.txt
# Find all target files
echo "[*] Scanning for files in $SOURCE_DIR..."
files=$(find "$SOURCE_DIR" \
-type f \
\( -name "*.js" -o -name "*.ts" -o -name "*.py" -o -name "*.go" \) \
-not -path "*/node_modules/*" \
-not -path "*/.git/*" \
| sort)
file_count=$(echo "$files" | grep -c .) # grep -c . ignores the single empty line a no-match find produces, so 0 really means 0
echo "[*] Found $file_count files"
if [ "$file_count" -eq 0 ]; then
echo "[-] No files found. Exiting."
exit 1
fi
# Calculate chunk size
chunk_size=$((file_count / NUM_AGENTS))
if [ "$chunk_size" -eq 0 ]; then
chunk_size=1
NUM_AGENTS="$file_count"
fi
echo "[*] Partitioning into $NUM_AGENTS agents (~$chunk_size files each)"
# Distribute files into partitions
partition_index=0
file_index=0
current_partition="$TEMP_DIR/partition_0.txt"
touch "$current_partition"
for file in $files; do
echo "$file" >> "$current_partition"
((file_index++))
# Create new partition when chunk is full
if [ "$file_index" -ge "$chunk_size" ] && [ "$partition_index" -lt $((NUM_AGENTS - 1)) ]; then
((partition_index++))
current_partition="$TEMP_DIR/partition_$partition_index.txt"
touch "$current_partition"
file_index=0
fi
done
echo "[+] Partitions created in $TEMP_DIR"
ls -la "$TEMP_DIR"/partition_*.txt | wc -l | xargs echo "[+] Total partitions:"
Run it to create partitions:
./partition-files.sh ./src 10
This creates 10 partition files, each listing ~10% of your source files.
Step 2: Create an Agent Task Command
Next, create a command that a single agent will execute against its partition:
#!/bin/bash
# analyze-partition.sh - Single agent analyzes its partition
PARTITION_FILE="$1"
AGENT_ID="$2"
OUTPUT_DIR="${3:-.}"
if [ ! -f "$PARTITION_FILE" ]; then
echo "[-] Partition file not found: $PARTITION_FILE"
exit 1
fi
echo "[Agent $AGENT_ID] Starting analysis..."
output_file="$OUTPUT_DIR/analysis_$AGENT_ID.json"
# Read files from partition and analyze each
{
echo "{"
echo " \"agent_id\": \"$AGENT_ID\","
echo " \"timestamp\": \"$(date -u +%Y-%m-%dT%H:%M:%SZ)\","
echo " \"files_analyzed\": [],"
echo " \"issues\": []"
echo "}"
} > "$output_file"
file_count=0
while IFS= read -r file; do
if [ -z "$file" ]; then
continue
fi
echo "[Agent $AGENT_ID] Analyzing: $file"
# Here's where you'd invoke Claude Code or a specialized skill
# For this example, we'll simulate analysis with basic tooling
# Count lines, complexity metrics, style issues
line_count=$(wc -l < "$file" 2>/dev/null || echo "0")
todo_count=$(grep -i "TODO\|FIXME\|HACK" "$file" 2>/dev/null | wc -l)
# In a real scenario, you'd invoke Claude Code here:
# /dispatch analyze-code-quality "$file"
((file_count++))
done < "$PARTITION_FILE"
echo "[Agent $AGENT_ID] Analysis complete. Processed $file_count files."
echo "[+] Results saved to: $output_file"
This is a simplified example. In reality, you'd invoke Claude Code commands or skills here to do the actual analysis.
Step 3: Dispatch Agents in Parallel
Now, the critical part—dispatch all agents and run them in parallel:
#!/bin/bash
# run-swarm.sh - Dispatch and coordinate the agent swarm
SOURCE_DIR="${1:-.}"
NUM_AGENTS="${2:-10}"
TEMP_DIR="/tmp/claude-swarm-partitions"
OUTPUT_DIR="./swarm-results"
mkdir -p "$OUTPUT_DIR"
rm -f "$OUTPUT_DIR"/*.json
echo "[*] Starting Agent Swarm..."
echo "[*] Configuration:"
echo " Source: $SOURCE_DIR"
echo " Agents: $NUM_AGENTS"
echo " Output: $OUTPUT_DIR"
echo ""
# Step 1: Partition files
./partition-files.sh "$SOURCE_DIR" "$NUM_AGENTS"
# Step 2: Dispatch agents in parallel
echo "[*] Dispatching $NUM_AGENTS agents..."
start_time=$(date +%s)
for i in $(seq 0 $((NUM_AGENTS - 1))); do
partition_file="$TEMP_DIR/partition_$i.txt"
if [ ! -f "$partition_file" ]; then
echo "[-] Partition $i not found, skipping"
continue
fi
# Dispatch agent in background
(
./analyze-partition.sh "$partition_file" "$i" "$OUTPUT_DIR"
) &
echo "[+] Agent $i dispatched"
done
# Wait for all agents to complete
echo "[*] Waiting for agents to complete..."
wait
end_time=$(date +%s)
duration=$((end_time - start_time))
echo "[+] All agents completed in ${duration}s"
echo ""
# Step 3: Collect and aggregate results
echo "[*] Aggregating results..."
aggregated_output="$OUTPUT_DIR/aggregated_results.json"
{
echo "{"
echo " \"timestamp\": \"$(date -u +%Y-%m-%dT%H:%M:%SZ)\","
echo " \"total_agents\": $NUM_AGENTS,"
echo " \"duration_seconds\": $duration,"
echo " \"results\": ["
first=true
for result_file in "$OUTPUT_DIR"/analysis_*.json; do
if [ ! -f "$result_file" ]; then
continue
fi
if [ "$first" = true ]; then
first=false
else
echo ","
fi
cat "$result_file"
done
echo " ]"
echo "}"
} > "$aggregated_output"
echo "[+] Aggregated results: $aggregated_output"
wc -l "$aggregated_output" | awk '{print "[+] Total lines: " $1}'
Step 4: Managing Concurrency Limits
If you have many files and want to avoid rate limiting, cap concurrent agents:
#!/bin/bash
# run-swarm-with-limits.sh - Swarm with concurrency control
SOURCE_DIR="${1:-.}"
TOTAL_AGENTS="${2:-100}"
CONCURRENT_LIMIT="${3:-10}" # Max 10 agents at a time
TEMP_DIR="/tmp/claude-swarm-partitions"
OUTPUT_DIR="./swarm-results"
mkdir -p "$OUTPUT_DIR"
echo "[*] Starting managed swarm..."
echo "[*] Total agents: $TOTAL_AGENTS | Concurrent limit: $CONCURRENT_LIMIT"
# Partition files into TOTAL_AGENTS partitions
./partition-files.sh "$SOURCE_DIR" "$TOTAL_AGENTS"
# Dispatch agents in batches
for batch_start in $(seq 0 $CONCURRENT_LIMIT $((TOTAL_AGENTS - 1))); do
batch_end=$((batch_start + CONCURRENT_LIMIT - 1))
if [ "$batch_end" -ge "$TOTAL_AGENTS" ]; then
batch_end=$((TOTAL_AGENTS - 1))
fi
echo "[*] Dispatching batch: agents $batch_start-$batch_end"
for i in $(seq $batch_start $batch_end); do
partition_file="$TEMP_DIR/partition_$i.txt"
(
./analyze-partition.sh "$partition_file" "$i" "$OUTPUT_DIR"
) &
done
# Wait for batch to complete before starting next batch
wait
echo "[+] Batch $batch_start-$batch_end complete"
done
echo "[+] Swarm execution complete"
This approach dispatches agents in waves—10 at a time, wait for them to finish, then dispatch 10 more. It prevents API rate limiting while maintaining parallelism. This is the sweet spot between speed and sustainability.
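Batch waves have one inefficiency: the whole wave waits on its slowest member before the next wave starts. If your bash is 4.3 or newer, `wait -n` gives a sliding-window variant that keeps the pipeline full: as one agent exits, the next is dispatched. A sketch, with `agent_task` standing in for a real dispatch:

```shell
#!/bin/bash
# Sliding-window concurrency (sketch): at most LIMIT agents in flight at once.
# Requires bash 4.3+ for `wait -n`; agent_task is a placeholder for a real
# ./analyze-partition.sh invocation.
LIMIT=4
TOTAL=20
agent_task() { touch "done_$1"; }

running=0
for i in $(seq 0 $((TOTAL - 1))); do
  agent_task "$i" &
  running=$((running + 1))
  if [ "$running" -ge "$LIMIT" ]; then
    wait -n                     # returns as soon as ANY one agent exits
    running=$((running - 1))
  fi
done
wait  # drain the final window
```

Compared to batch waves, throughput stays steady even when partition runtimes vary widely.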
Deduplication and Result Aggregation
When multiple agents analyze the same codebase, they'll often find duplicate issues—two agents flagging the same bug or style violation. You need to deduplicate and merge results intelligently.
Here's a pattern for aggregating and deduplicating results:
#!/bin/bash
# aggregate-results.sh - Merge and deduplicate agent findings
OUTPUT_DIR="${1:-.}"
AGGREGATED_FILE="${OUTPUT_DIR}/final_report.json"
echo "[*] Aggregating results from $OUTPUT_DIR..."
# Merge all JSON results
jq -s '
{
metadata: {
timestamp: now | todate,
total_agents: length,
analysis_scope: "parallel code analysis"
},
findings: [
.[] | .agent_id as $aid | (.issues[]? + {agent_id: $aid}) # carry the top-level agent_id onto each issue
] |
# Deduplicate by file + issue type + line number
group_by(.file + "|" + .type + "|" + (.line_number | tostring)) |
map({
file: .[0].file,
type: .[0].type,
line_number: .[0].line_number,
message: .[0].message,
severity: .[0].severity,
reported_by_agents: (map(.agent_id) | unique),
agent_count: (map(.agent_id) | unique | length)
}) |
sort_by(.severity, .file),
summary: {
total_issues: (
[.[] | .issues[]?] |
group_by(.file + "|" + .type + "|" + (.line_number | tostring)) |
length
),
by_severity: (
[.[] | .issues[]?] |
group_by(.severity) |
map({severity: .[0].severity, count: length}) |
sort_by(.severity)
),
by_file: (
[.[] | .issues[]?] |
group_by(.file) |
map({file: .[0].file, count: length}) |
sort_by(.count) |
reverse |
.[0:10]
)
}
}
' "$OUTPUT_DIR"/analysis_*.json > "$AGGREGATED_FILE"
echo "[+] Results aggregated: $AGGREGATED_FILE"
jq '.summary' "$AGGREGATED_FILE"
This jq pipeline:
- Merges all results from agent analyses
- Groups issues by file + type + line (deduplicates)
- Tracks which agents reported each issue
- Generates summary statistics
The reported_by_agents field tells you which agents found the same issue—useful for validating that findings are consistent.
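To see the dedup mechanic in isolation, here is the same `group_by` trick run on three hand-written findings, two of which are the same issue reported by different agents (all data here is made up for illustration):

```shell
#!/bin/bash
# Three findings in; two deduplicated groups out. The first group keeps both
# reporting agents in reported_by_agents.
printf '%s\n' \
  '{"file":"a.js","type":"style","line_number":3,"agent_id":"0"}' \
  '{"file":"a.js","type":"style","line_number":3,"agent_id":"1"}' \
  '{"file":"b.js","type":"bug","line_number":7,"agent_id":"1"}' |
jq -s 'group_by(.file + "|" + .type + "|" + (.line_number | tostring))
       | map({file: .[0].file, type: .[0].type,
              reported_by_agents: (map(.agent_id) | unique)})' > deduped.json
cat deduped.json
```

The composite key `file|type|line_number` is what decides "same issue"; widen or narrow it to tune how aggressive deduplication is.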
Resource Management and Best Practices
Swarms are powerful but come with pitfalls. Here's how to avoid them:
Pitfall 1: Spawning Too Many Agents at Once
Problem: Spawn 100 agents immediately → API rate limit → failure.
Solution: Use concurrency batching (as shown above). Keep concurrent agents ≤ 10-20 unless your quota is huge. Start conservative and scale up as you see success.
Pitfall 2: Not Tracking Agent Status
Problem: Dispatch 10 agents, assume they all finish, discover 3 crashed silently.
Solution: Log agent dispatch and completion. Use exit codes. Track which agents succeed/fail.
# Track agent success/failure
agent_results="$OUTPUT_DIR/agent_status.txt"
> "$agent_results"
for i in $(seq 0 9); do
(
if ./analyze-partition.sh "$TEMP_DIR/partition_$i.txt" "$i" "$OUTPUT_DIR"; then
echo "$i:SUCCESS" >> "$agent_results"
else
echo "$i:FAILED" >> "$agent_results"
fi
) &
done
wait
failed_count=$(grep "FAILED" "$agent_results" | wc -l)
if [ "$failed_count" -gt 0 ]; then
echo "[-] $failed_count agents failed:"
grep "FAILED" "$agent_results"
fi
Pitfall 3: Over-Partitioning
Problem: Partition into 1000 agents for 100 files. Overhead + API calls > actual work.
Solution: Aim for chunk sizes of 5-50 items per agent. Let agents do meaningful work. The overhead of spawning an agent decreases as the work per agent increases.
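A quick way to honor that guideline is to derive the agent count from a target chunk size, rather than fixing the agent count up front (the numbers here are illustrative):

```shell
#!/bin/bash
# Derive agent count from a target chunk size (sketch; numbers are illustrative).
items=137          # size of the workload
target_chunk=20    # aim for 5-50 items per agent
agents=$(( (items + target_chunk - 1) / target_chunk ))  # ceiling division
echo "$agents agents, ~$target_chunk items each"
```

This way the swarm scales with the workload: 137 items yields 7 agents, and a 10-item workload collapses to a single agent instead of 10 near-empty ones.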
Pitfall 4: Ignoring Memory/Disk
Problem: Each agent generates a 10MB result file. 100 agents = 1GB of temp data.
Solution: Stream results to a central location. Clean up intermediate files. Use temp directories wisely.
# Clean up partitions after agents finish
rm -f "$TEMP_DIR"/*.txt
# Clean up intermediate agent outputs (keep only aggregated)
rm -f "$OUTPUT_DIR"/analysis_*.json
Pitfall 5: No Failure Recovery
Problem: One agent fails. Entire swarm is incomplete.
Solution: Identify which partitions weren't processed. Re-run only those agents.
# Find which partitions completed
completed=$(ls -1 "$OUTPUT_DIR"/analysis_*.json 2>/dev/null | \
sed 's/.*analysis_//;s/.json//' | sort)
# Find which partitions are missing (comm needs lexicographically sorted input)
all_indices=$(seq 0 $((NUM_AGENTS - 1)) | sort)
missing=$(comm -23 <(echo "$all_indices") <(echo "$completed"))
if [ -n "$missing" ]; then
echo "[!] Re-running missing agents: $missing"
for i in $missing; do
./analyze-partition.sh "$TEMP_DIR/partition_$i.txt" "$i" "$OUTPUT_DIR" &
done
wait
fi
Monitoring and Observability
When you have 10+ agents running in parallel, you need visibility. Here's a pattern for logging:
#!/bin/bash
# swarm-logger.sh - Central logging for swarm operations
log_dir="${1:-.swarm-logs}"
mkdir -p "$log_dir"
# Function to log messages
log_message() {
local agent_id="$1"
local status="$2"
local message="$3"
timestamp=$(date -u +%Y-%m-%dT%H:%M:%SZ)
echo "[$timestamp] [Agent $agent_id] [$status] $message" \
>> "$log_dir/agent_$agent_id.log"
echo "[$timestamp] [Agent $agent_id] [$status] $message" \
>> "$log_dir/swarm.log"
}
# Main swarm with logging
for i in $(seq 0 9); do
(
log_message "$i" "START" "Beginning analysis"
# Do work here
./analyze-partition.sh "/tmp/partition_$i.txt" "$i" "./results"
exit_code=$?
if [ "$exit_code" -eq 0 ]; then
log_message "$i" "SUCCESS" "Completed analysis"
else
log_message "$i" "FAILED" "Exit code $exit_code"
fi
) &
done
wait
echo "[*] Swarm complete. Logs:"
tail -20 "$log_dir/swarm.log"
Advanced: Scaling to Hundreds of Agents
For truly large-scale swarms (100+ agents), consider:
- Persistent job queue: Use a database or message queue (Redis, RabbitMQ) to queue work partitions. Agents pull from the queue.
- Distributed coordination: Agents report status to a central coordinator. The coordinator detects failures and dispatches retries.
- Dynamic partitioning: Adjust chunk sizes based on previous run times. Slow partitions get split further.
- Result streaming: Don't wait for all results. Stream findings to a central service as agents complete.
Here's a skeleton for a queued approach:
#!/bin/bash
# swarm-with-queue.sh - Distributed swarm using a work queue
QUEUE_FILE="/tmp/swarm-work-queue.jsonl"
NUM_WORKERS=10
# Populate queue with work items
echo "[*] Creating work queue..."
find ./src -name "*.js" | while read -r file; do
echo "{\"file\": \"$file\", \"status\": \"pending\"}" >> "$QUEUE_FILE"
done
# Worker loop: each worker processes items from queue
worker() {
local worker_id="$1"
local processed=0
while true; do
# Get next item from queue (atomic read + update)
item=$(grep '"status": "pending"' "$QUEUE_FILE" | head -1)
if [ -z "$item" ]; then
# Queue empty, worker done
break
fi
file=$(echo "$item" | jq -r '.file')
echo "[Worker $worker_id] Processing: $file"
# Do work
./analyze-file.sh "$file" >> "/tmp/result_$worker_id.json" # append: one result per processed item
# Mark as done (simplified; use a database for real concurrency)
sed -i "s|$file.*|$file\", \"status\": \"done\"}|" "$QUEUE_FILE"
((processed++))
done
echo "[Worker $worker_id] Processed $processed items"
}
# Spawn workers
for i in $(seq 1 $NUM_WORKERS); do
worker "$i" &
done
wait
echo "[+] Queue processing complete"
This is a simplified example. In production, use proper message queues and databases for concurrency control.
Summary
Agent swarms transform long-running sequential tasks into fast, parallel operations. The pattern is simple: partition work → fan-out agents → fan-in results → deduplicate + aggregate.
Here's what you learned:
- Swarms are fan-out/fan-in: Divide work, dispatch agents, collect results.
- Partitioning strategies matter: By file, module, concern, test, endpoint.
- Concurrency limits are essential: Batch agents to avoid API rate limits.
- Deduplication is critical: Merge duplicate findings across agents.
- Monitoring is your friend: Log everything. Track failures. Re-run missing work.
- Scaling requires infrastructure: For 100+ agents, use queues and distributed coordination.
Whether you're linting a codebase, running tests, analyzing code quality, or documenting APIs, swarms give you 10x speedup with thoughtful partitioning and careful concurrency management.
The bottleneck is no longer speed—it's your willingness to architect for parallelism.
Real-World Case Studies
Case Study 1: Parallel Code Review for 10,000 Files
A large team needed to review 10,000 files for a specific security pattern. Sequential review would take 200+ minutes. Using agent swarms with 20 concurrent agents and concern-based partitioning:
- Security review agent: Scanning for vulnerability patterns
- Performance agent: Checking for performance anti-patterns
- Architecture agent: Verifying against architecture decisions
- Standards agent: Checking for code style violations
Result: Parallel execution across 20 agents handling different concerns on the same files, completed in 12 minutes. They found:
- 340 security vulnerabilities
- 210 performance improvements
- 450 style violations
Each issue was reported by multiple agents, giving high confidence in findings. The deduplication step unified findings and showed which issues were consensus (found by multiple agents) vs. one-agent-only concerns.
Case Study 2: Scaling Test Execution
A team's test suite grew from 500 to 5,000 tests, slowing CI/CD from 8 minutes to 45 minutes. Using test-suite partitioning with 10 concurrent test agents:
- Agent 1-5: Unit tests (fast)
- Agent 6-8: Integration tests (slower)
- Agent 9-10: E2E tests (slowest)
Agents were weighted by expected duration. Result: Test execution cut to 8 minutes—same speed despite 10x more tests. The unified report at the end showed exactly which tests failed, across all agents.
Case Study 3: Documentation Generation at Scale
Documenting 500 API endpoints manually took weeks. Using endpoint-partitioning:
- 50 agents dispatched in batches of 10
- Each agent documented 10 endpoints
- Results aggregated into a unified API reference
Each agent generated documentation for its endpoints, including examples, error cases, and usage notes. Deduplication merged common patterns (e.g., authentication sections). Result: Complete, consistent API documentation in hours instead of weeks.
The Economics of Swarms
When deciding whether to use swarms, consider the math:
- Sequential: 100 files, 1 second per file = 100 seconds
- Swarm with 10 agents: 100 files, 10 agents = 10 seconds (10x speedup)
- But with overhead: dispatch, aggregation, and deduplication add ~2 seconds
- Real gain: ~8x speedup
The break-even point is around 50 items. Below that, sequential is fine. Above that, swarms pay for themselves in time saved.
But the real value isn't just time. It's also quality through redundancy. Multiple agents analyzing the same codebase catch different issues. A swarm might find 100 issues; a single agent might find 75. The difference is value.
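The overhead-adjusted model behind those numbers is easy to sanity-check; all figures below are the assumed ones from the example above:

```shell
#!/bin/bash
# Overhead-adjusted speedup model (sketch; all figures are assumed).
t_seq=100     # 100 files x 1s each, run sequentially
agents=10
overhead=2    # dispatch + aggregation + dedup, in seconds
t_swarm=$(( t_seq / agents + overhead ))
speedup=$(awk -v s="$t_seq" -v p="$t_swarm" 'BEGIN { printf "%.1f", s / p }')
echo "swarm time: ${t_swarm}s, speedup: ${speedup}x"
```

Note how quickly fixed overhead dominates as partitions shrink: the same 2 seconds of overhead against a 1-second-per-agent workload would cut the theoretical 10x down to about 3x.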
Measuring Swarm Performance
Before you declare victory with a swarm, measure actual performance. Log the wall-clock time from dispatch to final result. Compare it to your sequential baseline. But don't stop there. Measure quality too. Are swarms finding the same issues as sequential runs? More issues? Different issues? Measure cost. Yes, swarms run faster, but they consume more API tokens (10 agents instead of 1). Is the speedup worth the additional cost? For some tasks yes, for others no. Track cost per result. If sequential costs $1 to find 100 issues and swarms cost $5 to find 105 issues, the math might not work. These measurements guide when to use swarms.
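For the cost comparison, the useful metric is cost per deduplicated finding. Using the hypothetical dollar figures from the paragraph above:

```shell
#!/bin/bash
# Cost-per-finding comparison (sketch; dollar figures are the hypothetical
# ones from the text, not measurements).
cost_per_issue() { awk -v c="$1" -v n="$2" 'BEGIN { printf "%.3f", c / n }'; }
echo "sequential: \$$(cost_per_issue 1 100)/issue"   # $1 spent, 100 issues found
echo "swarm:      \$$(cost_per_issue 5 105)/issue"   # $5 spent, 105 issues found
```

If the swarm's extra findings are low-severity, a ~5x cost per issue is hard to justify; if they include the one critical vulnerability a single agent missed, the math flips.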
Common Mistakes and How to Avoid Them
Mistake 1: Using Swarms Too Early
Don't optimize prematurely. If your sequential process takes 10 seconds, don't build a swarm. Build swarms when you have real pain: tasks taking minutes or hours.
Mistake 2: Over-Partitioning
Spawning 1000 agents for 100 files is wasteful. The overhead of managing 1000 agents exceeds the savings from parallelism. Aim for 5-50 items per agent.
Mistake 3: Assuming All Partitions Finish Simultaneously
They don't. Some agents will finish in 2 seconds, others in 20. Don't wait for the slowest agent to finish before accepting results from fast ones. Stream results as they arrive.
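One way to act on results as each agent lands, rather than after a single final `wait`, is bash's `wait -n` (4.3+). A sketch, with the per-agent body as a placeholder:

```shell
#!/bin/bash
# Consume results as agents finish instead of blocking on the slowest one.
# Requires bash 4.3+ for `wait -n`; the per-agent body is a placeholder.
mkdir -p stream_out
for i in 1 2 3; do
  ( sleep "0.$i"; echo "finding from agent $i" > "stream_out/agent_$i.json" ) &
done
remaining=3
while [ "$remaining" -gt 0 ]; do
  wait -n                                   # wakes when ANY one agent exits
  remaining=$((remaining - 1))
  cat stream_out/agent_*.json > stream_out/report.txt   # incremental re-aggregate
done
```

The partial report is usable the moment the first agent finishes; the slowest agent only delays its own partition, not the whole readout.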
Mistake 4: Forgetting to Handle Partial Failure
If 3 out of 10 agents fail, you now have 70% of the work done. Plan for this: retry failed agents, handle missing results gracefully, show partial results with metadata about what failed.
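A sketch of that recovery loop, assuming agent dispatch is a callable that raises on failure (the function name and structure here are illustrative, not a Claude Code API):

```python
import concurrent.futures

def dispatch_with_retry(partitions, worker, retries=1):
    """Run every partition; retry failures, then return partial results."""
    results, failed = {}, []
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = {pool.submit(worker, p): p for p in partitions}
        for fut in concurrent.futures.as_completed(futures):
            part = futures[fut]
            try:
                results[part] = fut.result()
            except Exception:
                failed.append(part)
    if failed and retries > 0:
        # One more pass over only the failed partitions.
        retried = dispatch_with_retry(failed, worker, retries - 1)
        results.update(retried["results"])
        failed = retried["failed"]
    # Callers get partial results plus metadata about what still failed.
    return {"results": results, "failed": failed}
```

The key design choice is that failures never discard completed work: the caller always receives whatever succeeded, alongside a list of what didn't.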
Mistake 5: Creating Tight Coupling Between Agents
Each agent should be independent. Don't have Agent 1's output be Agent 2's input in the same swarm. That's sequential, not parallel. If you need dependencies, do them in separate swarm batches.
Next Steps
Start small. Pick a task that:
- Takes more than a minute sequentially
- Can be partitioned into independent chunks
- Benefits from multiple perspectives
Build your first swarm. Test it. Measure the speedup. Learn from it. Then apply swarms to bigger challenges.
The pattern scales from 10 agents to 1000. The principles stay the same: partition, dispatch, aggregate, deduplicate. Master these and you can parallelize almost any task.
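Those four steps compose into a single loop. A compact sketch of the whole pipeline, where `worker` and the chunk size are placeholders for your own dispatch logic:

```python
import concurrent.futures

def run_swarm(items, worker, chunk_size=10):
    """Partition -> dispatch -> aggregate -> deduplicate, in one pass."""
    # Partition: split the work into fixed-size chunks.
    chunks = [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]
    # Dispatch: one worker per chunk, run in parallel.
    with concurrent.futures.ThreadPoolExecutor() as pool:
        chunk_results = pool.map(worker, chunks)
    # Aggregate: flatten per-chunk findings into one list.
    all_findings = [f for findings in chunk_results for f in findings]
    # Deduplicate: keep the first occurrence, preserving order.
    return list(dict.fromkeys(all_findings))
```

Whether you run 10 agents or 1000, the shape of this function doesn't change; only the chunk size and worker implementation do.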
The Cultural Shift: From Sequential Thinking to Parallel Thinking
There's a mental model shift that happens when you start building swarms. Your brain gradually stops thinking in sequential steps and starts thinking in parallel chunks. Instead of "analyze these 100 files one by one," you think "partition these 100 files into 10 groups of 10, analyze each group in parallel, aggregate results."
This shift affects how you design tasks. You start asking different questions: Is this task parallelizable? Can I partition it? Can I run partitions independently? When you ask these questions repeatedly, you start seeing parallelization opportunities everywhere.
A task that seemed sequential suddenly becomes parallel. Updating documentation? Partition by section. Running tests? Partition by test suite. Reviewing code? Partition by concern (security, performance, style). Even seemingly sequential workflows have parallelization opportunities if you look for them.
This thinking extends beyond swarms. It changes how you architect systems, how you approach problems, and how you think about scalability. The mindset is valuable even in situations where you don't use swarms.
The Sustainability Question: When to Use Swarms
Not every task benefits from swarms. Sometimes sequential is better. A task where each partition depends on previous results? Sequential. A task with minimal work per partition? The overhead of spawning agents exceeds savings. A task that's already fast enough? Why complicate it?
Swarms are good for:
- Tasks taking minutes or hours sequentially
- Tasks with thousands of independent items
- Tasks where multiple perspectives add value
- Tasks where failure of one partition shouldn't stop others
Swarms are bad for:
- Tasks taking seconds sequentially (overhead exceeds savings)
- Tasks with tight dependencies
- Tasks with very few items (why spawn 10 agents for 10 items?)
- Tasks where strict sequential ordering is essential
The right question is: does this task have enough friction that parallelization saves time and adds value? If yes, build a swarm. If no, keep it simple.
The Long Game: Building Expertise
Using agent swarms effectively is a skill that develops over time. Your first swarm might be awkward—too many agents, poor partitioning, inefficient aggregation. Your fifth swarm will be smooth—good partition size, clean aggregation, optimal concurrency.
The learning curve is gentle. Start with simple file partitioning. Graduate to module partitioning. Then to concern-based partitioning. Then to custom strategies for your specific domain.
Each swarm teaches you something about your problem domain, your tools, your infrastructure. You build intuition about optimal partition sizes, about which concerns parallelize well, about where bottlenecks appear.
This accumulated expertise is what separates teams that use swarms occasionally from teams that build swarm-based systems that are core to their operations. The best swarms aren't magic. They're the result of experienced teams thinking carefully about parallelization.
Final Thoughts: Building a Swarm-Based Culture
The real power of agent swarms isn't just in their ability to parallelize work. It's in how they fundamentally change how you think about large-scale problems. Once you've experienced the joy of dispatching 20 agents to analyze 1000 files and getting comprehensive results in minutes instead of days, you can't unsee it. You start approaching every problem with a parallelization mindset.
That shift in perspective has ripple effects beyond engineering. Product teams start asking "can we parallelize this?" Managers start thinking about distributing work intelligently instead of linearly. Your entire organization becomes more comfortable with distributed thinking, which is increasingly essential as systems become more complex.
The economics compound over time. Your first swarm might save you a few hours. Your tenth swarm might save your company days of productivity. By your hundredth swarm, you're operating at a level of scale that would be impossible with sequential thinking. Teams that embrace swarms don't just work faster—they think differently, design differently, and solve problems differently.
The pattern is learnable, the tools are accessible, and the benefits are immediate and measurable. Start with a simple partitioning strategy, implement error handling, measure the speedup. Iterate. Before you know it, you'll have built the muscle memory to parallelize almost any task. And that's when the real acceleration happens.
-iNet