May 8, 2025
Claude AI Development

Agent Swarms: Parallel Processing

You've got 100 files to lint, 50 test suites to run, or 1000 API endpoints to document. Do you send Claude Code to grind through them sequentially? That's one approach—and it'll work. But it'll also take forever, waste your time, and leave your local resources underutilized.

What if instead you spawned 10 agents in parallel, each handling a slice of the work? They finish their pieces simultaneously, you aggregate the results, deduplicate overlaps, and deliver a comprehensive report in a fraction of the time. That's an agent swarm—a powerful pattern in Claude Code that transforms bottleneck tasks into massively parallel operations.

This article is for advanced Claude Code users who are ready to move beyond single-agent workflows. We'll explore what swarms are, why they matter, how to architect them, and how to implement them with real code. By the end, you'll understand fan-out/fan-in patterns, partitioning strategies, resource management, and how to avoid the pitfalls that trip up newcomers.

Let's build something fast.


Table of Contents
  1. What Is an Agent Swarm?
  2. The Fan-Out/Fan-In Pattern
  3. Why Parallel Processing Matters
  4. Problem 1: Sequential Processing Is Slow
  5. Problem 2: Single Agents Miss Context
  6. Problem 3: API Rate Limits Require Careful Concurrency
  7. Problem 4: Resource Isolation
  8. Partitioning Strategies
  9. Strategy 1: Partition by File
  10. Strategy 2: Partition by Module/Directory
  11. Strategy 3: Partition by Concern
  12. Strategy 4: Partition by Test Suite
  13. Strategy 5: Partition by API Endpoint
  14. Implementing a Swarm: Full Example
  15. Step 1: Define Your Work Partitions
  16. Step 2: Create an Agent Task Command
  17. Step 3: Dispatch Agents in Parallel
  18. Step 4: Managing Concurrency Limits
  19. Deduplication and Result Aggregation
  20. Resource Management and Best Practices
  21. Pitfall 1: Spawning Too Many Agents at Once
  22. Pitfall 2: Not Tracking Agent Status
  23. Pitfall 3: Over-Partitioning
  24. Pitfall 4: Ignoring Memory/Disk
  25. Pitfall 5: No Failure Recovery
  26. Monitoring and Observability
  27. Advanced: Scaling to Hundreds of Agents
  28. Summary
  29. Real-World Case Studies
  30. Case Study 1: Parallel Code Review for 10,000 Files
  31. Case Study 2: Scaling Test Execution
  32. Case Study 3: Documentation Generation at Scale
  33. The Economics of Swarms
  34. Measuring Swarm Performance
  35. Common Mistakes and How to Avoid Them
  36. Mistake 1: Using Swarms Too Early
  37. Mistake 2: Over-Partitioning
  38. Mistake 3: Assuming All Partitions Finish Simultaneously
  39. Mistake 4: Forgetting to Handle Partial Failure
  40. Mistake 5: Creating Tight Coupling Between Agents
  41. Next Steps
  42. The Cultural Shift: From Sequential Thinking to Parallel Thinking
  43. The Sustainability Question: When to Use Swarms
  44. The Long Game: Building Expertise
  45. Final Thoughts: Building a Swarm-Based Culture

What Is an Agent Swarm?

An agent swarm is a coordinated group of independent agents working on partitioned pieces of the same task in parallel. Think of it like a construction crew: instead of one person building an entire house, you have electricians, plumbers, framers, and painters all working on different sections at the same time.

In Claude Code terms:

  • Swarm = multiple agents executing concurrently
  • Partition = divide work into independent units (files, modules, test suites, API endpoints)
  • Fan-out = dispatch agents to handle partitions
  • Fan-in = collect and aggregate results
  • Deduplication = merge overlapping findings (two agents finding the same issue)
  • Concurrency limit = cap simultaneous agents to avoid API rate limiting

The core benefit is velocity. If linting 100 files sequentially takes 100 seconds, running 10 agents in parallel (each handling 10 files) takes roughly 10 seconds. The speedup is near-linear up to resource and API limits.

But swarms aren't just about speed. They're also about parallelizing cognitive work—letting multiple specialized agents think independently on different concerns, then synthesizing their findings. A swarm of agents can catch issues, inconsistencies, and opportunities that a single agent might miss. One agent might see a performance issue, another a security vulnerability, and a third a maintainability problem—all in the same code. Together, they're more thorough.

The real-world impact is profound: what would have taken an expert 8 hours of sequential analysis now takes 45 minutes with a swarm of agents running in parallel. That's not just efficiency—it's a fundamental shift in how you think about large-scale code analysis and automation tasks.

Swarms also improve consistency. When you run a single agent against a massive codebase, it can lose context or become fatigued (in a metaphorical sense). Multiple specialized agents, each handling a focused partition, maintain consistent quality throughout. Each agent has fresh context for its partition, applies the same standards consistently, and doesn't degrade in performance as the workload increases.


The Fan-Out/Fan-In Pattern

The core pattern of agent swarms is fan-out/fan-in. It's beautifully simple:

  1. Fan-out: Partition work into N chunks. Spawn N agents, each handling one chunk.
  2. Parallel execution: All agents work simultaneously.
  3. Fan-in: Collect results from all agents. Aggregate, deduplicate, synthesize.

Here's the conceptual flow:

Task (1 unit of work)
    ↓
Partition (divide into N chunks)
    ↓
Fan-out (spawn N agents)
    ├→ Agent 1 (chunk 1) → Result 1
    ├→ Agent 2 (chunk 2) → Result 2
    ├→ Agent 3 (chunk 3) → Result 3
    └→ Agent N (chunk N) → Result N
    ↓
Fan-in (collect & aggregate)
    ↓
Unified Report

In practice, this pattern is implemented using background task dispatch and result collection. Claude Code's run_in_background parameter lets you kick off agents without waiting for them to finish. You dispatch them all, then collect results when they're ready.
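As a minimal shell sketch of the same flow, using plain background jobs (here `wc -l` is a stand-in for a real agent's work):

```shell
#!/bin/bash
# Fan-out/fan-in with plain background jobs. Each job processes one
# chunk; "wc -l" is a stand-in for real agent work.
set -e
work_dir=$(mktemp -d)

# Partition: three chunks of pretend work
printf 'a\nb\nc\n' > "$work_dir/chunk_0.txt"
printf 'd\ne\n'    > "$work_dir/chunk_1.txt"
printf 'f\n'       > "$work_dir/chunk_2.txt"

# Fan-out: one background job per chunk
for chunk in "$work_dir"/chunk_*.txt; do
    ( wc -l < "$chunk" > "$chunk.result" ) &
done

# Fan-in: block until every job finishes, then aggregate
wait
total=$(cat "$work_dir"/chunk_*.result | awk '{s+=$1} END {print s}')
echo "total lines processed: $total"   # 3 + 2 + 1 = 6
```

The `wait` builtin is the fan-in point: nothing downstream runs until every dispatched job has returned.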

The mathematical elegance of this pattern becomes clear when you think about latency. If you have 100 items to process and each takes 10 seconds sequentially, that's 1000 seconds total. With 10 agents in parallel, you're down to 100 seconds. With 20 agents, you're at 50 seconds. The relationship is linear until you hit API rate limits or local resource constraints. This scalability is why swarms are so powerful for high-volume tasks.

The fan-out/fan-in pattern also solves a critical problem: failure isolation. When you run a single agent against a massive workload, one failure means starting over from scratch. With swarms, if 9 agents finish and 1 fails, you've already captured 90% of your results. You retry just the failed partition, not the entire job. This resilience is especially valuable in production systems where re-running a full analysis might be expensive.

Furthermore, the cognitive benefits of this pattern are underrated. When Agent 1 analyzes files 1-10, it can build up specialized context about that subset. It learns the patterns, the quirks, the edge cases specific to those files. When Agent 2 analyzes files 11-20, it develops independent expertise about its subset. Later, during aggregation, you get two perspectives—two different specialized views of the same problem space. This diversity of perspective catches issues that a single agent would miss.

The pattern also naturally supports intelligent prioritization. You might run fast, cheap agents (Haiku) on straightforward partitions and reserve expensive agents (Opus) for complex analyses. The swarm architecture doesn't care—each agent runs independently. You can even adjust partition sizes based on expected complexity. Small, fast partitions for simple files; larger, more complex partitions for intricate code.


Why Parallel Processing Matters

Before we dive into implementation, let's talk about why you'd want swarms in the first place.

Problem 1: Sequential Processing Is Slow

If you're running a linter against 100 files, and each file takes 1 second to lint, sequential processing takes 100 seconds. Parallel processing with 10 agents cuts that to 10 seconds. That's a 10x speedup just from parallelization.

In real-world workflows—analyzing codebases, running test suites, documenting APIs, validating configurations—sequential processing becomes a bottleneck. Swarms unblock it. And the speedup is nearly linear up to your API quota and local resource limits. Double your agents, roughly halve your time (until you hit rate limits).

But the value isn't just about raw speed. It's about unblocking your team. A developer waiting 90 seconds for code analysis feedback loses context, switches tasks, and struggles to resume. Give them 9 seconds and they stay focused. That's a qualitative difference in developer experience that translates to productivity.

Problem 2: Single Agents Miss Context

A single agent processing all 100 files might miss subtle inconsistencies or patterns because it's overwhelmed by scale. Multiple agents working on smaller chunks can spot issues more clearly, then share findings during aggregation. This is cognitive parallel processing—each agent can focus more deeply on its partition, catching issues a context-overloaded agent would miss.

Consider a code review scenario. One agent reviewing 10 files can provide detailed, contextual feedback because it has focused attention. That same agent reviewing 100 files might provide surface-level feedback because it's cognitively overloaded. By partitioning into 10 agents of 10 files each, you get 10 sets of focused, high-quality analysis instead of 1 set of diluted analysis. The aggregate result is more thorough.

Problem 3: API Rate Limits Require Careful Concurrency

Claude Code uses the Anthropic API, which has rate limits. If you spawn 100 agents at once, you'll hit rate limits fast. Concurrency caps let you control parallelism—spawn 10 agents, wait for results, spawn 10 more—without burning through quota or triggering backoff. This is critical for sustainable swarms.

Rate limiting is a feature, not a bug. It forces you to architect thoughtfully. If you blindly spawn 1000 agents, you're not parallelizing—you're thrashing. By capping concurrency and executing in batches, you maintain throughput without triggering API throttling. Your swarm stays efficient even under load.
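Batching is one approach; another is a rolling window that keeps at most N jobs in flight at all times. A sketch (requires bash 4.3+ for `wait -n`; `run_one` is a hypothetical stand-in for dispatching a single agent):

```shell
#!/bin/bash
# Rolling concurrency cap: never more than MAX_JOBS agents in flight.
MAX_JOBS=3
out=$(mktemp)

run_one() {
    # Stand-in for dispatching a single agent
    sleep 0.1
    echo "done: $1"
}

for i in $(seq 1 10); do
    # If the window is full, block until any one job finishes (bash 4.3+)
    while [ "$(jobs -rp | wc -l)" -ge "$MAX_JOBS" ]; do
        wait -n
    done
    run_one "$i" >> "$out" &
done
wait  # drain the remaining jobs

echo "completed: $(wc -l < "$out" | tr -d ' ')"
```

Unlike batch-and-wait, the window refills as soon as any slot frees up, so a single slow agent doesn't stall the whole batch.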

Problem 4: Resource Isolation

When agents work in isolation (each on a partition), they don't interfere with each other. One agent's memory explosion doesn't crash siblings. One agent's bug doesn't poison the entire run. Isolation is a feature—it makes swarms resilient.

This resilience is crucial for production systems. If you run a single agent against a critical codebase and it crashes halfway through, you've lost an hour's work. If you run 10 agents and one crashes, you've lost 10% of the work and can retry just that partition. That's the difference between "we need to restart this whole thing" and "we'll retry this one chunk."


Partitioning Strategies

The success of a swarm depends entirely on how you partition work. Bad partitioning = wasted parallelism. Good partitioning = real speedup.

Here are the main strategies:

Strategy 1: Partition by File

Perfect for linting, formatting, validation—any task where work per file is independent.

When to use: Analyzing codebases, linting, formatting, static analysis.

Example: 100 JavaScript files → spawn 10 agents, each linting 10 files.

bash
# Partition by file (round-robin keeps every file in partitions 0-9,
# even when the total isn't divisible by 10)
files=$(find . -name "*.js" | sort)

rm -f partition_*.txt
partition_index=0
while IFS= read -r file; do
    [ -z "$file" ] && continue
    echo "$file" >> "partition_$((partition_index % 10)).txt"
    ((partition_index++))
done <<< "$files"

# Dispatch 10 agents, each handling one partition
for i in {0..9}; do
    /dispatch lint-agent-files < partition_$i.txt &
done
wait

Strategy 2: Partition by Module/Directory

Useful when your codebase is organized into modules and each module can be analyzed independently.

When to use: Multi-module projects, microservices, plugin architectures.

Example: 5 modules → spawn 5 agents, each analyzing one module.

bash
# Find all module directories
modules=$(find ./modules -maxdepth 1 -type d | grep -v "^\./modules$")
 
# Dispatch one agent per module
for module in $modules; do
    /dispatch analyze-module "$module" &
done
wait

Strategy 3: Partition by Concern

Abstract partitioning where you divide work by what agents analyze, not where it lives.

When to use: Composite analysis (security, performance, style), multi-dimensional validation.

Example: Code review → spawn agents for security, performance, style, maintainability.

bash
# Partition by concern, not by file
/dispatch security-reviewer ./src &
/dispatch performance-analyzer ./src &
/dispatch style-checker ./src &
/dispatch maintainability-auditor ./src &
wait

Strategy 4: Partition by Test Suite

For parallel testing, split test suites across agents.

When to use: Slow test runs, large test suites, CI/CD pipelines.

Example: 100 test files → spawn 5 agents, each running 20 tests.

bash
# Find all test files and partition them round-robin across 5 agents
test_files=$(find . \( -name "*.test.js" -o -name "*.spec.js" \) | sort)

rm -f test_partition_*.txt
partition_index=0
while IFS= read -r test_file; do
    [ -z "$test_file" ] && continue
    echo "$test_file" >> "test_partition_$((partition_index % 5)).txt"
    ((partition_index++))
done <<< "$test_files"

# Dispatch 5 test agents in parallel
for i in {0..4}; do
    /dispatch test-agent < test_partition_$i.txt &
done
wait

Strategy 5: Partition by API Endpoint

For API documentation, testing, or validation tasks.

When to use: API exploration, endpoint validation, integration testing.

Example: 50 API endpoints → spawn 5 agents, each testing 10 endpoints.
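A sketch in the spirit of the other strategies. `dispatch_agent` is a hypothetical stand-in for a real Claude Code dispatch command; a real agent would exercise the endpoints rather than count them:

```shell
#!/bin/bash
# Partition 50 endpoints into chunks of 10 and dispatch one agent per chunk.
set -e
work=$(mktemp -d)
mkdir "$work/results"

# Generate a sample endpoint list (one endpoint per line)
seq 1 50 | sed 's|^|/api/v1/resource/|' > "$work/endpoints.txt"

# split -l gives portable line-based chunks: chunk_aa, chunk_ab, ...
split -l 10 "$work/endpoints.txt" "$work/chunk_"

dispatch_agent() {
    # Hypothetical agent: here it just counts the endpoints it was given
    wc -l < "$1" | tr -d ' ' > "$work/results/$(basename "$1")"
}

for chunk in "$work"/chunk_*; do
    dispatch_agent "$chunk" &
done
wait

echo "agents run: $(ls "$work/results" | wc -l | tr -d ' ')"
```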


Implementing a Swarm: Full Example

Let's build a complete working example: parallel code analysis. We'll spawn multiple agents to analyze different files in a codebase, each looking for different issues (bugs, style violations, performance anti-patterns), then aggregate the results.

Step 1: Define Your Work Partitions

First, identify what you're analyzing and how to partition it:

bash
#!/bin/bash
# partition-files.sh - Split files into chunks for parallel analysis
 
SOURCE_DIR="${1:-.}"
NUM_AGENTS="${2:-10}"
TEMP_DIR="/tmp/claude-swarm-partitions"
 
# Create temp directory
mkdir -p "$TEMP_DIR"
rm -f "$TEMP_DIR"/*.txt
 
# Find all target files
echo "[*] Scanning for files in $SOURCE_DIR..."
files=$(find "$SOURCE_DIR" \
    -type f \
    \( -name "*.js" -o -name "*.ts" -o -name "*.py" -o -name "*.go" \) \
    -not -path "*/node_modules/*" \
    -not -path "*/.git/*" \
    | sort)
 
# Check emptiness directly: an empty string still counts as 1 line to wc
if [ -z "$files" ]; then
    echo "[-] No files found. Exiting."
    exit 1
fi

file_count=$(echo "$files" | wc -l)
echo "[*] Found $file_count files"
 
# Calculate chunk size
chunk_size=$((file_count / NUM_AGENTS))
if [ "$chunk_size" -eq 0 ]; then
    chunk_size=1
    NUM_AGENTS="$file_count"
fi
 
echo "[*] Partitioning into $NUM_AGENTS agents (~$chunk_size files each)"
 
# Distribute files into partitions
partition_index=0
file_index=0
current_partition="$TEMP_DIR/partition_0.txt"
touch "$current_partition"
 
while IFS= read -r file; do
    echo "$file" >> "$current_partition"
    ((file_index++))

    # Create new partition when chunk is full
    if [ "$file_index" -ge "$chunk_size" ] && [ "$partition_index" -lt $((NUM_AGENTS - 1)) ]; then
        ((partition_index++))
        current_partition="$TEMP_DIR/partition_$partition_index.txt"
        touch "$current_partition"
        file_index=0
    fi
done <<< "$files"
 
echo "[+] Partitions created in $TEMP_DIR"
ls -la "$TEMP_DIR"/partition_*.txt | wc -l | xargs echo "[+] Total partitions:"

Run it to create partitions:

bash
./partition-files.sh ./src 10

This creates 10 partition files, each listing ~10% of your source files.

Step 2: Create an Agent Task Command

Next, create a command that a single agent will execute against its partition:

bash
#!/bin/bash
# analyze-partition.sh - Single agent analyzes its partition
 
PARTITION_FILE="$1"
AGENT_ID="$2"
OUTPUT_DIR="${3:-.}"
 
if [ ! -f "$PARTITION_FILE" ]; then
    echo "[-] Partition file not found: $PARTITION_FILE"
    exit 1
fi
 
echo "[Agent $AGENT_ID] Starting analysis..."
output_file="$OUTPUT_DIR/analysis_$AGENT_ID.json"
 
# Analyze each file in the partition, collecting one JSON entry per file
file_count=0
entries=""
while IFS= read -r file; do
    if [ -z "$file" ]; then
        continue
    fi

    echo "[Agent $AGENT_ID] Analyzing: $file"

    # Basic metrics stand in for real analysis: line count and TODO markers
    line_count=$(wc -l < "$file" 2>/dev/null || echo 0)
    todo_count=$(grep -ci "TODO\|FIXME\|HACK" "$file" 2>/dev/null || true)

    # In a real scenario, you'd invoke Claude Code here:
    # /dispatch analyze-code-quality "$file"

    [ -n "$entries" ] && entries="$entries,"
    entries="$entries{\"file\": \"$file\", \"lines\": ${line_count:-0}, \"todos\": ${todo_count:-0}}"
    ((file_count++))
done < "$PARTITION_FILE"

# Write the agent's results as JSON, now that the entries are collected
{
    echo "{"
    echo "  \"agent_id\": \"$AGENT_ID\","
    echo "  \"timestamp\": \"$(date -u +%Y-%m-%dT%H:%M:%SZ)\","
    echo "  \"files_analyzed\": [$entries],"
    echo "  \"issues\": []"
    echo "}"
} > "$output_file"
 
echo "[Agent $AGENT_ID] Analysis complete. Processed $file_count files."
echo "[+] Results saved to: $output_file"

This is a simplified example. In reality, you'd invoke Claude Code commands or skills here to do the actual analysis.

Step 3: Dispatch Agents in Parallel

Now, the critical part—dispatch all agents and run them in parallel:

bash
#!/bin/bash
# run-swarm.sh - Dispatch and coordinate the agent swarm
 
SOURCE_DIR="${1:-.}"
NUM_AGENTS="${2:-10}"
TEMP_DIR="/tmp/claude-swarm-partitions"
OUTPUT_DIR="./swarm-results"
 
mkdir -p "$OUTPUT_DIR"
rm -f "$OUTPUT_DIR"/*.json
 
echo "[*] Starting Agent Swarm..."
echo "[*] Configuration:"
echo "    Source: $SOURCE_DIR"
echo "    Agents: $NUM_AGENTS"
echo "    Output: $OUTPUT_DIR"
echo ""
 
# Step 1: Partition files
./partition-files.sh "$SOURCE_DIR" "$NUM_AGENTS"
 
# Step 2: Dispatch agents in parallel
echo "[*] Dispatching $NUM_AGENTS agents..."
start_time=$(date +%s)
 
for i in $(seq 0 $((NUM_AGENTS - 1))); do
    partition_file="$TEMP_DIR/partition_$i.txt"
 
    if [ ! -f "$partition_file" ]; then
        echo "[-] Partition $i not found, skipping"
        continue
    fi
 
    # Dispatch agent in background
    (
        ./analyze-partition.sh "$partition_file" "$i" "$OUTPUT_DIR"
    ) &
 
    echo "[+] Agent $i dispatched"
done
 
# Wait for all agents to complete
echo "[*] Waiting for agents to complete..."
wait
 
end_time=$(date +%s)
duration=$((end_time - start_time))
 
echo "[+] All agents completed in ${duration}s"
echo ""
 
# Step 3: Collect and aggregate results
echo "[*] Aggregating results..."
aggregated_output="$OUTPUT_DIR/aggregated_results.json"
 
{
    echo "{"
    echo "  \"timestamp\": \"$(date -u +%Y-%m-%dT%H:%M:%SZ)\","
    echo "  \"total_agents\": $NUM_AGENTS,"
    echo "  \"duration_seconds\": $duration,"
    echo "  \"results\": ["
 
    first=true
    for result_file in "$OUTPUT_DIR"/analysis_*.json; do
        if [ ! -f "$result_file" ]; then
            continue
        fi
 
        if [ "$first" = true ]; then
            first=false
        else
            echo ","
        fi
        cat "$result_file"
    done
 
    echo "  ]"
    echo "}"
} > "$aggregated_output"
 
echo "[+] Aggregated results: $aggregated_output"
wc -l "$aggregated_output" | awk '{print "[+] Total lines: " $1}'

Step 4: Managing Concurrency Limits

If you have many files and want to avoid rate limiting, cap concurrent agents:

bash
#!/bin/bash
# run-swarm-with-limits.sh - Swarm with concurrency control
 
SOURCE_DIR="${1:-.}"
TOTAL_AGENTS="${2:-100}"
CONCURRENT_LIMIT="${3:-10}"  # Max 10 agents at a time
TEMP_DIR="/tmp/claude-swarm-partitions"
OUTPUT_DIR="./swarm-results"
 
mkdir -p "$OUTPUT_DIR"
 
echo "[*] Starting managed swarm..."
echo "[*] Total agents: $TOTAL_AGENTS | Concurrent limit: $CONCURRENT_LIMIT"
 
# Partition files into TOTAL_AGENTS partitions
./partition-files.sh "$SOURCE_DIR" "$TOTAL_AGENTS"
 
# Dispatch agents in batches
for batch_start in $(seq 0 "$CONCURRENT_LIMIT" $((TOTAL_AGENTS - 1))); do
    batch_end=$((batch_start + CONCURRENT_LIMIT - 1))
    if [ "$batch_end" -ge "$TOTAL_AGENTS" ]; then
        batch_end=$((TOTAL_AGENTS - 1))
    fi
 
    echo "[*] Dispatching batch: agents $batch_start-$batch_end"
 
    for i in $(seq $batch_start $batch_end); do
        partition_file="$TEMP_DIR/partition_$i.txt"
 
        (
            ./analyze-partition.sh "$partition_file" "$i" "$OUTPUT_DIR"
        ) &
    done
 
    # Wait for batch to complete before starting next batch
    wait
    echo "[+] Batch $batch_start-$batch_end complete"
done
 
echo "[+] Swarm execution complete"

This approach dispatches agents in waves—10 at a time, wait for them to finish, then dispatch 10 more. It prevents API rate limiting while maintaining parallelism. This is the sweet spot between speed and sustainability.


Deduplication and Result Aggregation

When multiple agents analyze the same codebase, they'll often find duplicate issues—two agents flagging the same bug or style violation. You need to deduplicate and merge results intelligently.

Here's a pattern for aggregating and deduplicating results:

bash
#!/bin/bash
# aggregate-results.sh - Merge and deduplicate agent findings
 
OUTPUT_DIR="${1:-.}"
AGGREGATED_FILE="${OUTPUT_DIR}/final_report.json"
 
echo "[*] Aggregating results from $OUTPUT_DIR..."
 
# Merge all JSON results
jq -s '
    # Tag every issue with the agent that reported it, so deduplication
    # can track which agents agree on a finding
    [.[] | .agent_id as $aid | (.issues[]? + {agent_id: $aid})] as $all |
    {
        metadata: {
            timestamp: (now | todate),
            total_agents: length,
            analysis_scope: "parallel code analysis"
        },
        findings: (
            $all
            # Deduplicate by file + issue type + line number
            | group_by(.file + "|" + .type + "|" + (.line_number | tostring))
            | map({
                file: .[0].file,
                type: .[0].type,
                line_number: .[0].line_number,
                message: .[0].message,
                severity: .[0].severity,
                reported_by_agents: (map(.agent_id) | unique),
                agent_count: (map(.agent_id) | unique | length)
            })
            | sort_by(.severity, .file)
        ),
        summary: {
            total_issues: (
                $all
                | group_by(.file + "|" + .type + "|" + (.line_number | tostring))
                | length
            ),
            by_severity: (
                $all
                | group_by(.severity)
                | map({severity: .[0].severity, count: length})
                | sort_by(.severity)
            ),
            by_file: (
                $all
                | group_by(.file)
                | map({file: .[0].file, count: length})
                | sort_by(.count)
                | reverse
                | .[0:10]
            )
        }
    }
' "$OUTPUT_DIR"/analysis_*.json > "$AGGREGATED_FILE"
 
echo "[+] Results aggregated: $AGGREGATED_FILE"
jq '.summary' "$AGGREGATED_FILE"

This jq pipeline:

  1. Merges all results from agent analyses
  2. Groups issues by file + type + line (deduplicates)
  3. Tracks which agents reported each issue
  4. Generates summary statistics

The reported_by_agents field tells you which agents found the same issue—useful for validating that findings are consistent.
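To see the deduplication key on toy data (a self-contained sketch; assumes `jq` is installed, which the aggregation script already requires):

```shell
#!/bin/bash
# Toy demo of the dedup key: two agents report the identical issue,
# group_by collapses it to one finding that lists both reporters.
dedup_count=$(printf '%s\n' \
  '{"agent_id":"0","issues":[{"file":"a.js","type":"style","line_number":3}]}' \
  '{"agent_id":"1","issues":[{"file":"a.js","type":"style","line_number":3}]}' \
| jq -s '[.[] | .agent_id as $aid | (.issues[] + {agent_id: $aid})]
         | group_by(.file + "|" + .type + "|" + (.line_number | tostring))
         | length')
echo "unique findings: $dedup_count"   # two reports -> 1 unique finding
```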


Resource Management and Best Practices

Swarms are powerful but come with pitfalls. Here's how to avoid them:

Pitfall 1: Spawning Too Many Agents at Once

Problem: Spawn 100 agents immediately → API rate limit → failure.

Solution: Use concurrency batching (as shown above). Keep concurrent agents ≤ 10-20 unless your quota is huge. Start conservative and scale up as you see success.

Pitfall 2: Not Tracking Agent Status

Problem: Dispatch 10 agents, assume they all finish, discover 3 crashed silently.

Solution: Log agent dispatch and completion. Use exit codes. Track which agents succeed/fail.

bash
# Track agent success/failure
agent_results="$OUTPUT_DIR/agent_status.txt"
> "$agent_results"
 
for i in $(seq 0 9); do
    (
        if ./analyze-partition.sh "$TEMP_DIR/partition_$i.txt" "$i" "$OUTPUT_DIR"; then
            echo "$i:SUCCESS" >> "$agent_results"
        else
            echo "$i:FAILED" >> "$agent_results"
        fi
    ) &
done
wait
 
failed_count=$(grep "FAILED" "$agent_results" | wc -l)
if [ "$failed_count" -gt 0 ]; then
    echo "[-] $failed_count agents failed:"
    grep "FAILED" "$agent_results"
fi

Pitfall 3: Over-Partitioning

Problem: Partition into 1000 agents for 100 files. Overhead + API calls > actual work.

Solution: Aim for chunk sizes of 5-50 items per agent. Let agents do meaningful work. The overhead of spawning an agent decreases as the work per agent increases.

Pitfall 4: Ignoring Memory/Disk

Problem: Each agent generates a 10MB result file. 100 agents = 1GB of temp data.

Solution: Stream results to a central location. Clean up intermediate files. Use temp directories wisely.

bash
# Clean up partitions after agents finish
rm -f "$TEMP_DIR"/*.txt
 
# Clean up intermediate agent outputs (keep only aggregated)
rm -f "$OUTPUT_DIR"/analysis_*.json

Pitfall 5: No Failure Recovery

Problem: One agent fails. Entire swarm is incomplete.

Solution: Identify which partitions weren't processed. Re-run only those agents.

bash
# Find which partitions completed
completed=$(ls -1 "$OUTPUT_DIR"/analysis_*.json 2>/dev/null | \
    sed 's/.*analysis_//;s/\.json$//' | sort)

# Find which partitions are missing
# (comm requires both inputs in the same sort order, so sort both plainly)
all_indices=$(seq 0 $((NUM_AGENTS - 1)) | sort)
missing=$(comm -23 <(echo "$all_indices") <(echo "$completed"))
 
if [ -n "$missing" ]; then
    echo "[!] Re-running missing agents: $missing"
    for i in $missing; do
        ./analyze-partition.sh "$TEMP_DIR/partition_$i.txt" "$i" "$OUTPUT_DIR" &
    done
    wait
fi

Monitoring and Observability

When you have 10+ agents running in parallel, you need visibility. Here's a pattern for logging:

bash
#!/bin/bash
# swarm-logger.sh - Central logging for swarm operations
 
log_dir="${1:-.swarm-logs}"
mkdir -p "$log_dir"
 
# Function to log messages
log_message() {
    local agent_id="$1"
    local status="$2"
    local message="$3"
 
    timestamp=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    echo "[$timestamp] [Agent $agent_id] [$status] $message" \
        >> "$log_dir/agent_$agent_id.log"
    echo "[$timestamp] [Agent $agent_id] [$status] $message" \
        >> "$log_dir/swarm.log"
}
 
# Main swarm with logging
for i in $(seq 0 9); do
    (
        log_message "$i" "START" "Beginning analysis"
 
        # Do work here
        ./analyze-partition.sh "/tmp/partition_$i.txt" "$i" "./results"
        exit_code=$?
 
        if [ "$exit_code" -eq 0 ]; then
            log_message "$i" "SUCCESS" "Completed analysis"
        else
            log_message "$i" "FAILED" "Exit code $exit_code"
        fi
    ) &
done
 
wait
 
echo "[*] Swarm complete. Logs:"
tail -20 "$log_dir/swarm.log"

Advanced: Scaling to Hundreds of Agents

For truly large-scale swarms (100+ agents), consider:

  1. Persistent job queue: Use a database or message queue (Redis, RabbitMQ) to queue work partitions. Agents pull from the queue.

  2. Distributed coordination: Agents report status to a central coordinator. The coordinator detects failures and dispatches retries.

  3. Dynamic partitioning: Adjust chunk sizes based on previous run times. Slow partitions get split further.

  4. Result streaming: Don't wait for all results. Stream findings to a central service as agents complete.

Here's a skeleton for a queued approach:

bash
#!/bin/bash
# swarm-with-queue.sh - Distributed swarm using a work queue
 
QUEUE_FILE="/tmp/swarm-work-queue.jsonl"
NUM_WORKERS=10
 
# Populate queue with work items (start fresh each run)
rm -f "$QUEUE_FILE"
echo "[*] Creating work queue..."
find ./src -name "*.js" | while IFS= read -r file; do
    echo "{\"file\": \"$file\", \"status\": \"pending\"}" >> "$QUEUE_FILE"
done
 
# Worker loop: each worker processes items from queue
worker() {
    local worker_id="$1"
    local processed=0
 
    while true; do
        # Get next pending item (NOT atomic: two workers can race on the
        # same item; acceptable for a sketch, use a real queue in production)
        item=$(grep '"status": "pending"' "$QUEUE_FILE" | head -1)

        if [ -z "$item" ]; then
            # Queue empty, worker done
            break
        fi

        file=$(echo "$item" | jq -r '.file')
        echo "[Worker $worker_id] Processing: $file"

        # Do work (append, so one worker's earlier results aren't overwritten)
        ./analyze-file.sh "$file" >> "/tmp/results_worker_$worker_id.jsonl"

        # Mark as done (simplified; use a database for real concurrency)
        sed -i "s|$file.*|$file\", \"status\": \"done\"}|" "$QUEUE_FILE"
 
        ((processed++))
    done
 
    echo "[Worker $worker_id] Processed $processed items"
}
 
# Spawn workers
for i in $(seq 1 $NUM_WORKERS); do
    worker "$i" &
done
 
wait
echo "[+] Queue processing complete"

This is a simplified example. In production, use proper message queues and databases for concurrency control.


Summary

Agent swarms transform long-running sequential tasks into fast, parallel operations. The pattern is simple: partition work → fan-out agents → fan-in results → deduplicate + aggregate.

Here's what you learned:

  • Swarms are fan-out/fan-in: Divide work, dispatch agents, collect results.
  • Partitioning strategies matter: By file, module, concern, test, endpoint.
  • Concurrency limits are essential: Batch agents to avoid API rate limits.
  • Deduplication is critical: Merge duplicate findings across agents.
  • Monitoring is your friend: Log everything. Track failures. Re-run missing work.
  • Scaling requires infrastructure: For 100+ agents, use queues and distributed coordination.

Whether you're linting a codebase, running tests, analyzing code quality, or documenting APIs, swarms give you 10x speedup with thoughtful partitioning and careful concurrency management.

The bottleneck is no longer speed—it's your willingness to architect for parallelism.


Real-World Case Studies

Case Study 1: Parallel Code Review for 10,000 Files

A large team needed to review 10,000 files for a specific security pattern. Sequential review would take 200+ minutes. Using agent swarms with 20 concurrent agents and concern-based partitioning:

  • Security review agent: Scanning for vulnerability patterns
  • Performance agent: Checking for performance anti-patterns
  • Architecture agent: Verifying against architecture decisions
  • Standards agent: Checking for code style violations

Result: Parallel execution across 20 agents handling different concerns on the same files, completed in 12 minutes. They found:

  • 340 security vulnerabilities
  • 210 performance improvements
  • 450 style violations

Each issue was reported by multiple agents, giving high confidence in findings. The deduplication step unified findings and showed which issues were consensus (found by multiple agents) vs. one-agent-only concerns.

Case Study 2: Scaling Test Execution

A team's test suite grew from 500 to 5,000 tests, slowing CI/CD from 8 minutes to 45 minutes. Using test-suite partitioning with 10 concurrent test agents:

  • Agent 1-5: Unit tests (fast)
  • Agent 6-8: Integration tests (slower)
  • Agent 9-10: E2E tests (slowest)

Agents were weighted by expected duration. Result: Test execution cut to 8 minutes—same speed despite 10x more tests. The unified report at the end showed exactly which tests failed, across all agents.

Case Study 3: Documentation Generation at Scale

Documenting 500 API endpoints manually took weeks. Using endpoint-partitioning:

  • 50 agents dispatched in batches of 10
  • Each agent documented 10 endpoints
  • Results aggregated into a unified API reference

Each agent generated documentation for its endpoints, including examples, error cases, and usage notes. Deduplication merged common patterns (e.g., authentication sections). Result: Complete, consistent API documentation in hours instead of weeks.

The Economics of Swarms

When deciding whether to use swarms, consider the math:

  • Sequential: 100 files × 1 second per file = 100 seconds
  • Swarm with 10 agents: 100 seconds ÷ 10 = 10 seconds (10x speedup)
  • But with overhead: dispatch, aggregation, and deduplication add ~2 seconds
  • Real gain: ~8x speedup
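That arithmetic generalizes into a quick back-of-envelope helper for deciding whether a swarm is worth it:

```python
def swarm_speedup(items, per_item_s, agents, overhead_s):
    """Estimated speedup of a swarm over sequential processing."""
    sequential = items * per_item_s
    swarm = sequential / agents + overhead_s
    return sequential / swarm

# 100 files at 1 s each, 10 agents, ~2 s of dispatch/aggregation overhead
speedup = swarm_speedup(100, 1.0, 10, 2.0)  # 100 / 12 ≈ 8.3x
```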

The break-even point is around 50 items. Below that, sequential is fine. Above that, swarms pay for themselves in time saved.

But the real value isn't just time. It's also quality through redundancy. Multiple agents analyzing the same codebase catch different issues. A swarm might find 100 issues; a single agent might find 75. The difference is value.

Measuring Swarm Performance

Before you declare victory with a swarm, measure actual performance. Log the wall-clock time from dispatch to final result and compare it to your sequential baseline.

But don't stop there. Measure quality too. Are swarms finding the same issues as sequential runs? More issues? Different issues?

Measure cost as well. Yes, swarms run faster, but they consume more API tokens (10 agents instead of 1). Is the speedup worth the additional cost? For some tasks yes, for others no. Track cost per result: if sequential costs $1 to find 100 issues and swarms cost $5 to find 105 issues, the math might not work. These measurements guide when to use swarms.

Common Mistakes and How to Avoid Them

Mistake 1: Using Swarms Too Early

Don't optimize prematurely. If your sequential process takes 10 seconds, don't build a swarm. Build swarms when you have real pain: tasks taking minutes or hours.

Mistake 2: Over-Partitioning

Spawning 1000 agents for 100 files is wasteful. The overhead of managing 1000 agents exceeds the savings from parallelism. Aim for 5-50 items per agent.
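A simple chunking helper keeps partition sizes in a sensible range; the 10-items-per-agent default and 50-agent cap are assumed heuristics, not fixed rules:

```python
import math

def partition(items, target_per_agent=10, max_agents=50):
    """Chunk items so each agent gets a reasonable slice of work."""
    n_agents = min(max_agents, max(1, math.ceil(len(items) / target_per_agent)))
    size = math.ceil(len(items) / n_agents)
    return [items[i:i + size] for i in range(0, len(items), size)]

chunks = partition(list(range(100)))  # 10 chunks of 10 items each
```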

Mistake 3: Assuming All Partitions Finish Simultaneously

They don't. Some agents will finish in 2 seconds, others in 20. Don't wait for the slowest agent to finish before accepting results from fast ones. Stream results as they arrive.
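Streaming results in completion order can be sketched with `asyncio.as_completed`; the `agent` coroutine here is a stub standing in for a real agent call:

```python
import asyncio
import random

async def agent(partition_id):
    # Stub: real agents finish at unpredictable times.
    await asyncio.sleep(random.uniform(0, 0.05))
    return partition_id, f"results for partition {partition_id}"

async def run_swarm(n_agents):
    tasks = [asyncio.create_task(agent(i)) for i in range(n_agents)]
    collected = {}
    for finished in asyncio.as_completed(tasks):  # yields in completion order
        pid, result = await finished
        collected[pid] = result  # process immediately; don't wait for the slowest
    return collected

results = asyncio.run(run_swarm(10))
```

Compare this with `asyncio.gather`, which blocks until every task finishes: `as_completed` lets you report, log, or aggregate each agent's output the moment it lands.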

Mistake 4: Forgetting to Handle Partial Failure

If 3 out of 10 agents fail, you now have 70% of the work done. Plan for this: retry failed agents, handle missing results gracefully, show partial results with metadata about what failed.
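A minimal retry loop under these assumptions, with `flaky_agent` as a stub that fails transiently on its first attempt:

```python
import asyncio

_attempts = {}

async def flaky_agent(pid):
    # Stub: partitions 0-2 fail once, then succeed (simulates transient errors).
    _attempts[pid] = _attempts.get(pid, 0) + 1
    if pid < 3 and _attempts[pid] == 1:
        raise RuntimeError(f"agent {pid} failed transiently")
    return pid, "ok"

async def run_with_retries(n_partitions, max_retries=2):
    done = {}
    pending = list(range(n_partitions))
    for _ in range(max_retries + 1):
        results = await asyncio.gather(
            *(flaky_agent(p) for p in pending), return_exceptions=True)
        retry = []
        for pid, result in zip(pending, results):
            if isinstance(result, Exception):
                retry.append(pid)          # schedule this partition for retry
            else:
                done[result[0]] = result[1]
        pending = retry
        if not pending:
            break
    return done, pending  # pending holds partitions that never succeeded

done, still_failed = asyncio.run(run_with_retries(10))
```

`return_exceptions=True` keeps one failed agent from aborting the whole batch, and whatever remains in `pending` after the retries is the metadata you attach to your partial results.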

Mistake 5: Creating Tight Coupling Between Agents

Each agent should be independent. Don't have Agent 1's output be Agent 2's input in the same swarm. That's sequential, not parallel. If you need dependencies, do them in separate swarm batches.

Next Steps

Start small. Pick a task that:

  • Takes more than a minute sequentially
  • Can be partitioned into independent chunks
  • Benefits from multiple perspectives

Build your first swarm. Test it. Measure the speedup. Learn from it. Then apply swarms to bigger challenges.

The pattern scales from 10 agents to 1000. The principles stay the same: partition, dispatch, aggregate, deduplicate. Master these and you can parallelize almost any task.


The Cultural Shift: From Sequential Thinking to Parallel Thinking

There's a mental model shift that happens when you start building swarms. Your brain gradually stops thinking in sequential steps and starts thinking in parallel chunks. Instead of "analyze these 100 files one by one," you think "partition these 100 files into 10 groups of 10, analyze each group in parallel, aggregate results."

This shift affects how you design tasks. You start asking different questions: Is this task parallelizable? Can I partition it? Can I run partitions independently? When you ask these questions repeatedly, you start seeing parallelization opportunities everywhere.

A task that seemed sequential suddenly becomes parallel. Updating documentation? Partition by section. Running tests? Partition by test suite. Reviewing code? Partition by concern (security, performance, style). Even seemingly sequential workflows have parallelization opportunities if you look for them.

This thinking extends beyond swarms. It changes how you architect systems, how you approach problems, how you think about scalability. The mindset is valuable even in situations where you don't use swarms.

The Sustainability Question: When to Use Swarms

Not every task benefits from swarms. Sometimes sequential is better. A task where each partition depends on previous results? Sequential. A task with minimal work per partition? The overhead of spawning agents exceeds savings. A task that's already fast enough? Why complicate it?

Swarms are good for:

  • Tasks taking minutes or hours sequentially
  • Tasks with thousands of independent items
  • Tasks where multiple perspectives add value
  • Tasks where failure of one partition shouldn't stop others

Swarms are bad for:

  • Tasks taking seconds sequentially (overhead > savings)
  • Tasks with tight dependencies
  • Tasks with very few items (why spawn 10 agents for 10 items?)
  • Tasks where strict sequential ordering is essential

The right question is: does this task have enough friction that parallelization saves time and adds value? If yes, build a swarm. If no, keep it simple.

The Long Game: Building Expertise

Using agent swarms effectively is a skill that develops over time. Your first swarm might be awkward—too many agents, poor partitioning, inefficient aggregation. Your fifth swarm will be smooth—good partition size, clean aggregation, optimal concurrency.

The learning curve is gentle. Start with simple file partitioning. Graduate to module partitioning. Then to concern-based partitioning. Then to custom strategies for your specific domain.

Each swarm teaches you something about your problem domain, your tools, your infrastructure. You build intuition about optimal partition sizes, about which concerns parallelize well, about where bottlenecks appear.

This accumulated expertise is what separates teams that use swarms occasionally from teams that build swarm-based systems that are core to their operations. The best swarms aren't magic. They're the result of experienced teams thinking carefully about parallelization.

Final Thoughts: Building a Swarm-Based Culture

The real power of agent swarms isn't just in their ability to parallelize work. It's in how they fundamentally change how you think about large-scale problems. Once you've experienced the joy of dispatching 20 agents to analyze 1000 files and getting comprehensive results in minutes instead of days, you can't unsee it. You start approaching every problem with a parallelization mindset.

That shift in perspective has ripple effects beyond engineering. Product teams start asking "can we parallelize this?" Managers start thinking about distributing work intelligently instead of linearly. Your entire organization becomes more comfortable with distributed thinking, which is increasingly essential as systems become more complex.

The economics compound over time. Your first swarm might save you a few hours. Your tenth swarm might save your company days of productivity. By your hundredth swarm, you're operating at a level of scale that would be impossible with sequential thinking. Teams that embrace swarms don't just work faster—they think differently, design differently, and solve problems differently.

The pattern is learnable, the tools are accessible, and the benefits are immediate and measurable. Start with a simple partitioning strategy, implement error handling, measure the speedup. Iterate. Before you know it, you'll have built the muscle memory to parallelize almost any task. And that's when the real acceleration happens.


-iNet
