April 30, 2025
Claude AI Development

Agent SDK - Tool Use and Custom Tool Definitions

Here's the thing about AI agents: they're only as useful as the tools you give them. Claude Code's Agent SDK lets you do something powerful: define exactly what Claude can access, how it should access it, and what it should do with the results. No magic, no guessing. Just you, Claude, and the tools you decide to connect.

In this article, we're diving deep into custom tool definitions, the SDK's validation mechanisms, and how to build tools that make your agents genuinely productive. If you've ever wondered how to constrain Claude's access or extend its capabilities beyond the defaults, this is your guide.

Table of Contents
  1. Why Custom Tools Matter More Than You Think
  2. Anatomy of a Custom Tool Definition
  3. Building Safe, Focused Tools
  4. Example 1: Deployment Tool (Good vs. Bad)
  5. Example 2: Data Query Tool (Good vs. Bad)
  6. Registering Tools with Claude Code
  7. Handling Tool Errors Gracefully
  8. Tool Performance and Optimization
  9. Tool Composition: Building Complex Capabilities from Simple Tools
  10. Testing Your Tools
  11. Monitoring Tool Usage

Why Custom Tools Matter More Than You Think

Think about your typical workflow. You've got APIs that do important things. You've got file systems with sensitive data. You've got databases with business logic baked in. Claude's default toolset—Bash, Read, Write—gives you a foundation, but it's blunt. It's like handing someone a screwdriver when you really need a surgeon's scalpel.

This is where custom tools become transformative. The default tools are generic. They're powerful, but they require Claude to understand your system's internals. If you give Claude raw Bash access, it needs to understand your deployment infrastructure, your database connection strings, your API authentication schemes. You're pushing domain knowledge into Claude's context. You're asking Claude to operate at the wrong level of abstraction.

Custom tools solve this by raising the abstraction level. Instead of exposing raw Bash, expose a deploy_service tool that handles deployment details internally. Instead of exposing raw database access, expose domain-specific queries like find_user_by_email that enforce safety and performance constraints. Claude operates at the right level of abstraction. The tool handles the complexity.

The benefits cascade:

Safety: A custom tool can validate inputs before executing. It can log all invocations. It can reject dangerous operations.

Efficiency: Abstraction lets you optimize. A raw SELECT query might be slow, but a domain-specific find_user_by_email can use indexes and caching.

Clarity: Claude understands what the tool does without needing to know how. Less context needed for the same capability.

Auditability: Because every call passes through code you control, every custom tool call can be logged with full context. You know exactly what Claude tried to do and whether it succeeded.
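
To make the safety and auditability points concrete, here is a minimal sketch of a wrapper that records every invocation. The `Tool` shape, `withAudit`, and the `AuditEntry` fields are assumptions made up for this illustration; they are not part of any SDK.

```typescript
// Hypothetical tool shape used for illustration only.
type ToolResult = Record<string, unknown>;

interface Tool {
  name: string;
  execute: (input: Record<string, unknown>) => Promise<ToolResult>;
}

interface AuditEntry {
  tool: string;
  input: Record<string, unknown>;
  ok: boolean;
  durationMs: number;
}

// Wrap a tool so every call is appended to `log`, whether it succeeds,
// returns a structured error, or throws.
function withAudit(tool: Tool, log: AuditEntry[]): Tool {
  return {
    name: tool.name,
    execute: async (input) => {
      const start = Date.now();
      try {
        const result = await tool.execute(input);
        // Treat a returned { error: ... } object as a failed call.
        const ok = !("error" in result);
        log.push({ tool: tool.name, input, ok, durationMs: Date.now() - start });
        return result;
      } catch (err) {
        log.push({ tool: tool.name, input, ok: false, durationMs: Date.now() - start });
        throw err;
      }
    },
  };
}
```

In a real system the `log` array would probably be replaced by whatever logging pipeline you already run; the point is only that the record is written in one place, outside Claude's control.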

Anatomy of a Custom Tool Definition

A tool definition tells Claude: "Hey, I have this capability available. Here's what it does, what parameters it needs, and what it will return."

typescript
// Example: Database Query Tool
// (assumes `db` and `isValidEmail` are defined elsewhere in your codebase)
 
const userLookupTool = {
  name: "find_user_by_email",
 
  description:
    "Find a user record by email address. Returns user ID, name, created_date.",
 
  input_schema: {
    type: "object",
    properties: {
      email: {
        type: "string",
        description: "Email address to look up (must be valid email format)",
      },
    },
    required: ["email"],
  },
 
  execute: async (params: { email: string }) => {
    // Validation
    if (!isValidEmail(params.email)) {
      return { error: "Invalid email format", code: "INVALID_INPUT" };
    }
 
    // Execute query
    try {
      const user = await db.users.findOne({ email: params.email });
 
      if (!user) {
        return { found: false, message: "User not found" };
      }
 
      // Return only safe fields
      return {
        found: true,
        user: {
          id: user.id,
          name: user.name,
          email: user.email,
          created_date: user.created_date,
        },
      };
    } catch (error) {
      return { error: "Database query failed", code: "DB_ERROR" };
    }
  },
};

The tool has four essential parts:

  1. Name: How Claude refers to the tool
  2. Description: What the tool does (Claude reads this to understand when to use it)
  3. Input Schema: What parameters the tool accepts (JSON Schema format)
  4. Execute Function: The actual implementation (it runs in your own process; only the first three parts are sent to Claude)
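
One way to keep those four parts together in code while sending Claude only the descriptive ones is sketched below. The `CustomTool` interface and the helpers `toApiTool` and `indexByName` are names invented for this example, and the schema type covers only the subset of JSON Schema used in this article.

```typescript
// JSON Schema subset used by the examples in this article (an assumption).
interface InputSchema {
  type: "object";
  properties: Record<string, unknown>;
  required?: string[];
}

// Local tool shape: the three descriptive fields plus a local `execute`.
interface CustomTool {
  name: string;
  description: string;
  input_schema: InputSchema;
  execute: (input: Record<string, unknown>) => Promise<unknown>;
}

// Keep only the fields that describe the tool; `execute` stays in your process.
function toApiTool(tool: CustomTool) {
  const { name, description, input_schema } = tool;
  return { name, description, input_schema };
}

// Build a name -> tool lookup for dispatching tool calls locally.
function indexByName(tools: CustomTool[]): Map<string, CustomTool> {
  return new Map(tools.map((t): [string, CustomTool] => [t.name, t]));
}
```

With this split, `tools.map(toApiTool)` is what goes into the request, and `indexByName(tools)` is what you consult when Claude asks for a tool by name.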

Building Safe, Focused Tools

The best custom tools are narrowly focused and heavily validated. Resist the temptation to build one huge tool that does everything. Build small tools that do one thing well.

Example 1: Deployment Tool (Good vs. Bad)

❌ BAD: Too Generic

typescript
const deployTool = {
  name: "run_command",
  description: "Run any shell command",
  input_schema: {
    type: "object",
    properties: {
      command: { type: "string" },
    },
  },
  execute: async (params) => {
    return exec(params.command);
  },
};

This is dangerous. The tool accepts any string, so one misunderstanding, distraction, or bad assumption is enough for Claude to run rm -rf /. Nothing in the tool limits the damage.
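
If you do need shell-level operations, one safer pattern is an allowlist of named operations with fixed arguments. The sketch below is hypothetical: the operation names and the commands behind them are placeholders, and `resolveOperation` only decides what would run without executing anything.

```typescript
interface Operation {
  program: string;
  args: readonly string[];
}

// Hypothetical allowlist: operation name -> fixed program and arguments.
const ALLOWED_OPERATIONS: ReadonlyMap<string, Operation> = new Map([
  ["disk_usage", { program: "df", args: ["-h"] }],
  ["list_releases", { program: "git", args: ["tag", "--list"] }],
]);

type Resolved =
  | { ok: true; program: string; args: string[] }
  | { ok: false; error: string; code: string };

// Resolve a requested operation to a fixed argv, or a structured error.
// No part of the command line comes from the model's free-form text.
function resolveOperation(requested: string): Resolved {
  const op = ALLOWED_OPERATIONS.get(requested);
  if (!op) {
    return { ok: false, error: "Unknown operation", code: "INVALID_OPERATION" };
  }
  return { ok: true, program: op.program, args: [...op.args] };
}
```

The resolved `program` and `args` can then be handed to `execFile` from `node:child_process`, which runs the program directly instead of passing a string through a shell.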

✅ GOOD: Narrowly Focused, Validated

typescript
const deployServiceTool = {
  name: "deploy_service",
  description:
    "Deploy a service to production. Validates configuration, runs health checks, rolls back on failure.",
 
  input_schema: {
    type: "object",
    properties: {
      service_name: {
        type: "string",
        description:
          "Service to deploy (must be one of: api, auth, payments, notifications)",
        enum: ["api", "auth", "payments", "notifications"],
      },
      version: {
        type: "string",
        description:
          "Git tag or commit SHA to deploy (must match released version)",
      },
      environment: {
        type: "string",
        enum: ["staging", "production"],
        description: "Deployment environment",
      },
    },
    required: ["service_name", "version", "environment"],
  },
 
  execute: async (params) => {
    // Whitelist known services
    const validServices = ["api", "auth", "payments", "notifications"];
    if (!validServices.includes(params.service_name)) {
      return { error: "Unknown service", code: "INVALID_SERVICE" };
    }
 
    // Verify version is released
    const releaseExists = await checkGitTag(params.version);
    if (!releaseExists) {
      return {
        error: "Version not found in releases",
        code: "UNKNOWN_VERSION",
      };
    }
 
    // Protect production
    if (params.environment === "production") {
      // Additional safeguards for production
      const approval = await requireApproval();
      if (!approval) {
        return {
          error: "Production deployment requires approval",
          code: "APPROVAL_REQUIRED",
        };
      }
    }
 
    // Execute deployment with safety checks
    try {
      const result = await deployService(
        params.service_name,
        params.version,
        params.environment,
      );
 
      // Run health checks
      const healthCheck = await runHealthChecks(params.service_name);
      if (!healthCheck.passed) {
        // Automatic rollback on health check failure
        await rollback(params.service_name, params.environment);
        return {
          error: "Deployment health check failed, rolled back",
          code: "HEALTH_CHECK_FAILED",
          details: healthCheck.failures,
        };
      }
 
      return {
        success: true,
        message: `Successfully deployed ${params.service_name} v${params.version} to ${params.environment}`,
        deployment_id: result.id,
        timestamp: new Date().toISOString(),
      };
    } catch (error) {
      return {
        error: "Deployment failed",
        code: "DEPLOYMENT_ERROR",
        details: error.message,
      };
    }
  },
};

This tool is focused. It does one thing: deploy a service safely. It validates inputs and enforces constraints in code: Claude can't deploy an unknown service or an unreleased version, and a production deployment stops unless it is approved.
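
A cheap extra check is to reject malformed version strings before asking Git about them at all. The sketch below assumes release tags look like 1.4.2 or v1.4.2 and commits are given as full 40-character SHAs; adjust the patterns to your own conventions. `classifyVersion` is a hypothetical helper, not something the deployment tool above defines.

```typescript
// Assumed conventions: tags like "1.4.2" or "v1.4.2", commits as full SHAs.
const SEMVER_TAG = /^v?\d+\.\d+\.\d+$/;
const FULL_SHA = /^[0-9a-f]{40}$/;

// Classify the version string without touching Git or the shell.
function classifyVersion(version: string): "tag" | "sha" | "invalid" {
  if (SEMVER_TAG.test(version)) return "tag";
  if (FULL_SHA.test(version)) return "sha";
  return "invalid";
}
```

Anything classified as "invalid" can be rejected with a structured error immediately, so strings like `main; rm -rf /` never reach a function that talks to Git.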

Example 2: Data Query Tool (Good vs. Bad)

❌ BAD: Too Permissive

typescript
const queryTool = {
  name: "query_database",
  description: "Execute any SQL query against the database",
  input_schema: {
    type: "object",
    properties: {
      sql: { type: "string", description: "SQL query to execute" },
    },
  },
  execute: async (params) => {
    return db.query(params.sql);
  },
};

Claude could delete all your data. It could write a destructive query by accident. It could leak sensitive information.

✅ GOOD: Constrained, Safe

typescript
const userQueryTool = {
  name: "find_users",
  description:
    "Find users by criteria. Returns ID, name, email, and join date.",
 
  input_schema: {
    type: "object",
    properties: {
      criteria: {
        type: "object",
        properties: {
          email: {
            type: "string",
            description: "Exact email match (optional)",
          },
          name_contains: {
            type: "string",
            description: "Partial name match (optional)",
          },
          created_after: {
            type: "string",
            description: "ISO date string (optional)",
          },
          limit: {
            type: "integer",
            minimum: 1,
            maximum: 100,
            description: "Max results (default 10)",
          },
        },
      },
    },
  },
 
  execute: async (params) => {
    const criteria = params.criteria || {};
 
    // Build safe query using ORM
    let query = db.users.find();
 
    if (criteria.email) {
      query = query.where("email", "=", criteria.email);
    }
 
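    if (criteria.name_contains) {
      // The query builder binds this value as a parameter (prevents SQL injection).
      // Note: % and _ in the input still act as LIKE wildcards.
      query = query.where("name", "LIKE", `%${criteria.name_contains}%`);
    }

    if (criteria.created_after) {
      const createdAfter = new Date(criteria.created_after);
      // Reject unparseable dates instead of passing an Invalid Date to the query
      if (Number.isNaN(createdAfter.getTime())) {
        return {
          error: "created_after must be an ISO date string",
          code: "INVALID_INPUT",
        };
      }
      query = query.where("created_at", ">=", createdAfter);
    }

    // Enforce limit (clamped to 1..100 even if schema validation is bypassed)
    const limit = Number.isInteger(criteria.limit) ? criteria.limit : 10;
    query = query.limit(Math.max(1, Math.min(limit, 100)));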
 
    // Execute and return safe fields only
    const users = await query.select("id", "name", "email", "created_at");
 
    return {
      found: users.length,
      users: users.map((u) => ({
        id: u.id,
        name: u.name,
        email: u.email,
        created_date: u.created_at.toISOString(),
      })),
    };
  },
};

This tool is safe. Claude can only query by whitelisted fields. The query is parameterized (prevents injection). The limit is capped. Only safe fields are returned. Claude has the capability it needs without dangerous flexibility.
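
One detail parameterization does not cover: in a LIKE pattern, `%` and `_` inside the search text still act as wildcards. If you want partial name matches to be literal, a small escaping helper is one option. The sketch assumes your database treats backslash as the LIKE escape character (or that your query adds an explicit ESCAPE clause); `escapeLikePattern` and `containsPattern` are hypothetical names.

```typescript
// Escape LIKE metacharacters so user text is matched literally.
// Assumes backslash is the LIKE escape character in your database.
function escapeLikePattern(text: string): string {
  return text.replace(/[\\%_]/g, (ch) => `\\${ch}`);
}

// Build a "contains" pattern around the escaped text.
function containsPattern(text: string): string {
  return `%${escapeLikePattern(text)}%`;
}
```

The pattern is still passed as a bound parameter; escaping only changes what the pattern matches, not how it reaches the database.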

Registering Tools with Claude Code

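Once you've defined tools, register them with your agent. The example below uses the Messages API from the @anthropic-ai/sdk package and runs the tool-use loop by hand: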

typescript
// agent-sdk-setup.ts
 
import Anthropic from "@anthropic-ai/sdk";
 
const client = new Anthropic();
 
const tools = [
  userLookupTool,
  deployServiceTool,
  userQueryTool,
  // ... more custom tools
];
 
async function runAgent(userRequest: string) {
  const messages: Anthropic.MessageParam[] = [
    {
      role: "user",
      content: userRequest,
    },
  ];
 
  let response = await client.messages.create({
    model: "claude-3-5-sonnet-20241022",
    max_tokens: 1024,
    system: "You are a helpful assistant with access to custom tools.",
    tools: tools,
    messages: messages,
  });
 
  // Process tool calls
  while (response.stop_reason === "tool_use") {
    // A single response can contain several tool_use blocks,
    // and each one needs a matching tool_result
    const toolResults: Anthropic.ToolResultBlockParam[] = [];

    for (const block of response.content) {
      if (block.type !== "tool_use") continue;

      // Find the tool definition
      const tool = tools.find((t) => t.name === block.name);

      // Execute the tool, or report an unknown tool back to Claude
      const toolResult = tool
        ? await tool.execute(block.input)
        : { error: `Tool not found: ${block.name}`, code: "UNKNOWN_TOOL" };

      toolResults.push({
        type: "tool_result",
        tool_use_id: block.id,
        content: JSON.stringify(toolResult),
      });
    }

    // Add assistant message and tool results
    messages.push({
      role: "assistant",
      content: response.content,
    });

    messages.push({
      role: "user",
      content: toolResults,
    });
 
    // Get next response
    response = await client.messages.create({
      model: "claude-3-5-sonnet-20241022",
      max_tokens: 1024,
      system: "You are a helpful assistant with access to custom tools.",
      tools: tools,
      messages: messages,
    });
  }
 
  // Extract final text response
  const textBlock = response.content.find((block) => block.type === "text");
 
  if (textBlock && textBlock.type === "text") {
    console.log("Assistant:", textBlock.text);
  }
}
 
// Usage
runAgent("Find the user with email alice@example.com");

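Illustrative trace (the code above prints only the final Assistant line; Claude's tool choice and wording will vary):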

Tool calls made: find_users with criteria {email: "alice@example.com"}
Tool result: { found: 1, users: [{ id: 123, name: "Alice", email: "alice@example.com", created_date: "2024-01-15T..." }] }
Assistant: I found the user Alice (ID: 123). She created her account on January 15, 2024.

Handling Tool Errors Gracefully

Tools will fail. Network timeouts, database errors, validation failures. How you handle these failures determines whether the agent can recover or gets stuck.

typescript
const resilientTool = {
  name: "fetch_external_data",
  description: "Fetch data from external API with automatic retries",
 
  input_schema: {
    type: "object",
    properties: {
      endpoint: { type: "string", description: "API endpoint" },
      timeout_seconds: { type: "integer", minimum: 1, maximum: 30 },
    },
  },
 
  execute: async (params) => {
    const maxRetries = 3;
    let lastError;
 
    for (let attempt = 1; attempt <= maxRetries; attempt++) {
      try {
        const response = await fetch(params.endpoint, {
          // fetch has no `timeout` option; abort via a signal instead
          signal: AbortSignal.timeout((params.timeout_seconds || 10) * 1000),
          headers: { "User-Agent": "Claude Agent" },
        });
 
        if (!response.ok) {
          throw new Error(`HTTP ${response.status}: ${response.statusText}`);
        }
 
        const data = await response.json();
 
        return {
          success: true,
          data: data,
          retrieved_at: new Date().toISOString(),
        };
      } catch (error) {
        lastError = error;
 
        // Don't retry on client errors (4xx)
        if (error.message.includes("HTTP 4")) {
          return {
            error: "Client error",
            code: "INVALID_REQUEST",
            message: error.message,
          };
        }
 
        // Retry on server errors and network timeouts
        if (attempt < maxRetries) {
          const waitMs = Math.pow(2, attempt) * 1000; // Exponential backoff
          await new Promise((resolve) => setTimeout(resolve, waitMs));
          continue;
        }
      }
    }
 
    return {
      error: "Request failed after retries",
      code: "FETCH_FAILED",
      last_error: lastError.message,
      retries_attempted: maxRetries,
    };
  },
};

When tools fail, return structured error information. Claude can understand the failure and decide what to do—retry, use a different tool, ask for help, etc.
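
When you hand a failure back to Claude, the Messages API lets a tool_result block carry an `is_error` flag. Here is a minimal sketch of a helper that sets it, relying on this article's convention that failures are objects with an `error` field. `toToolResultBlock` is a hypothetical name, and the interface is a local copy of the block shape rather than an SDK import.

```typescript
// Local copy of the tool_result block shape used in this article.
interface ToolResultBlock {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
}

// Convention from this article: failures are objects with an `error` field.
function toToolResultBlock(toolUseId: string, result: unknown): ToolResultBlock {
  const failed =
    typeof result === "object" && result !== null && "error" in result;

  return {
    type: "tool_result",
    tool_use_id: toolUseId,
    content: JSON.stringify(result),
    // Only include the flag when the call actually failed.
    ...(failed ? { is_error: true } : {}),
  };
}
```

The structured `code` and `message` fields still travel in `content`, so Claude sees both the flag and the details it needs to pick a next step.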

Tool Performance and Optimization

As your agents use tools more, performance becomes critical. Each tool call blocks the agent loop until it returns, so a tool that takes 30 seconds adds 30 seconds to every turn that uses it.

typescript
// Cache expensive results
 
const cacheStore = new Map();
const cacheTTL = 60000; // 1 minute
 
const userLookupWithCache = {
  name: "find_user_cached",
  description: "Find a user by email (with caching)",
 
  input_schema: {
    type: "object",
    properties: {
      email: { type: "string" },
    },
  },
 
  execute: async (params) => {
    const cacheKey = `user:${params.email}`;
 
    // Check cache
    if (cacheStore.has(cacheKey)) {
      const { data, timestamp } = cacheStore.get(cacheKey);
      if (Date.now() - timestamp < cacheTTL) {
        return {
          success: true,
          user: data,
          source: "cache",
          cached_at: new Date(timestamp).toISOString(),
        };
      }
    }
 
    // Query database
    const user = await db.users.findOne({ email: params.email });
 
    if (!user) {
      return { found: false };
    }
 
    const userData = {
      id: user.id,
      name: user.name,
      email: user.email,
    };
 
    // Store in cache
    cacheStore.set(cacheKey, {
      data: userData,
      timestamp: Date.now(),
    });
 
    return {
      success: true,
      user: userData,
      source: "database",
    };
  },
};

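Caching removes the database round trip for repeated lookups within the TTL. For tools that read static or semi-static data, it is often the cheapest optimization available. The trade-off is that a cached result can be up to one TTL (here, one minute) out of date.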
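
A plain Map used as a cache grows without bound and never drops stale keys. As a rough sketch of one way to address both, here is a small TTL cache with a size cap and an injectable clock. `TtlCache` and its parameters are assumptions for illustration; the eviction policy is simply "oldest inserted first".

```typescript
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private maxEntries: number;
  private now: () => number;

  constructor(ttlMs: number, maxEntries = 1000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // drop stale entries instead of keeping them
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    // Re-inserting moves the key to the end of the Map's insertion order.
    this.entries.delete(key);
    if (this.entries.size >= this.maxEntries) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Passing the clock in as a function keeps the cache easy to test: a test can advance time by changing a variable instead of sleeping.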

Tool Composition: Building Complex Capabilities from Simple Tools

Instead of one big tool that does everything, compose simple tools together:

typescript
// Simple tools
const getTodaysTasks = {
  /* ... */
};
const createTask = {
  /* ... */
};
const updateTask = {
  /* ... */
};
const deleteTask = {
  /* ... */
};
 
// Claude can compose them
// User: "I have 3 tasks. Complete the first one and add a reminder for the second one."
// Claude uses getTodaysTasks, updateTask to mark complete, createTask to add reminder.
 
// This is more flexible than a single "manageTasks" tool
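
To make the idea tangible, here is a purely hypothetical sketch with an in-memory task list. The tool bodies above are elided in the original, so none of this is their real implementation, and `completeFirstAndRemindSecond` merely stands in for the sequence of calls Claude might choose on its own.

```typescript
// Hypothetical in-memory task store; real tools would call your task service.
interface Task {
  id: number;
  title: string;
  done: boolean;
}

function createTaskTools() {
  const tasks: Task[] = [];
  let nextId = 1;

  return {
    // Each function does exactly one thing and returns plain data.
    getTodaysTasks: () => ({ tasks: tasks.map((t) => ({ ...t })) }),

    createTask: (input: { title: string }) => {
      const task: Task = { id: nextId++, title: input.title, done: false };
      tasks.push(task);
      return { success: true, task: { ...task } };
    },

    updateTask: (input: { id: number; done: boolean }) => {
      const task = tasks.find((t) => t.id === input.id);
      if (!task) return { error: "Task not found", code: "NOT_FOUND" };
      task.done = input.done;
      return { success: true, task: { ...task } };
    },
  };
}

// One possible call sequence for "complete the first task and add a
// reminder for the second one": list, update, create.
function completeFirstAndRemindSecond(tools: ReturnType<typeof createTaskTools>) {
  const { tasks } = tools.getTodaysTasks();
  const first = tasks[0];
  const second = tasks[1];
  if (!first || !second) {
    return { error: "Need at least two tasks", code: "NOT_ENOUGH_TASKS" };
  }
  tools.updateTask({ id: first.id, done: true });
  return tools.createTask({ title: `Reminder: ${second.title}` });
}
```

None of the three tools knows about the others, yet together they cover a request that no single one of them was written for.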

Testing Your Tools

Before deploying tools to production agents, test them thoroughly:

typescript
// test-tools.ts
 
describe("Database Tools", () => {
  test("userLookupTool finds existing user", async () => {
    const result = await userLookupTool.execute({
      email: "test@example.com",
    });
 
    expect(result.found).toBe(true);
    expect(result.user.email).toBe("test@example.com");
  });
 
  test("userLookupTool returns error for invalid email", async () => {
    const result = await userLookupTool.execute({
      email: "not-an-email",
    });
 
    expect(result.error).toBeDefined();
    expect(result.code).toBe("INVALID_INPUT");
  });
 
  test("userLookupTool handles database errors gracefully", async () => {
    // Mock database failure (assumes the db module has been replaced with a Jest mock, e.g. via jest.mock)
    db.users.findOne.mockImplementation(() => {
      throw new Error("Connection lost");
    });
 
    const result = await userLookupTool.execute({
      email: "test@example.com",
    });
 
    expect(result.error).toBeDefined();
    expect(result.code).toBe("DB_ERROR");
  });
});

Test your tools independently before integrating them with agents. This catches bugs early.
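
If you'd rather not patch modules in tests, another option is to pass the database in as a dependency. The following is a minimal sketch under that assumption: `UserStore`, `makeUserLookupTool`, `fakeStore`, and `brokenStore` are hypothetical names, and the tool is a trimmed-down variant of the lookup tool from earlier, not a drop-in replacement.

```typescript
interface User {
  id: number;
  name: string;
  email: string;
}

// The only database capability this tool needs.
interface UserStore {
  findByEmail(email: string): Promise<User | null>;
}

// Build the tool around whatever store it is given.
function makeUserLookupTool(store: UserStore) {
  return {
    name: "find_user_by_email",
    execute: async (params: { email: string }) => {
      try {
        const user = await store.findByEmail(params.email);
        return user ? { found: true, user } : { found: false };
      } catch {
        return { error: "Database query failed", code: "DB_ERROR" };
      }
    },
  };
}

// Fakes for tests: one backed by an array, one that always fails.
const fakeStore = (users: User[]): UserStore => ({
  findByEmail: async (email) => users.find((u) => u.email === email) ?? null,
});

const brokenStore: UserStore = {
  findByEmail: async () => {
    throw new Error("Connection lost");
  },
};
```

In production you would pass a store backed by the real database; in tests, `fakeStore([...])` and `brokenStore` exercise the success, not-found, and failure paths without any mocking library.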

Monitoring Tool Usage

Track which tools your agents use and whether they're succeeding:

typescript
const toolMetrics = {
  calls_by_tool: new Map(),
  errors_by_tool: new Map(),
  average_execution_time: new Map(),
};
 
async function executeToolWithMetrics(tool, input) {
  const startTime = Date.now();
  let failed = false;

  try {
    const result = await tool.execute(input);

    // Tools in this article report failures as { error, code } objects
    // rather than throwing, so count those as errors too
    failed = Boolean(result && result.error);
    return result;
  } catch (error) {
    failed = true;
    throw error;
  } finally {
    // Track every call, successful or not
    const callCount = (toolMetrics.calls_by_tool.get(tool.name) || 0) + 1;
    toolMetrics.calls_by_tool.set(tool.name, callCount);

    if (failed) {
      const errorCount = toolMetrics.errors_by_tool.get(tool.name) || 0;
      toolMetrics.errors_by_tool.set(tool.name, errorCount + 1);
    }

    // Incremental mean: avg += (duration - avg) / n
    const duration = Date.now() - startTime;
    const avgTime = toolMetrics.average_execution_time.get(tool.name) || 0;
    toolMetrics.average_execution_time.set(
      tool.name,
      avgTime + (duration - avgTime) / callCount,
    );
  }
}

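Monitor your tools in production. These counters live in process memory, so export them to your monitoring system and alert on them; then when a tool starts failing or getting slow, you find out quickly.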
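
Collecting numbers is only half the job; something has to look at them. Here is a small sketch of a health check, assuming you track call count, error count, and total execution time per tool. The `ToolStats` and `Thresholds` shapes and the function name are illustrative assumptions, and the thresholds in the usage note are placeholders, not recommendations.

```typescript
interface ToolStats {
  calls: number;
  errors: number;
  totalMs: number;
}

interface Thresholds {
  maxErrorRate: number; // e.g. 0.1 means 10% of calls
  maxAvgMs: number;
}

// Return the names of tools whose error rate or mean latency exceeds a threshold.
function findUnhealthyTools(
  stats: Map<string, ToolStats>,
  limits: Thresholds,
): string[] {
  const unhealthy: string[] = [];

  stats.forEach((s, name) => {
    if (s.calls === 0) return; // no data yet
    const errorRate = s.errors / s.calls;
    const avgMs = s.totalMs / s.calls;
    if (errorRate > limits.maxErrorRate || avgMs > limits.maxAvgMs) {
      unhealthy.push(name);
    }
  });

  return unhealthy;
}
```

Run on a timer with something like `findUnhealthyTools(stats, { maxErrorRate: 0.1, maxAvgMs: 5000 })`, a non-empty result is the signal to raise an alert.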


-iNet
