
Here's the thing about AI agents: they're only as useful as the tools you give them. Claude Code's Agent SDK lets you do something powerful—define exactly what Claude can access, how it should access it, and what it should do with the results. No magic, no guessing. Just you, Claude, and the tools you decide to connect them with.
In this article, we're diving deep into custom tool definitions, the SDK's validation mechanisms, and how to build tools that make your agents genuinely productive. If you've ever wondered how to constrain Claude's access or extend its capabilities beyond the defaults, this is your guide.
Table of Contents
- Why Custom Tools Matter More Than You Think
- Anatomy of a Custom Tool Definition
- Building Safe, Focused Tools
- Example 1: Deployment Tool (Good vs. Bad)
- Example 2: Data Query Tool (Good vs. Bad)
- Registering Tools with Claude Code
- Handling Tool Errors Gracefully
- Tool Performance and Optimization
- Tool Composition: Building Complex Capabilities from Simple Tools
- Testing Your Tools
- Monitoring Tool Usage
Why Custom Tools Matter More Than You Think
Think about your typical workflow. You've got APIs that do important things. You've got file systems with sensitive data. You've got databases with business logic baked in. Claude's default toolset—Bash, Read, Write—gives you a foundation, but it's blunt. It's like handing someone a screwdriver when you really need a surgeon's scalpel.
This is where custom tools become transformative. The default tools are generic. They're powerful, but they require Claude to understand your system's internals. If you give Claude raw Bash access, it needs to understand your deployment infrastructure, your database connection strings, your API authentication schemes. You're pushing domain knowledge into Claude's context. You're asking Claude to operate at the wrong level of abstraction.
Custom tools solve this by raising the abstraction level. Instead of exposing raw Bash, expose a deploy_service tool that handles deployment details internally. Instead of exposing raw database access, expose domain-specific queries like find_user_by_email that enforce safety and performance constraints. Claude operates at the right level of abstraction. The tool handles the complexity.
The benefits cascade:
Safety: A custom tool can validate inputs before executing. It can log all invocations. It can reject dangerous operations.
Efficiency: Abstraction lets you optimize. A raw SELECT query might be slow, but a domain-specific find_user_by_email can use indexes and caching.
Clarity: Claude understands what the tool does without needing to know how. Less context needed for the same capability.
Auditability: Every custom tool call is logged with full context. You know exactly what Claude tried to do and whether it succeeded.
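These four benefits can all live in one place: a wrapper around the execute function. Here's a minimal sketch of that idea; the names (`withSafeguards`, `auditLog`) are our own illustration, not part of the SDK.

```typescript
// Hypothetical wrapper that adds validation, logging, and an audit trail
// around any tool's execute function. All names here are illustrative.
type ToolResult = Record<string, unknown>;

interface AuditEntry {
  tool: string;
  input: unknown;
  ok: boolean;
  at: string;
}

const auditLog: AuditEntry[] = [];

function withSafeguards(
  name: string,
  validate: (input: unknown) => string | null, // returns an error message, or null if valid
  execute: (input: any) => Promise<ToolResult>,
) {
  return async (input: unknown): Promise<ToolResult> => {
    const problem = validate(input);
    if (problem) {
      // Rejected before execution: the dangerous operation never runs
      auditLog.push({ tool: name, input, ok: false, at: new Date().toISOString() });
      return { error: problem, code: "INVALID_INPUT" };
    }
    const result = await execute(input);
    // Every invocation is recorded with full context
    auditLog.push({ tool: name, input, ok: !("error" in result), at: new Date().toISOString() });
    return result;
  };
}
```

Because the wrapper is generic, every tool you register gets the same validation and audit behavior without repeating it in each execute function.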
Anatomy of a Custom Tool Definition
A tool definition tells Claude: "Hey, I have this capability available. Here's what it does, what parameters it needs, and what it will return."
// Example: Database Query Tool
const userLookupTool = {
name: "find_user_by_email",
description:
"Find a user record by email address. Returns user ID, name, created_date.",
input_schema: {
type: "object",
properties: {
email: {
type: "string",
description: "Email address to look up (must be valid email format)",
},
},
required: ["email"],
},
execute: async (params: { email: string }) => {
// Validation
if (!isValidEmail(params.email)) {
return { error: "Invalid email format", code: "INVALID_INPUT" };
}
// Execute query
try {
const user = await db.users.findOne({ email: params.email });
if (!user) {
return { found: false, message: "User not found" };
}
// Return only safe fields
return {
found: true,
user: {
id: user.id,
name: user.name,
email: user.email,
created_date: user.created_date,
},
};
} catch (error) {
return { error: "Database query failed", code: "DB_ERROR" };
}
},
};
The tool has four essential parts:
- Name: How Claude refers to the tool
- Description: What the tool does (Claude reads this to understand when to use it)
- Input Schema: What parameters the tool accepts (JSON Schema format)
- Execute Function: The actual implementation
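One way to capture those four parts is a shared interface. This typing is our own convention for the article, not a type exported by the SDK, but it lets the compiler catch a missing field before Claude ever sees the tool.

```typescript
// Our own convention for typing a custom tool's four parts.
interface CustomTool<TInput = unknown, TOutput = unknown> {
  name: string;        // how Claude refers to the tool
  description: string; // read by Claude to decide when to use it
  input_schema: {      // JSON Schema for the parameters
    type: "object";
    properties: Record<string, unknown>;
    required?: string[];
  };
  execute: (params: TInput) => Promise<TOutput>; // the actual implementation
}

// Illustrative example: a trivial tool typed against the interface.
const pingTool: CustomTool<{ host: string }, { ok: boolean }> = {
  name: "ping",
  description: "Illustrative no-op tool for demonstrating the interface",
  input_schema: {
    type: "object",
    properties: { host: { type: "string" } },
    required: ["host"],
  },
  execute: async ({ host }) => ({ ok: host.length > 0 }),
};
```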
Building Safe, Focused Tools
The best custom tools are narrowly focused and heavily validated. Resist the temptation to build one huge tool that does everything. Build small tools that do one thing well.
Example 1: Deployment Tool (Good vs. Bad)
❌ BAD: Too Generic
const deployTool = {
name: "run_command",
description: "Run any shell command",
input_schema: {
type: "object",
properties: {
command: { type: "string" },
},
},
execute: async (params) => {
return exec(params.command);
},
};
This is dangerous. The tool is too general: nothing in its definition constrains what the command can do, so a single misunderstanding or context slip could lead Claude to run rm -rf / or worse.
✅ GOOD: Narrowly Focused, Validated
const deployServiceTool = {
name: "deploy_service",
description:
"Deploy a service to production. Validates configuration, runs health checks, rolls back on failure.",
input_schema: {
type: "object",
properties: {
service_name: {
type: "string",
description:
"Service to deploy (must be one of: api, auth, payments, notifications)",
enum: ["api", "auth", "payments", "notifications"],
},
version: {
type: "string",
description:
"Git tag or commit SHA to deploy (must match released version)",
},
environment: {
type: "string",
enum: ["staging", "production"],
description: "Deployment environment",
},
},
required: ["service_name", "version", "environment"],
},
execute: async (params) => {
// Whitelist known services
const validServices = ["api", "auth", "payments", "notifications"];
if (!validServices.includes(params.service_name)) {
return { error: "Unknown service", code: "INVALID_SERVICE" };
}
// Verify version is released
const releaseExists = await checkGitTag(params.version);
if (!releaseExists) {
return {
error: "Version not found in releases",
code: "UNKNOWN_VERSION",
};
}
// Protect production
if (params.environment === "production") {
// Additional safeguards for production
const approval = await requireApproval();
if (!approval) {
return {
error: "Production deployment requires approval",
code: "APPROVAL_REQUIRED",
};
}
}
// Execute deployment with safety checks
try {
const result = await deployService(
params.service_name,
params.version,
params.environment,
);
// Run health checks
const healthCheck = await runHealthChecks(params.service_name);
if (!healthCheck.passed) {
// Automatic rollback on health check failure
await rollback(params.service_name, params.environment);
return {
error: "Deployment health check failed, rolled back",
code: "HEALTH_CHECK_FAILED",
details: healthCheck.failures,
};
}
return {
success: true,
message: `Successfully deployed ${params.service_name} v${params.version} to ${params.environment}`,
deployment_id: result.id,
timestamp: new Date().toISOString(),
};
} catch (error) {
return {
error: "Deployment failed",
code: "DEPLOYMENT_ERROR",
details: error.message,
};
}
},
};
This tool is focused. It does one thing: deploy a service safely. It validates inputs. It prevents mistakes. Claude can't accidentally deploy the wrong service or version. The tool enforces constraints.
Example 2: Data Query Tool (Good vs. Bad)
❌ BAD: Too Permissive
const queryTool = {
name: "query_database",
description: "Execute any SQL query against the database",
input_schema: {
type: "object",
properties: {
sql: { type: "string", description: "SQL query to execute" },
},
},
execute: async (params) => {
return db.query(params.sql);
},
};
Claude could delete all your data, construct destructive queries by accident, or leak sensitive information.
✅ GOOD: Constrained, Safe
const userQueryTool = {
name: "find_users",
description:
"Find users by criteria. Returns ID, name, email, and join date.",
input_schema: {
type: "object",
properties: {
criteria: {
type: "object",
properties: {
email: {
type: "string",
description: "Exact email match (optional)",
},
name_contains: {
type: "string",
description: "Partial name match (optional)",
},
created_after: {
type: "string",
description: "ISO date string (optional)",
},
limit: {
type: "integer",
minimum: 1,
maximum: 100,
description: "Max results (default 10)",
},
},
},
},
},
execute: async (params) => {
const criteria = params.criteria || {};
// Build safe query using ORM
let query = db.users.find();
if (criteria.email) {
query = query.where("email", "=", criteria.email);
}
if (criteria.name_contains) {
// The ORM binds this value as a parameter, preventing SQL injection
query = query.where("name", "LIKE", `%${criteria.name_contains}%`);
}
if (criteria.created_after) {
query = query.where("created_at", ">=", new Date(criteria.created_after));
}
// Enforce limit
const limit = criteria.limit || 10;
query = query.limit(Math.min(limit, 100));
// Execute and return safe fields only
const users = await query.select("id", "name", "email", "created_at");
return {
found: users.length,
users: users.map((u) => ({
id: u.id,
name: u.name,
email: u.email,
created_date: u.created_at.toISOString(),
})),
};
},
};
This tool is safe. Claude can only query by whitelisted fields. Query values are bound as parameters by the ORM (preventing SQL injection). The result limit is capped at 100. Only safe fields are returned. Claude has the capability it needs without dangerous flexibility.
Registering Tools with Claude Code
Once you've defined tools, register them with your agent:
// agent-sdk-setup.ts
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();
const tools = [
userLookupTool,
deployServiceTool,
userQueryTool,
// ... more custom tools
];
async function runAgent(userRequest: string) {
const messages: Anthropic.MessageParam[] = [
{
role: "user",
content: userRequest,
},
];
let response = await client.messages.create({
model: "claude-3-5-sonnet-20241022",
max_tokens: 1024,
system: "You are a helpful assistant with access to custom tools.",
tools: tools.map(({ execute, ...definition }) => definition), // API accepts only name/description/input_schema
messages: messages,
});
// Process tool calls
while (response.stop_reason === "tool_use") {
const toolUseBlock = response.content.find(
(block) => block.type === "tool_use",
);
if (!toolUseBlock || toolUseBlock.type !== "tool_use") break;
// Find the tool definition
const tool = tools.find((t) => t.name === toolUseBlock.name);
if (!tool) {
console.error(`Tool not found: ${toolUseBlock.name}`);
break;
}
// Execute the tool
const toolResult = await tool.execute(toolUseBlock.input);
// Add assistant message and tool result
messages.push({
role: "assistant",
content: response.content,
});
messages.push({
role: "user",
content: [
{
type: "tool_result",
tool_use_id: toolUseBlock.id,
content: JSON.stringify(toolResult),
},
],
});
// Get next response
response = await client.messages.create({
model: "claude-3-5-sonnet-20241022",
max_tokens: 1024,
system: "You are a helpful assistant with access to custom tools.",
tools: tools.map(({ execute, ...definition }) => definition), // API accepts only name/description/input_schema
messages: messages,
});
}
// Extract final text response
const textBlock = response.content.find((block) => block.type === "text");
if (textBlock && textBlock.type === "text") {
console.log("Assistant:", textBlock.text);
}
}
// Usage
runAgent("Find the user with email alice@example.com");
Expected output:
Tool calls made: find_users with criteria {email: "alice@example.com"}
Tool result: { found: 1, users: [{ id: 123, name: "Alice", email: "alice@example.com", created_date: "2024-01-15T..." }] }
Assistant: I found the user Alice (ID: 123). She created her account on January 15, 2024.
Handling Tool Errors Gracefully
Tools will fail. Network timeouts, database errors, validation failures. How you handle these failures determines whether the agent can recover or gets stuck.
const resilientTool = {
name: "fetch_external_data",
description: "Fetch data from external API with automatic retries",
input_schema: {
type: "object",
properties: {
endpoint: { type: "string", description: "API endpoint" },
timeout_seconds: { type: "integer", minimum: 1, maximum: 30 },
},
},
execute: async (params) => {
const maxRetries = 3;
let lastError;
for (let attempt = 1; attempt <= maxRetries; attempt++) {
try {
const response = await fetch(params.endpoint, {
// fetch has no `timeout` option; AbortSignal.timeout cancels the request instead
signal: AbortSignal.timeout((params.timeout_seconds || 10) * 1000),
headers: { "User-Agent": "Claude Agent" },
});
if (!response.ok) {
throw new Error(`HTTP ${response.status}: ${response.statusText}`);
}
const data = await response.json();
return {
success: true,
data: data,
retrieved_at: new Date().toISOString(),
};
} catch (error) {
lastError = error;
// Don't retry on client errors (4xx)
if (error.message.includes("HTTP 4")) {
return {
error: "Client error",
code: "INVALID_REQUEST",
message: error.message,
};
}
// Retry on server errors and network timeouts
if (attempt < maxRetries) {
const waitMs = Math.pow(2, attempt) * 1000; // Exponential backoff
await new Promise((resolve) => setTimeout(resolve, waitMs));
continue;
}
}
}
return {
error: "Request failed after retries",
code: "FETCH_FAILED",
last_error: lastError.message,
retries_attempted: maxRetries,
};
},
};When tools fail, return structured error information. Claude can understand the failure and decide what to do—retry, use a different tool, ask for help, etc.
Tool Performance and Optimization
As your agents use tools more, performance becomes critical. A tool that takes 30 seconds to execute will slow down your agent dramatically.
// Cache expensive results
const cacheStore = new Map();
const cacheTTL = 60000; // 1 minute
const userLookupWithCache = {
name: "find_user_cached",
description: "Find a user by email (with caching)",
input_schema: {
type: "object",
properties: {
email: { type: "string" },
},
},
execute: async (params) => {
const cacheKey = `user:${params.email}`;
// Check cache
if (cacheStore.has(cacheKey)) {
const { data, timestamp } = cacheStore.get(cacheKey);
if (Date.now() - timestamp < cacheTTL) {
return {
success: true,
user: data,
source: "cache",
cached_at: new Date(timestamp).toISOString(),
};
}
}
// Query database
const user = await db.users.findOne({ email: params.email });
if (!user) {
return { found: false };
}
const userData = {
id: user.id,
name: user.name,
email: user.email,
};
// Store in cache
cacheStore.set(cacheKey, {
data: userData,
timestamp: Date.now(),
});
return {
success: true,
user: userData,
source: "database",
};
},
};Caching reduces latency dramatically. For tools that read static or semi-static data, caching is essential.
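Rather than hand-rolling the cache inside each tool as above, the pattern can be factored into a generic wrapper. This is our own sketch, assuming a single-argument async function and an in-process Map; a production version might use Redis or bound the map's size.

```typescript
// Generic memoizing wrapper (illustrative) for any single-argument async function.
function withCache<A, R>(
  fn: (arg: A) => Promise<R>,
  keyOf: (arg: A) => string, // how to derive a cache key from the argument
  ttlMs: number,
) {
  const store = new Map<string, { value: R; at: number }>();
  return async (arg: A): Promise<R> => {
    const key = keyOf(arg);
    const hit = store.get(key);
    if (hit && Date.now() - hit.at < ttlMs) {
      return hit.value; // fresh: serve from cache without touching the backend
    }
    const value = await fn(arg); // stale or missing: recompute and store
    store.set(key, { value, at: Date.now() });
    return value;
  };
}
```

Any tool's execute function can then be wrapped once, keeping the caching concern out of the tool's own logic.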
Tool Composition: Building Complex Capabilities from Simple Tools
Instead of one big tool that does everything, compose simple tools together:
// Simple tools
const getTodaysTasks = {
/* ... */
};
const createTask = {
/* ... */
};
const updateTask = {
/* ... */
};
const deleteTask = {
/* ... */
};
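To make the stubs above concrete, here's a minimal in-memory sketch of the four task tools. The store and field names are illustrative, not a real task API.

```typescript
// Minimal in-memory implementations of the four task tools (illustrative).
interface Task {
  id: number;
  title: string;
  done: boolean;
}

const taskStore: Task[] = [];
let nextId = 1;

const taskTools = {
  getTodaysTasks: async () => ({ tasks: taskStore }),
  createTask: async (p: { title: string }) => {
    const task: Task = { id: nextId++, title: p.title, done: false };
    taskStore.push(task);
    return { created: task };
  },
  updateTask: async (p: { id: number; done: boolean }) => {
    const task = taskStore.find((t) => t.id === p.id);
    if (!task) return { error: "Task not found", code: "NOT_FOUND" };
    task.done = p.done;
    return { updated: task };
  },
  deleteTask: async (p: { id: number }) => {
    const index = taskStore.findIndex((t) => t.id === p.id);
    if (index === -1) return { error: "Task not found", code: "NOT_FOUND" };
    taskStore.splice(index, 1);
    return { deleted: p.id };
  },
};
```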
Claude can compose them. Given "I have 3 tasks. Complete the first one and add a reminder for the second one," Claude calls getTodaysTasks, uses updateTask to mark the first task complete, and uses createTask to add the reminder. This is more flexible than a single "manageTasks" tool.
Testing Your Tools
Before deploying tools to production agents, test them thoroughly:
// test-tools.ts
describe("Database Tools", () => {
test("userLookupTool finds existing user", async () => {
const result = await userLookupTool.execute({
email: "test@example.com",
});
expect(result.found).toBe(true);
expect(result.user.email).toBe("test@example.com");
});
test("userLookupTool returns error for invalid email", async () => {
const result = await userLookupTool.execute({
email: "not-an-email",
});
expect(result.error).toBeDefined();
expect(result.code).toBe("INVALID_INPUT");
});
test("userLookupTool handles database errors gracefully", async () => {
// Mock database failure
db.users.findOne.mockImplementation(() => {
throw new Error("Connection lost");
});
const result = await userLookupTool.execute({
email: "test@example.com",
});
expect(result.error).toBeDefined();
expect(result.code).toBe("DB_ERROR");
});
});
Test your tools independently before integrating them with agents. This catches bugs early.
Monitoring Tool Usage
Track which tools your agents use and whether they're succeeding:
const toolMetrics = {
calls_by_tool: new Map(),
errors_by_tool: new Map(),
average_execution_time: new Map(),
};
async function executeToolWithMetrics(tool, input) {
const startTime = Date.now();
try {
const result = await tool.execute(input);
// Track success
const callCount = toolMetrics.calls_by_tool.get(tool.name) || 0;
toolMetrics.calls_by_tool.set(tool.name, callCount + 1);
// Track timing as a running mean across all calls
const duration = Date.now() - startTime;
const prevAvg = toolMetrics.average_execution_time.get(tool.name) || 0;
toolMetrics.average_execution_time.set(
tool.name,
prevAvg + (duration - prevAvg) / (callCount + 1), // callCount was read before this call was counted
);
return result;
} catch (error) {
// Track error
const errorCount = toolMetrics.errors_by_tool.get(tool.name) || 0;
toolMetrics.errors_by_tool.set(tool.name, errorCount + 1);
throw error;
}
}
Monitor your tools in production. When a tool starts failing or getting slow, you'll know immediately.
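Those metric maps are only useful if something reads them. Here's a small reporter, our own addition, that turns them into a sorted summary suitable for a periodic log line or a dashboard endpoint.

```typescript
// Summarize the metric maps above into a report, worst offenders first.
interface ToolReport {
  tool: string;
  calls: number;
  errors: number;
  avgMs: number;
}

function summarizeMetrics(
  calls: Map<string, number>,
  errors: Map<string, number>,
  avgMs: Map<string, number>,
): ToolReport[] {
  return [...calls.entries()]
    .map(([tool, n]) => ({
      tool,
      calls: n,
      errors: errors.get(tool) ?? 0,
      avgMs: Math.round(avgMs.get(tool) ?? 0),
    }))
    // Sort by error count, then by call volume, so problems surface first
    .sort((a, b) => b.errors - a.errors || b.calls - a.calls);
}
```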