Building a Full Governance Layer for Claude Code

You're running Claude Code at scale in your organization. Multiple teams are using agents. Some are building databases. Others are modifying critical infrastructure. You need visibility into what's happening, control over what's allowed, and audit trails for compliance.
But right now, you have no governance. Any agent can run any command. Delete production databases? Sure. Commit directly to main? No problem. Expose API keys in logs? Go right ahead. There's no policy enforcement, no approval gates, no compliance tracking.
This is where governance layers come in. A well-designed governance system sits between your agents and the tools they use, enforcing policies automatically. It logs everything. It blocks dangerous actions. It asks for approval when needed. It generates compliance reports.
In this article, we're building a production-grade governance layer from scratch. Not just one hook, but an integrated system of multiple hooks, a central policy engine, approval gates, and compliance tracking. This is the architecture real organizations use when they can't afford mistakes.
Table of Contents
- The Governance Layers
- Why Governance Matters at Scale
- The Cost of Not Having Governance
- Layer 1: Central Policy Configuration
- Layer 2: The PreToolUse Governance Hook
- Layer 3: Approval Processing
- Layer 4 & 5: Compliance Dashboard and Reporting
- Common Governance Mistakes
- Deploying Governance in Stages
- Real-World Scenarios: Where Governance Saves You
- Building Governance Culture
- Integration with Existing Tools
- Monitoring and Evolving Governance
- Common Governance Mistakes and How to Avoid Them
- Real-World Scenario: The Ransomware Defense
- Audit Logging in Detail
- Designing Approval Workflows
- Integrating Governance with Your Existing Tools
- The Psychology of Governance
- The Hidden Why
The Governance Layers
Think of governance as concentric rings. Each layer adds a new capability:
- PreToolUse Hooks: Intercept before any tool executes. Log everything. Block dangerous patterns. Request approval.
- Policy Engine: Central config that defines what's allowed, what's forbidden, and what needs approval.
- Approval Queue: For high-risk actions, route to human approvers. Wait for sign-off.
- Audit Logging: Every decision, every block, every approval. Write to immutable logs.
- Compliance Dashboard: Query the audit logs. Generate reports. Track compliance over time.
- Alert System: When policy violations occur, notify security/ops teams in real-time.
These layers work together to create a complete governance system. The policy engine defines the rules. The hooks enforce the rules. The approval queue handles decisions that need humans. Audit logging creates the historical record. The dashboard visualizes that history. And alerts notify stakeholders in real-time when things go wrong.
The beauty of layered governance is that you don't need all six layers immediately. You can start with hooks and policy, then add approval queue, then add dashboards. Each layer provides additional capability without the previous layers becoming obsolete. It's additive security, not rip-and-replace.
Let's build each layer.
Why Governance Matters at Scale
When you're operating with dozens of Claude Code agents across your organization, governance isn't optional—it's existential. Without it, you're trusting that everyone makes good decisions, that no one accidentally runs destructive commands, that credentials never leak. That's a bet you'll lose.
A governance layer doesn't eliminate risk; it makes risk visible and preventable. It surfaces dangerous patterns before they cause damage. It creates audit trails that prove compliance to external auditors. Most importantly, it shifts the burden from human vigilance to automated enforcement.
Think about what happens without governance: An agent is prompted to "optimize the database." It reads an outdated runbook mentioning a dangerous optimization technique. It runs DROP TABLE temp_data; because that's what the runbook says. No confirmation. No logging. No warning. The temp data was actually critical. You've lost hours of work.
With governance, the DROP TABLE command would be caught immediately. It would require approval. You'd get a notification. An approver could say "wait, that's wrong" and block it before any damage occurred.
Real organizations implement governance because they've learned the hard way that vigilance alone doesn't protect critical infrastructure. A single developer, tired at 11 PM, runs a query they meant to test in staging against production instead. Without governance, a mistake like that can mean an outage or lost data. With it, the governance layer matches the destructive command against policy and asks for explicit confirmation before proceeding.
The Cost of Not Having Governance
Let's be concrete about what can go wrong without governance. An agent with write access to your infrastructure repository accidentally modifies a Terraform file, changing a load balancer configuration. This change gets committed and deployed. Your API endpoints are now unreachable for forty minutes. You lose revenue. Customers are angry. The post-mortem reveals the agent did exactly what it was asked to do, but nobody thought to verify the request came from a trusted source.
Another scenario: an agent is instructed to "delete old test data." It interprets this broadly and deletes the test_users table, but that table is actually used by your acceptance tests in production. Your pipeline breaks. Deployments are blocked for three hours. Product launches are delayed.
Or consider this: an attacker compromises a developer's machine. The attacker tells Claude Code to add a hidden data exfiltration script. The agent generates the code and commits it. Without governance logging every action, the breach goes undetected for weeks. By the time you notice unusual database queries, millions of user records have been extracted.
These aren't hypothetical. They're scenarios that have happened at real organizations without proper governance. The cost isn't just the immediate impact (downtime, data loss, security incident). It's the legal liability, the customer trust erosion, and the months spent in incident response and forensics. Governance reduces that exposure with guardrails that stop accidents before they execute and audit trails that surface attacks early.
Understanding governance is also about understanding organizational risk tolerance. Your organization has a risk budget, and governance is how you spend it wisely. By implementing governance, you're saying "we allow agents to do these things because we've evaluated the risk and determined it's acceptable." What you block is as important as what you allow. Blocking direct commits to main isn't because you hate convenience; it's because accidental deploys of untested code cost more than the friction of requiring a PR. It's a deliberate tradeoff: friction now prevents catastrophe later.
The other crucial insight is that governance is a forcing function for clarity. When you write out "agents can run git log but not git push," you're forced to be explicit about what automation is allowed. This clarity ripples through your organization. Engineers know the boundaries. Agents know the boundaries. Security auditors can point to the policy and verify compliance. This clarity prevents the ambiguous situations where nobody knows if something is allowed.
Layer 1: Central Policy Configuration
Start with a JSON policy file that defines your governance rules. This is your source of truth for what's allowed, blocked, or requires approval. Everything else flows from this:
// .claude/governance/policies.json
{
"version": "1.0",
"policies": {
"bash_commands": {
"blocked_patterns": [
"rm -rf /",
"DROP TABLE",
"DELETE FROM.*WHERE",
"chmod 777"
],
"allowed_patterns": [
"git log",
"npm test",
"npm run build"
],
"requires_approval": [
"git push.*main",
"git push.*--force",
"npm publish",
"docker push"
],
"max_runtime_seconds": 300
},
"file_operations": {
"protected_paths": [
".env*",
"secrets.json",
"credentials.yaml",
"*.pem",
"*.key"
],
"write_protected_paths": [
".git/**",
".claude/governance/**",
"package-lock.json"
],
"delete_requires_approval": true
},
"git_operations": {
"blocked_branches": ["main", "production", "staging"],
"requires_approval_branches": ["main", "production"],
"require_pr_for_main": true,
"require_code_review": true,
"block_force_push": true
},
"database_operations": {
"blocked_operations": [
"DROP TABLE",
"TRUNCATE TABLE",
"DELETE FROM",
"ALTER TABLE DROP"
],
"readonly_schemas": ["audit_log", "system"],
"requires_approval": ["CREATE TABLE", "ALTER TABLE"]
},
"ai_operations": {
"max_cost_per_action": 0.10,
"max_context_tokens": 180000,
"blocked_models": [],
"requires_approval_models": ["claude-3-5-sonnet"]
}
},
"approval_rules": {
"high_risk_actions": {
"threshold": "high",
"approvers": ["team-lead", "security-lead"],
"timeout_minutes": 30,
"require_mfa": true
},
"medium_risk_actions": {
"threshold": "medium",
"approvers": ["team-lead"],
"timeout_minutes": 60,
"require_mfa": false
}
},
"audit": {
"log_all_actions": true,
"log_blocked_actions": true,
"log_approvals": true,
"retention_days": 365
}
}
This configuration becomes the single source of truth that all governance components reference. It's intentionally structured to be human-readable, versioned in git, and auditable. When you need to tighten or relax policies, you modify this one file. No hidden magic. Everything is explicit.
Notice the structure: bash commands have blocked patterns, allowed patterns, and approval-requiring patterns. File operations protect sensitive paths. Git operations prevent direct pushes to production branches. Database operations block destructive commands. Each section is specific to a domain, making policies easier to reason about. Each bash pattern string is a regular expression that the hook matches case-insensitively, so "DROP TABLE" also catches "drop table".
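Because the bash patterns are compiled as regular expressions, a typo in the policy file can quietly disable a rule. One way to guard against that is a load-time check. The sketch below is a minimal illustration, not part of Claude Code: `validatePolicy` and its error strings are hypothetical names, and it only inspects the `bash_commands` section of the shape shown above.

```javascript
// Hypothetical load-time validator for the policy shape shown above.
// Returns a list of human-readable problems; an empty list means "looks sane".
function validatePolicy(policy) {
  const errors = [];
  const bash = policy?.policies?.bash_commands;
  if (!bash) {
    errors.push("missing policies.bash_commands");
    return errors;
  }
  for (const field of ["blocked_patterns", "allowed_patterns", "requires_approval"]) {
    const patterns = bash[field];
    if (!Array.isArray(patterns)) {
      errors.push(`bash_commands.${field} must be an array`);
      continue;
    }
    for (const pattern of patterns) {
      try {
        new RegExp(pattern, "i"); // same flag the hook uses when matching
      } catch (e) {
        errors.push(`bash_commands.${field}: invalid regex "${pattern}"`);
      }
    }
  }
  return errors;
}

const samplePolicy = {
  policies: {
    bash_commands: {
      blocked_patterns: ["DROP TABLE", "rm -rf ("], // second entry is deliberately broken
      allowed_patterns: ["git log"],
      requires_approval: ["npm publish"],
    },
  },
};
const policyErrors = validatePolicy(samplePolicy);
console.log(policyErrors);
```

Running a check like this in CI on every change to policies.json means a broken pattern is rejected at review time rather than discovered after it failed to block something.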
Layer 2: The PreToolUse Governance Hook
The hook reads the policy and enforces it. This is the enforcement layer that intercepts every tool use before it happens. It's the gatekeeper:
// .claude/hooks/pre-tool-use-governance.js
const fs = require("fs");
const path = require("path");
const { v4: uuidv4 } = require("uuid");
const POLICIES_PATH = path.join(__dirname, "../governance/policies.json");
const AUDIT_LOG_PATH = path.join(__dirname, "../governance/audit.log");
const APPROVAL_QUEUE_PATH = path.join(
__dirname,
"../governance/approval-queue.json",
);
let policies = {};
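// Load policies on every invocation; fail closed if the file is missing
function loadPolicies() {
  if (!fs.existsSync(POLICIES_PATH)) {
    throw new Error(`Governance policy file not found: ${POLICIES_PATH}`);
  }
  policies = JSON.parse(fs.readFileSync(POLICIES_PATH, "utf8"));
}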
module.exports = async (hookContext) => {
loadPolicies();
const { tool, toolInput, userId, timestamp } = hookContext;
const action = {
id: uuidv4(),
tool,
toolInput,
userId,
timestamp,
status: "pending",
};
// Route based on tool type
if (tool === "bash") {
return governBashCommand(action, hookContext);
} else if (
tool === "readFile" ||
tool === "writeFile" ||
tool === "deleteFile"
) {
return governFileOperation(action, hookContext);
} else if (tool === "git") {
return governGitOperation(action, hookContext);
} else if (tool === "submitPrompt" && isAIOperation(toolInput)) {
return governAIOperation(action, hookContext);
}
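  // Unknown tools pass through, but are still recorded in the audit log
  auditLog(action, "ALLOWED", `No policy defined for tool: ${tool}`);
  return {
    toolInput,
    allowExecution: true,
  };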
};
function governBashCommand(action, hookContext) {
const { toolInput } = hookContext;
const command = toolInput.command || "";
const policy = policies.policies.bash_commands;
// Check for blocked patterns
for (const pattern of policy.blocked_patterns || []) {
if (matchesPattern(command, pattern)) {
auditLog(action, "BLOCKED", `Matches blocked pattern: ${pattern}`);
return {
toolInput,
allowExecution: false,
logMessage: `Command blocked: matches dangerous pattern "${pattern}"`,
};
}
}
// Check for approval requirements
for (const pattern of policy.requires_approval || []) {
if (matchesPattern(command, pattern)) {
const approvalId = requestApproval(
action,
`Bash command requires approval: ${command}`,
);
auditLog(action, "PENDING_APPROVAL", `Approval ID: ${approvalId}`);
return {
toolInput,
allowExecution: false,
logMessage: `Approval required. Approval ID: ${approvalId}`,
requiresApproval: true,
approvalId,
};
}
}
// Allowed
auditLog(action, "ALLOWED", "Matches allowed patterns or no restrictions");
return {
toolInput,
allowExecution: true,
logMessage: "Bash command allowed",
};
}
function governFileOperation(action, hookContext) {
const { tool, toolInput } = hookContext;
const filePath = toolInput.path || "";
const policy = policies.policies.file_operations;
// Check if file is protected
if (isProtectedPath(filePath, policy.protected_paths)) {
auditLog(action, "BLOCKED", `Accessing protected file: ${filePath}`);
return {
toolInput,
allowExecution: false,
logMessage: `Cannot access protected file: ${filePath}`,
};
}
// Check if write is to protected path
if (
(tool === "writeFile" || tool === "deleteFile") &&
isProtectedPath(filePath, policy.write_protected_paths)
) {
const approvalId = requestApproval(
action,
`Write to protected path: ${filePath}`,
);
auditLog(
action,
"PENDING_APPROVAL",
`File write requires approval: ${approvalId}`,
);
return {
toolInput,
allowExecution: false,
logMessage: `Approval required for write to: ${filePath}`,
requiresApproval: true,
approvalId,
};
}
// Delete requires approval
if (tool === "deleteFile" && policy.delete_requires_approval) {
const approvalId = requestApproval(action, `File deletion: ${filePath}`);
auditLog(
action,
"PENDING_APPROVAL",
`Delete requires approval: ${approvalId}`,
);
return {
toolInput,
allowExecution: false,
logMessage: `Approval required for deletion`,
requiresApproval: true,
approvalId,
};
}
auditLog(action, "ALLOWED", `File operation allowed: ${tool}`);
return {
toolInput,
allowExecution: true,
logMessage: "File operation allowed",
};
}
function governGitOperation(action, hookContext) {
const { toolInput } = hookContext;
const branch = toolInput.branch || "";
const operation = toolInput.operation || "";
const policy = policies.policies.git_operations;
// Block pushes to protected branches
if (operation === "push" && policy.blocked_branches.includes(branch)) {
auditLog(action, "BLOCKED", `Push to protected branch blocked: ${branch}`);
return {
toolInput,
allowExecution: false,
logMessage: `Cannot push directly to ${branch}. Create a PR instead.`,
};
}
// Require approval for pushes to main
if (
operation === "push" &&
policy.requires_approval_branches.includes(branch)
) {
const approvalId = requestApproval(
action,
`Push to ${branch} requires approval`,
);
auditLog(action, "PENDING_APPROVAL", `Git push approval: ${approvalId}`);
return {
toolInput,
allowExecution: false,
logMessage: `Approval required for push to ${branch}`,
requiresApproval: true,
approvalId,
};
}
// Block force push
if (
operation === "push" &&
(toolInput.force || toolInput.forceWithLease) &&
policy.block_force_push
) {
auditLog(action, "BLOCKED", "Force push blocked by policy");
return {
toolInput,
allowExecution: false,
logMessage: "Force push is not allowed",
};
}
auditLog(action, "ALLOWED", `Git operation allowed: ${operation}`);
return {
toolInput,
allowExecution: true,
logMessage: "Git operation allowed",
};
}
function governAIOperation(action, hookContext) {
const { toolInput } = hookContext;
const aiPolicy = policies.policies.ai_operations;
// Check context size
const contextTokens = estimateTokens(toolInput);
if (contextTokens > aiPolicy.max_context_tokens) {
auditLog(
action,
"BLOCKED",
`Context exceeds limit: ${contextTokens} > ${aiPolicy.max_context_tokens}`,
);
return {
toolInput,
allowExecution: false,
logMessage: `Context too large: ${contextTokens} tokens`,
};
}
auditLog(action, "ALLOWED", "AI operation within limits");
return {
toolInput,
allowExecution: true,
logMessage: "AI operation allowed",
};
}
// Helper functions
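function matchesPattern(text, pattern) {
  try {
    return new RegExp(pattern, "i").test(text);
  } catch (e) {
    // Invalid regex: fall back to a literal, case-insensitive match so a typo
    // in the policy file cannot silently disable a rule
    return text.toLowerCase().includes(pattern.toLowerCase());
  }
}
function isProtectedPath(filePath, protectedPaths = []) {
  // Normalize separators so Windows-style paths match the same globs
  const normalized = filePath.replace(/\\/g, "/");
  return protectedPaths.some((pattern) => {
    if (!pattern.includes("*")) return normalized.includes(pattern);
    // Escape regex metacharacters (so "." is literal), then turn "*" into ".*"
    const body = pattern
      .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
      .replace(/\*+/g, ".*");
    // Match at the start of the path or at any directory boundary,
    // so ".env*" also protects "/home/dev/project/.env.local"
    return new RegExp("(^|/)" + body + "$").test(normalized);
  });
}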
function requestApproval(action, reason) {
const approvalId = uuidv4();
const approval = {
id: approvalId,
action,
reason,
status: "pending",
createdAt: new Date().toISOString(),
expiresAt: new Date(Date.now() + 30 * 60 * 1000).toISOString(), // 30 min timeout
};
// Read existing queue
let queue = [];
if (fs.existsSync(APPROVAL_QUEUE_PATH)) {
queue = JSON.parse(fs.readFileSync(APPROVAL_QUEUE_PATH, "utf8"));
}
// Add to queue
queue.push(approval);
fs.writeFileSync(APPROVAL_QUEUE_PATH, JSON.stringify(queue, null, 2));
// Notify approvers (webhook, email, Slack, etc.)
notifyApprovers(approval);
return approvalId;
}
function notifyApprovers(approval) {
// TODO: Integrate with Slack, email, or internal approval system
console.log(`[APPROVAL-REQUIRED] ${approval.reason}`);
console.log(`Approval ID: ${approval.id}`);
console.log(`Expires: ${approval.expiresAt}`);
}
function auditLog(action, status, reason) {
const logEntry = {
timestamp: new Date().toISOString(),
actionId: action.id,
tool: action.tool,
userId: action.userId,
status,
reason,
toolInput: sanitizeForLog(action.toolInput),
};
const logLine = JSON.stringify(logEntry) + "\n";
fs.appendFileSync(AUDIT_LOG_PATH, logLine);
}
function sanitizeForLog(input) {
const sanitized = JSON.parse(JSON.stringify(input));
  // Compared against lowercased key names, so every entry must be lowercase
  const sensitiveKeys = ["password", "token", "secret", "apikey", "key"];
const redact = (obj) => {
for (const key of Object.keys(obj)) {
if (sensitiveKeys.some((k) => key.toLowerCase().includes(k))) {
obj[key] = "[REDACTED]";
} else if (typeof obj[key] === "object" && obj[key]) {
redact(obj[key]);
}
}
};
redact(sanitized);
return sanitized;
}
function estimateTokens(input) {
// Rough estimate: 1 token per 4 characters
const text = JSON.stringify(input);
return Math.ceil(text.length / 4);
}
function isAIOperation(toolInput) {
return toolInput.prompt !== undefined || toolInput.messages !== undefined;
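}
This hook does critical work: it reads your policy, routes based on tool type, checks against blocked and approval-required patterns, requests approvals when needed, and logs each block, approval request, and allow decision to an append-only audit file. Notice how it redacts sensitive fields before logging: any property whose name contains password, token, secret, or key is replaced with [REDACTED]. The redaction is key-based, so a secret embedded in a command string is still logged as-is. Two other details of the code as written are worth knowing. First, allowed_patterns is never consulted, so any command that matches neither a blocked nor an approval pattern is allowed. Second, blocked_branches is checked before requires_approval_branches, so a branch listed in both (main and production in the sample policy) is always blocked and never reaches the approval queue.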
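The `hookContext` object and the `allowExecution` return value used above are this article's own convention. As far as the Claude Code hooks documentation describes it, a PreToolUse hook is an external command that receives a JSON payload on stdin (with fields such as `tool_name` and `tool_input`) and can block the call by exiting with code 2 and writing a reason to stderr. Treat those details as assumptions to verify against the documentation for your version. Under those assumptions, a thin adapter might look like the sketch below; `toHookContext`, `toExit`, and the alias table are hypothetical names introduced for illustration.

```javascript
// Hypothetical adapter between an external hook payload and the
// { tool, toolInput, userId, timestamp } shape used by the governance module.
// The field names (tool_name, tool_input, session_id) and the tool aliases are
// assumptions; check them against the hooks reference before relying on them.
const TOOL_ALIASES = { Bash: "bash", Read: "readFile", Write: "writeFile" };

function toHookContext(payload, now = new Date()) {
  return {
    tool: TOOL_ALIASES[payload.tool_name] || payload.tool_name,
    toolInput: payload.tool_input || {},
    userId: payload.session_id || "unknown", // stand-in: no user id is assumed to exist
    timestamp: now.toISOString(),
  };
}

// Translate the module's decision into a process exit code and stderr text.
// Exit code 2 meaning "block" is the assumed convention; 0 lets the call proceed.
function toExit(decision) {
  if (decision.allowExecution) return { code: 0, stderr: "" };
  return { code: 2, stderr: decision.logMessage || "Blocked by governance policy" };
}

const ctx = toHookContext(
  { tool_name: "Bash", tool_input: { command: "git push origin main" }, session_id: "s-1" },
  new Date("2024-01-01T00:00:00Z")
);
const exit = toExit({ allowExecution: false, logMessage: "Approval required" });
console.log(ctx.tool, exit.code, exit.stderr);
```

Keeping the translation in two small pure functions means the policy logic never needs to know how the host delivers input or expects a verdict.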
Layer 3: Approval Processing
You need a way to actually approve (or deny) pending actions. Build an approval API that your authorized team members can use:
// .claude/governance/approval-processor.js
const fs = require("fs");
const path = require("path");
const APPROVAL_QUEUE_PATH = path.join(__dirname, "approval-queue.json");
const APPROVAL_HISTORY_PATH = path.join(__dirname, "approval-history.json");
function approveAction(approvalId, approverId) {
const queue = loadQueue();
const approval = queue.find((a) => a.id === approvalId);
if (!approval) {
throw new Error(`Approval not found: ${approvalId}`);
}
if (new Date(approval.expiresAt) < new Date()) {
throw new Error(`Approval expired: ${approvalId}`);
}
// Update approval
approval.status = "approved";
approval.approverId = approverId;
approval.approvedAt = new Date().toISOString();
// Remove from queue
const newQueue = queue.filter((a) => a.id !== approvalId);
fs.writeFileSync(APPROVAL_QUEUE_PATH, JSON.stringify(newQueue, null, 2));
// Add to history
addToHistory(approval);
// Now execute the original action
// In real system: trigger the action that was waiting
console.log(`[APPROVED] Action ${approvalId} approved by ${approverId}`);
return approval;
}
function denyAction(approvalId, deniedBy, reason) {
const queue = loadQueue();
const approval = queue.find((a) => a.id === approvalId);
if (!approval) {
throw new Error(`Approval not found: ${approvalId}`);
}
approval.status = "denied";
approval.deniedBy = deniedBy;
approval.denialReason = reason;
approval.deniedAt = new Date().toISOString();
// Remove from queue
const newQueue = queue.filter((a) => a.id !== approvalId);
fs.writeFileSync(APPROVAL_QUEUE_PATH, JSON.stringify(newQueue, null, 2));
// Add to history
addToHistory(approval);
console.log(`[DENIED] Action ${approvalId} denied by ${deniedBy}: ${reason}`);
return approval;
}
function getPendingApprovals() {
const queue = loadQueue();
return queue
.filter((a) => a.status === "pending")
.map((a) => ({
id: a.id,
reason: a.reason,
createdAt: a.createdAt,
expiresAt: a.expiresAt,
tool: a.action.tool,
userId: a.action.userId,
}));
}
function getApprovalHistory(daysBack = 7) {
if (!fs.existsSync(APPROVAL_HISTORY_PATH)) {
return [];
}
const history = JSON.parse(fs.readFileSync(APPROVAL_HISTORY_PATH, "utf8"));
const cutoff = new Date(Date.now() - daysBack * 24 * 60 * 60 * 1000);
return history.filter((a) => new Date(a.approvedAt || a.deniedAt) > cutoff);
}
function loadQueue() {
if (!fs.existsSync(APPROVAL_QUEUE_PATH)) {
return [];
}
return JSON.parse(fs.readFileSync(APPROVAL_QUEUE_PATH, "utf8"));
}
function addToHistory(approval) {
let history = [];
if (fs.existsSync(APPROVAL_HISTORY_PATH)) {
history = JSON.parse(fs.readFileSync(APPROVAL_HISTORY_PATH, "utf8"));
}
history.push(approval);
// Keep only last 1000 entries
if (history.length > 1000) {
history = history.slice(-1000);
}
fs.writeFileSync(APPROVAL_HISTORY_PATH, JSON.stringify(history, null, 2));
}
module.exports = {
approveAction,
denyAction,
getPendingApprovals,
getApprovalHistory,
};
Layer 4 & 5: Compliance Dashboard and Reporting
Build a dashboard that queries audit logs and generates compliance reports:
// .claude/governance/dashboard.js
const fs = require("fs");
const path = require("path");
const AUDIT_LOG_PATH = path.join(__dirname, "audit.log");
function parseAuditLog() {
if (!fs.existsSync(AUDIT_LOG_PATH)) {
return [];
}
const lines = fs
.readFileSync(AUDIT_LOG_PATH, "utf8")
.split("\n")
.filter((l) => l);
return lines
.map((line) => {
try {
return JSON.parse(line);
} catch (e) {
return null;
}
})
.filter(Boolean);
}
function getComplianceReport(daysBack = 7) {
const entries = parseAuditLog();
const cutoff = new Date(Date.now() - daysBack * 24 * 60 * 60 * 1000);
const recent = entries.filter((e) => new Date(e.timestamp) > cutoff);
return {
period: `Last ${daysBack} days`,
totalActions: recent.length,
allowedActions: recent.filter((e) => e.status === "ALLOWED").length,
blockedActions: recent.filter((e) => e.status === "BLOCKED").length,
pendingApprovals: recent.filter((e) => e.status === "PENDING_APPROVAL")
.length,
byTool: groupBy(recent, "tool"),
byUser: groupBy(recent, "userId"),
topBlocks: getMostBlockedPatterns(recent),
};
}
function getAuditTrail(userId = null, days = 7) {
const entries = parseAuditLog();
const cutoff = new Date(Date.now() - days * 24 * 60 * 60 * 1000);
let filtered = entries.filter((e) => new Date(e.timestamp) > cutoff);
if (userId) {
filtered = filtered.filter((e) => e.userId === userId);
}
return filtered.map((e) => ({
timestamp: e.timestamp,
user: e.userId,
tool: e.tool,
status: e.status,
reason: e.reason,
}));
}
function getRiskSummary() {
const entries = parseAuditLog();
const recentHour = entries.filter((e) => {
const time = new Date(e.timestamp);
return time > new Date(Date.now() - 60 * 60 * 1000);
});
const blocks = recentHour.filter((e) => e.status === "BLOCKED");
const riskLevel =
blocks.length > 10 ? "HIGH" : blocks.length > 5 ? "MEDIUM" : "LOW";
return {
riskLevel,
blocksInLastHour: blocks.length,
alertsNeeded: riskLevel !== "LOW",
topThreats: getMostBlockedPatterns(blocks).slice(0, 3),
};
}
function groupBy(items, key) {
return items.reduce((acc, item) => {
const group = item[key];
acc[group] = (acc[group] || 0) + 1;
return acc;
}, {});
}
function getMostBlockedPatterns(entries) {
const blocks = entries.filter((e) => e.status === "BLOCKED");
const reasons = groupBy(blocks, "reason");
return Object.entries(reasons)
.sort((a, b) => b[1] - a[1])
.slice(0, 10)
.map(([reason, count]) => ({ reason, count }));
}
module.exports = {
getComplianceReport,
getAuditTrail,
getRiskSummary,
parseAuditLog,
};
Common Governance Mistakes
Mistake 1: Over-Blocking
If you block too much, developers will find workarounds. They'll tunnel around your governance, use unapproved tools, or get frustrated and leave. Start permissive, gather data, then restrict based on actual risk. You're trying to prevent accidents, not audit every keystroke.
The goal is balance. Block truly dangerous patterns—force deletes, pushing to main without review, committing credentials. But don't block everything. If 90% of blocked commands are false positives, your policy is too aggressive. Developers will circumvent it. Governance only works when it protects without suffocating.
Mistake 2: Forgotten Approvals
If approval queues pile up and no one processes them, your system grinds to a halt. An agent wants to deploy a fix but can't because there are 47 pending approvals and nobody's reviewing them. The approvers are in meetings, or they forgot about the request, or they're overloaded.
Ensure approvers are notified immediately—Slack, email, PagerDuty integration. Set SLAs (respond within 30 minutes, etc.). Set up automated escalation for stale approvals. If an approval sits for an hour, escalate to the next level. If it sits for 4 hours, escalate to management. Make approval a first-class concern.
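The escalation rule described here can be sketched as a small function over the approval queue. This is an illustration under stated assumptions: `ESCALATION_TIERS` and `escalationLevel` are hypothetical names, and the queue entries are assumed to carry the `status` and `createdAt` fields written by the hook earlier. Note that those entries expire after 30 minutes as written, so the one-hour and four-hour thresholds only make sense if you lengthen the expiry.

```javascript
// Hypothetical escalation tiers, using the thresholds from the text:
// 1 hour -> next-level approver, 4 hours -> management.
// Ordered from longest to shortest so the first match is the highest tier.
const ESCALATION_TIERS = [
  { afterMinutes: 240, level: "management" },
  { afterMinutes: 60, level: "next-level" },
];

// Returns the escalation level for a pending approval, or null if none is due.
function escalationLevel(approval, now = new Date()) {
  if (approval.status !== "pending") return null;
  const ageMinutes = (now - new Date(approval.createdAt)) / 60000;
  const tier = ESCALATION_TIERS.find((t) => ageMinutes >= t.afterMinutes);
  return tier ? tier.level : null;
}

const checkTime = new Date("2024-01-01T12:00:00Z");
const pendingQueue = [
  { id: "a", status: "pending", createdAt: "2024-01-01T11:45:00Z" }, // 15 minutes old
  { id: "b", status: "pending", createdAt: "2024-01-01T10:30:00Z" }, // 90 minutes old
  { id: "c", status: "pending", createdAt: "2024-01-01T07:00:00Z" }, // 5 hours old
];
const escalations = pendingQueue.map((a) => [a.id, escalationLevel(a, checkTime)]);
console.log(escalations);
```

A scheduled job could run this over the queue every few minutes and hand any non-null result to whatever notification channel you use.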
Mistake 3: Stale Policies
Policies that made sense six months ago might not anymore. You shipped a new microservice, added new deployment patterns, changed your infrastructure. Your governance rules are now outdated or misaligned with reality.
Review quarterly. Update when business changes. When you refactor your infrastructure or adopt new tools, revisit governance. Keep governance relevant or it becomes theater—everyone ignores it because it doesn't reflect reality anymore.
Mistake 4: No Escape Hatch
Sometimes legitimate actions get blocked by overzealous policies. A real emergency happens at 3 AM. You need to disable a service NOW but governance requires approval from three people who are sleeping.
Build in an emergency override (with logging) for true emergencies. Track emergency overrides separately so you can review them. If you're using emergency overrides constantly, your governance is broken. But if you never use them and something goes horribly wrong, your governance failed you when it mattered most.
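One possible shape for such an escape hatch is sketched below. Everything here is an assumption for illustration: the `emergencyOverride` and `countOverrides` helpers, the `EMERGENCY_OVERRIDE` status, and the ten-character minimum for the justification are not part of the code shown earlier. The entry format mirrors the audit log lines so the same tooling can read both.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical break-glass helper: allows the action, but only with a written
// justification, and records it under a distinct status for later review.
function emergencyOverride(action, userId, justification, logPath) {
  if (!justification || justification.trim().length < 10) {
    throw new Error("Emergency override requires a justification (10+ characters)");
  }
  const entry = {
    timestamp: new Date().toISOString(),
    actionId: action.id,
    tool: action.tool,
    userId,
    status: "EMERGENCY_OVERRIDE",
    reason: justification.trim(),
  };
  fs.appendFileSync(logPath, JSON.stringify(entry) + "\n");
  return { toolInput: action.toolInput, allowExecution: true, logMessage: "Emergency override recorded" };
}

// Counting overrides makes "are we using this constantly?" answerable.
function countOverrides(logPath) {
  return fs
    .readFileSync(logPath, "utf8")
    .split("\n")
    .filter(Boolean)
    .filter((line) => JSON.parse(line).status === "EMERGENCY_OVERRIDE").length;
}

// Demo against a throwaway log file in the system temp directory.
const demoLog = path.join(fs.mkdtempSync(path.join(os.tmpdir(), "gov-")), "audit.log");
const overrideResult = emergencyOverride(
  { id: "act-1", tool: "bash", toolInput: { command: "systemctl stop api" } },
  "oncall-alice",
  "Service is flooding the database during an incident",
  demoLog
);
console.log(overrideResult.allowExecution, countOverrides(demoLog));
```

Requiring the justification up front keeps the override fast at 3 AM while still leaving something concrete to discuss in the review afterwards.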
Deploying Governance in Stages
Don't flip all governance on at once. Do it in stages:
Stage 1: Audit Mode (Week 1-2)
- Log all actions
- No blocks
- Understand baseline usage
- Identify which policies matter
Stage 2: Soft Enforcement (Week 3-4)
- Block obvious dangers (rm -rf, DROP TABLE)
- Approval gates for critical paths
- Review blocks weekly
- Adjust policies based on false positives
Stage 3: Full Enforcement (Week 5+)
- All policies active
- All approval gates enforced
- Regular compliance reviews
- Ongoing tuning based on team feedback
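The three stages can be driven by a single switch applied to whatever decision the hook produces. The sketch below assumes the decision shape used earlier (`allowExecution`, `requiresApproval`, `logMessage`); the `applyEnforcementMode` function, the mode names, and the `isObviousDanger` flag are hypothetical and would need to be wired to your own policy.

```javascript
// Hypothetical rollout switch: "audit" = Stage 1, "soft" = Stage 2, "full" = Stage 3.
const MODES = ["audit", "soft", "full"];

function applyEnforcementMode(decision, mode, isObviousDanger = false) {
  if (!MODES.includes(mode)) throw new Error(`Unknown enforcement mode: ${mode}`);
  // Full enforcement and already-allowed actions need no adjustment.
  if (mode === "full" || decision.allowExecution) return decision;

  const wouldHave = decision.requiresApproval ? "required approval" : "been blocked";
  if (mode === "audit") {
    // Stage 1: never block, but record what enforcement would have done.
    return { ...decision, allowExecution: true, requiresApproval: false, logMessage: `[audit-only] would have ${wouldHave}` };
  }
  // Stage 2: keep blocks for obvious dangers and keep approval gates; relax the rest.
  if (isObviousDanger || decision.requiresApproval) return decision;
  return { ...decision, allowExecution: true, logMessage: `[soft] would have ${wouldHave}` };
}

const blockedDecision = { allowExecution: false, logMessage: "Command blocked" };
const stageResults = MODES.map((mode) => [mode, applyEnforcementMode(blockedDecision, mode).allowExecution]);
console.log(stageResults);
```

Because audit mode rewrites the log message instead of dropping it, the Stage 1 logs already tell you how many actions each later stage would stop.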
Real-World Scenarios: Where Governance Saves You
Let's make this concrete. Here are scenarios where governance prevents disaster:
Scenario 1: The Tired Developer at 11 PM
It's late. A developer is debugging a production issue. A customer is complaining about slow queries. The developer asks Claude to "optimize the database." Claude reads an old runbook that mentions VACUUM FULL on production. It starts running the command.
Without governance: The command locks the entire table for 45 minutes. Your SLA is blown. Customers file complaints. You lose $50K in penalties.
With governance: The hook matches VACUUM FULL against an approval pattern (one you'd add to requires_approval, since the sample policy doesn't include it) and holds the command. The developer has to get explicit confirmation. A moment of sanity checking prevents disaster.
Scenario 2: The Innocent Mistake
A new team member is setting up the CI/CD pipeline. They ask Claude to "add automated deployments." Claude sees the request and adds a script that deploys directly to production on every commit.
Without governance: The script runs. Someone commits incomplete code. It's deployed to production. Services crash.
With governance: Assuming your CI/CD configuration is listed under write_protected_paths, the write is held for approval. Someone reviews it and says "wait, we need staging first." The mistake is caught before it does harm.
Scenario 3: The Supply Chain Attack
An attacker compromises a developer's machine. The attacker tells Claude to add code that exfiltrates user data to an external server. Claude generates the code.
Without governance: The code gets added, committed, deployed. User data flows to the attacker's server for weeks before discovery.
With governance: Every tool call the agent makes, including the file writes and the commit, lands in the audit log with a user and a timestamp. Your security team spots the unexpected external endpoint while reviewing the logs. The attack is detected and contained.
Building Governance Culture
Governance is technical, but it's also cultural. Your team needs to understand WHY governance exists. It's not "management watching over developers." It's "we have each other's backs to prevent mistakes."
When you implement governance, explain the reasoning:
- These protections prevent the kinds of mistakes that have hurt us before
- Approval gates give us confidence that risky changes have been reviewed
- Audit trails let us debug incidents and learn from them
- Cost and context limits on AI operations prevent runaway spend
When governance blocks something, explain what it blocked and why. When someone uses an emergency override, discuss it in standup. Build shared understanding that governance is a team practice, not a restriction imposed from above.
Integration with Existing Tools
Governance should integrate with your existing stack:
Slack Integration: When an approval is needed, send a Slack message. Approvers can approve/deny from Slack. When a policy is violated, post to #security for visibility.
PagerDuty Integration: Critical policy violations page on-call engineers. Stale approvals escalate. Real-time alerting.
GitHub Integration: Post validation results as PR comments. Block merges if critical policies are violated. Link approvals to commits.
CloudWatch Integration: Stream audit logs to CloudWatch. Set up metrics and alarms. Alert when violation rates spike.
Governance doesn't exist in isolation. It's part of your operational infrastructure. Integrate deeply so it feels natural.
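As a small example of the Slack case, the `notifyApprovers` stub from the hook could build a message like the one below. Slack incoming webhooks accept a JSON body with a `text` field; the message wording, the `buildApprovalMessage` and `postToSlack` names, and the use of Node 18+'s global `fetch` are assumptions for this sketch. Approving or denying from inside Slack needs interactive components and is not covered here.

```javascript
// Builds the JSON body for a Slack incoming webhook from an approval record
// shaped like the ones the hook writes to the approval queue.
function buildApprovalMessage(approval) {
  return {
    text: [
      ":warning: Approval required",
      `Reason: ${approval.reason}`,
      `Requested by: ${approval.action.userId}`,
      `Approval ID: ${approval.id}`,
      `Expires: ${approval.expiresAt}`,
    ].join("\n"),
  };
}

// Posting is kept separate so the formatter stays easy to test.
// Assumes Node 18+ (global fetch) and a webhook URL supplied by the caller.
async function postToSlack(webhookUrl, payload) {
  const res = await fetch(webhookUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  if (!res.ok) throw new Error(`Slack webhook returned ${res.status}`);
}

const slackMessage = buildApprovalMessage({
  id: "ap-1",
  reason: "Bash command requires approval: npm publish",
  expiresAt: "2024-01-01T12:30:00.000Z",
  action: { userId: "dev-42", tool: "bash" },
});
console.log(slackMessage.text);
```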
Monitoring and Evolving Governance
After you've deployed governance, monitor it:
Measure approval latency: How long do approvals take? If they're backing up, you have a staffing problem. If they're nearly instant, approvers may be rubber-stamping, or the gates may cover actions that don't need review.
Track block rate: What percentage of actions get blocked? If it's under 1%, your policies might be too lenient. If it's over 20%, too strict.
Monitor false positives: Are developers frustrated by blocks on safe operations? Adjust patterns to be more precise.
Review audit logs monthly: Look for patterns. Are certain operations consistently approved? Are certain patterns blocked over and over? Use the data to improve policy.
Governance is not static. It evolves as your organization grows and learns.
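Two of these measurements, block rate and approval latency, fall straight out of the files the earlier layers write. The sketch below assumes audit entries with a `status` field and approval-history records with `createdAt` plus `approvedAt` or `deniedAt`; the function names are hypothetical, and the choice of the median rather than the mean is a judgment call to keep one slow approval from dominating the number.

```javascript
// Block rate: share of audit entries with status BLOCKED (0 when there are none).
function blockRate(entries) {
  if (entries.length === 0) return 0;
  return entries.filter((e) => e.status === "BLOCKED").length / entries.length;
}

// Median approval latency in minutes. Records without usable timestamps are skipped.
function medianApprovalLatencyMinutes(history) {
  const latencies = history
    .map((a) => (new Date(a.approvedAt || a.deniedAt) - new Date(a.createdAt)) / 60000)
    .filter((minutes) => Number.isFinite(minutes))
    .sort((x, y) => x - y);
  if (latencies.length === 0) return null;
  const mid = Math.floor(latencies.length / 2);
  return latencies.length % 2 ? latencies[mid] : (latencies[mid - 1] + latencies[mid]) / 2;
}

const sampleEntries = [
  { status: "ALLOWED" },
  { status: "ALLOWED" },
  { status: "BLOCKED" },
  { status: "PENDING_APPROVAL" },
];
const sampleHistory = [
  { createdAt: "2024-01-01T10:00:00Z", approvedAt: "2024-01-01T10:05:00Z" },
  { createdAt: "2024-01-01T10:00:00Z", deniedAt: "2024-01-01T10:20:00Z" },
  { createdAt: "2024-01-01T10:00:00Z", approvedAt: "2024-01-01T10:10:00Z" },
];
const sampleBlockRate = blockRate(sampleEntries);
const sampleLatency = medianApprovalLatencyMinutes(sampleHistory);
console.log(sampleBlockRate, sampleLatency);
```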
Common Governance Mistakes and How to Avoid Them
You'll build your governance system and immediately start making mistakes. Let me show you the common ones and how to recover:
Mistake 1: Blocking Too Much, Too Early
Your first instinct is to block everything dangerous. You block all bash commands except a whitelist. You require approval for all database operations. You restrict file access heavily.
Result: Developers are frustrated. They have to jump through hoops for routine work. They start circumventing your governance—using unapproved workarounds, finding exploits, or just giving up and doing things manually.
The fix: Start with audit mode. Log everything, block nothing. See what your actual usage patterns are. Then add governance incrementally, blocking only patterns that have actually caused problems. You'll learn that 99% of your bash commands are innocent. Block the 1% that are dangerous.
Mistake 2: Orphaned Approvals
You implemented approval gates, but nobody reviews pending approvals. A developer pushes code and waits. And waits. And waits. The approval sits for six hours because the approver is in meetings.
Your governance system breaks trust. It's supposed to be helpful, but it's blocking legitimate work.
The fix: Make approval a first-class concern. Set SLAs: approvers must respond within 30 minutes. Use aggressive notifications—Slack, email, PagerDuty. If an approval hasn't been reviewed in 1 hour, escalate to the next level. If it hits 4 hours, escalate to management.
Also, distribute approval authority. Don't make one person the approver for everything. Spread it across the team so approvals move fast.
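The escalation ladder described above can be sketched as a pure function of how long an approval has been pending. The level names, and the idea of re-notifying the original approver once the 30-minute SLA is missed, are assumptions for illustration.

```typescript
// Level names are hypothetical; map them to whatever your notification system uses.
type EscalationLevel = "primary" | "reminder" | "next-level" | "management";

function escalationLevel(pendingMinutes: number): EscalationLevel {
  if (pendingMinutes >= 240) return "management"; // 4 hours: escalate to management
  if (pendingMinutes >= 60) return "next-level"; // 1 hour: escalate to the next level
  if (pendingMinutes >= 30) return "reminder"; // SLA missed: re-notify the approver
  return "primary"; // still within the 30-minute SLA
}
```

A scheduled job could call this for every pending approval and send a notification whenever the level changes.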
Mistake 3: Policies That Don't Reflect Reality
You defined a policy two years ago. Times have changed. You've adopted new tools, new deployment patterns, new risks. Your policies are now misaligned with how you actually work.
Result: Developers ignore policies that don't make sense. Governance becomes theater.
The fix: Review policies quarterly. Look at blocked actions from the past quarter. Ask: were any of these actually dangerous? Should we have blocked them? Or are these false positives?
When you ship new infrastructure or adopt new tools, immediately revisit governance. Your deployment infrastructure changed? Review approval policies. You switched to Kubernetes? Your file operation policies might not apply anymore.
Mistake 4: No Visibility Into Decisions
An action gets blocked and nobody knows why. It was blocked by a regex that matches something obscure. The developer has to dig through your policy file to understand.
Result: Developers view governance as arbitrary.
The fix: Make every block decision transparent. When you block an action, explain why: "This matches blocked pattern: DROP TABLE. This protects production data from accidental deletion." When you require approval, explain what you're asking for approval of.
Log everything with reasons. Your audit trail should be readable. "Action X was blocked at time Y because [specific reason]." This transparency builds trust.
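As a small illustration, a block decision can carry its own explanation so that the developer and the audit log see the same sentence. The `BlockDecision` shape and `explainBlock` helper are hypothetical names for this sketch.

```typescript
// Hypothetical decision record; field names are assumptions for this sketch.
interface BlockDecision {
  action: string; // what the agent tried to do
  pattern: string; // the policy pattern that matched
  rationale: string; // why the policy exists, in plain language
  timestamp: string; // ISO timestamp of the decision
}

function explainBlock(d: BlockDecision): string {
  return (
    `Action "${d.action}" was blocked at ${d.timestamp} ` +
    `because it matches blocked pattern: ${d.pattern}. ${d.rationale}`
  );
}
```

Keeping the rationale next to the pattern in your policy file means nobody has to reverse-engineer a regex to learn why it exists.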
Mistake 5: Governance Creep
You start with one governance hook. Then you add another. And another. Pretty soon you have 47 governance checks running on every action. Each one adds latency. Each one adds complexity. The system becomes a bottleneck.
Result: Your system grinds to a halt. Actions time out waiting for governance checks.
The fix: Consolidate. Don't have separate hooks for separate concerns. Have one master governance hook that routes to specific validators. Implement governance checks efficiently. Parallelize where possible.
Also, regularly audit your governance. Which checks actually prevent problems? Which are noise? Kill the noise.
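Here is a rough sketch of the "one master hook, many validators" idea. The `ToolCall` and `Verdict` shapes, the validator names, and the routing table are all assumptions made for illustration; this is not the hook interface Claude Code itself defines.

```typescript
interface ToolCall {
  tool: string; // e.g. "bash"
  input: string; // the command or payload being checked
}

interface Verdict {
  allowed: boolean;
  reason?: string;
}

type Validator = (call: ToolCall) => Verdict;

// Two illustrative validators; real ones would come from your policy engine.
const blockForcePush: Validator = (call) =>
  /git\s+push\b.*--force/.test(call.input)
    ? { allowed: false, reason: "Force push is blocked by policy" }
    : { allowed: true };

const blockDropTable: Validator = (call) =>
  /DROP\s+TABLE/i.test(call.input)
    ? { allowed: false, reason: "DROP TABLE is blocked by policy" }
    : { allowed: true };

// Routing table: only validators relevant to a tool run for that tool.
const validatorsByTool: Record<string, Validator[]> = {
  bash: [blockForcePush, blockDropTable],
};

function governanceHook(call: ToolCall): Verdict {
  const validators = validatorsByTool[call.tool] ?? [];
  for (const validate of validators) {
    const verdict = validate(call);
    if (!verdict.allowed) return verdict; // stop at the first violation
  }
  return { allowed: true };
}
```

Note that this sketch allows tools with no registered validators; you may prefer to fail closed instead. If validators needed I/O, they could return promises and be awaited together with `Promise.all`.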
Real-World Scenario: The Ransomware Defense
Here's a scenario where governance saved a company millions. An attacker compromised a developer's laptop. The attacker told Claude Code to add a data exfiltration script to the codebase and commit it.
Without governance: The script gets added, committed, and deployed. User data flows to the attacker's server for weeks before discovery. By the time you notice, you're reporting a breach to customers and regulators.
With governance: The governance hook logs every action. It sees "SSH connection to external IP 203.x.x.x" and logs it. It sees "write to sensitive file" and logs it. Your security team reviews logs in real time. They notice unusual patterns—commits happening at 3 AM from a developer who never commits then. They investigate, find the attacker, and shut it down within hours.
Governance isn't perfect protection, but it provides visibility. And visibility cuts attacks short, because attackers rely on going unnoticed.
Audit Logging in Detail
Your governance system is only as good as your audit logs. Let's talk about making audit logs that actually matter.
Every action should log:
{
  "timestamp": "2026-03-17T14:30:45Z",
  "actionId": "uuid",
  "userId": "developer-id",
  "tool": "bash",
  "action": "git push origin main --force",
  "status": "BLOCKED",
  "reason": "Force push to main branch is blocked by policy",
  "policy": "git_operations.block_force_push",
  "metadata": {
    "branch": "main",
    "force": true,
    "repository": "myapp"
  }
}
Notice we log the reason. We log the policy that triggered it. We log metadata that makes the action understandable. We DON'T log sensitive data (passwords, API keys).
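One way to keep secrets out of the log is to scrub the action string before writing it. The sketch below uses a single illustrative regex and a hypothetical `redact` helper; it is a starting point under those assumptions, not a complete secret scanner.

```typescript
// Illustrative pattern: a sensitive-looking key, a separator, then the value.
const SECRET_PATTERNS: RegExp[] = [
  /(password|passwd|secret|token|api[_-]?key)(\s*[=:]\s*)(\S+)/gi,
];

function redact(text: string): string {
  // Keep the key name and separator so the log stays readable; drop only the value.
  return SECRET_PATTERNS.reduce(
    (out, pattern) => out.replace(pattern, "$1$2[REDACTED]"),
    text,
  );
}
```

For example, `redact("curl -H api_key=abc123 https://x")` yields `curl -H api_key=[REDACTED] https://x`, while a command with no secrets passes through unchanged.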
Over time, this log becomes a searchable history. You can ask: "Show me all blocked operations involving git in the past week." Or: "Show me all actions by user X in the past month." This searchability is what makes governance actually useful for post-mortems and compliance.
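Those questions translate into simple filters. Here is a minimal in-memory sketch, assuming entries shaped like the example above; `AuditEntry`, `AuditQuery`, and `queryAudit` are hypothetical names, and a real deployment would more likely push these filters into a log store or database.

```typescript
interface AuditEntry {
  timestamp: string; // ISO timestamp
  userId: string;
  tool: string;
  action: string;
  status: string;
}

// Every field is optional; omitted fields don't filter.
interface AuditQuery {
  status?: string;
  userId?: string;
  actionContains?: string;
  since?: string; // ISO timestamp; only entries at or after this time match
}

function queryAudit(entries: AuditEntry[], q: AuditQuery): AuditEntry[] {
  return entries.filter(
    (e) =>
      (q.status === undefined || e.status === q.status) &&
      (q.userId === undefined || e.userId === q.userId) &&
      (q.actionContains === undefined || e.action.includes(q.actionContains)) &&
      (q.since === undefined || Date.parse(e.timestamp) >= Date.parse(q.since)),
  );
}
```

"All blocked operations involving git in the past week" then becomes `queryAudit(entries, { status: "BLOCKED", actionContains: "git", since: oneWeekAgo })`, where `oneWeekAgo` is an ISO timestamp you compute.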
Retention matters too. Keep logs for at least one year. Longer if you're in regulated industries. Consider storing logs in immutable storage (append-only) so they can't be tampered with if someone compromises the system.
Designing Approval Workflows
Approval is a human decision, but it should be structured. Here's a design that works:
type ApprovalThreshold = "low" | "medium" | "high" | "critical";

interface ApprovalPolicy {
  threshold: ApprovalThreshold;
  requiredApprovers: number; // How many distinct approvals are needed (the N in "N of M")
  approversPool: string[]; // Which users can approve
  timeoutMinutes: number; // How long to wait before escalating
  requireMFA: boolean; // Must approver use MFA?
  escalationPolicy: string; // Where to escalate if not approved in time
}

const approvalThresholds: Record<ApprovalThreshold, ApprovalPolicy> = {
  low: {
    threshold: "low",
    requiredApprovers: 1,
    approversPool: ["team-lead"],
    timeoutMinutes: 120,
    requireMFA: false,
    escalationPolicy: "manager",
  },
  medium: {
    threshold: "medium",
    requiredApprovers: 2,
    approversPool: ["team-lead", "tech-lead", "security-lead"],
    timeoutMinutes: 60,
    requireMFA: true,
    escalationPolicy: "director",
  },
  high: {
    threshold: "high",
    requiredApprovers: 3,
    approversPool: ["cto", "security-lead", "ops-lead"],
    timeoutMinutes: 30,
    requireMFA: true,
    escalationPolicy: "ceo",
  },
  critical: {
    threshold: "critical",
    // Both members of the pool must approve; requiring more approvers
    // than the pool contains could never be satisfied.
    requiredApprovers: 2,
    approversPool: ["cto", "ceo"],
    timeoutMinutes: 15,
    requireMFA: true,
    escalationPolicy: "nobody-approve-immediately",
  },
};
This structure makes approval workflows explicit. Different actions require different approval thresholds. Higher-risk actions require more (or more senior) approvers, shorter response windows, and stricter identity verification.
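A policy table like this is only useful if something checks it. The following sketch shows two small checks, assuming a trimmed-down `ApprovalRule` shape; `isApproved` and `isSatisfiable` are hypothetical helper names, not part of the policy definition above.

```typescript
// Only the fields these checks need; a subset of a fuller approval policy.
interface ApprovalRule {
  requiredApprovers: number;
  approversPool: string[];
}

// True once enough distinct, authorized people have approved.
function isApproved(rule: ApprovalRule, approvals: string[]): boolean {
  const valid = new Set(approvals.filter((a) => rule.approversPool.includes(a)));
  return valid.size >= rule.requiredApprovers;
}

// Sanity check for policy authors: can this rule ever be met?
function isSatisfiable(rule: ApprovalRule): boolean {
  return new Set(rule.approversPool).size >= rule.requiredApprovers;
}
```

Using a `Set` means the same person approving twice counts once, and approvals from people outside the pool are ignored. Running `isSatisfiable` over every policy at startup catches rules that ask for more approvers than the pool contains.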
Integrating Governance with Your Existing Tools
Governance shouldn't exist in isolation. It should integrate with your existing infrastructure:
Slack Integration: Post approval requests to Slack. Let approvers approve/deny from Slack without leaving the tool. Post governance alerts to a security channel.
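import { WebClient } from "@slack/web-api";

// Assumed minimal shape of an approval request; adapt to your own type.
interface Approval {
  id: string;
  reason: string;
}

// Assumes the bot token is provided in the SLACK_BOT_TOKEN environment variable.
const slack = new WebClient(process.env.SLACK_BOT_TOKEN);

async function notifySlackOfApproval(approval: Approval) {
  const message = {
    // Plain-text fallback, used where blocks can't be rendered (e.g. notifications)
    text: `Approval Required: ${approval.reason}`,
    blocks: [
      {
        type: "section",
        text: {
          type: "mrkdwn",
          text: `*Approval Required*\n${approval.reason}`,
        },
      },
      {
        type: "actions",
        elements: [
          {
            type: "button",
            text: { type: "plain_text", text: "Approve" },
            // The approval id is embedded so the click handler knows which request it resolves
            action_id: `approve_${approval.id}`,
            style: "primary",
          },
          {
            type: "button",
            text: { type: "plain_text", text: "Deny" },
            action_id: `deny_${approval.id}`,
            style: "danger",
          },
        ],
      },
    ],
  };
  await slack.chat.postMessage({
    channel: "#governance-approvals",
    ...message,
  });
}
PagerDuty Integration: Critical policy violations page on-call engineers. They get a notification immediately and can approve or escalate.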
GitHub Integration: Post governance decisions as PR comments. Block merges if critical policies are violated. Link approvals to commits.
Datadog/CloudWatch Integration: Stream audit logs to your observability platform. Set up metrics for violation rates. Alert when violation patterns spike (might indicate attack).
These integrations make governance part of your operational workflow, not separate from it.
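Whichever observability platform you use, the "alert on spikes" rule boils down to comparing the current violation count against a recent baseline. Here is a minimal sketch of that comparison; the `isSpike` name, the 3x factor, and the minimum count of 5 are arbitrary assumptions you would tune.

```typescript
// hourlyViolations: violation counts for recent hours (the baseline).
// current: the violation count for the hour being evaluated.
function isSpike(
  hourlyViolations: number[],
  current: number,
  factor = 3, // how many times the baseline counts as a spike (assumed default)
  minCount = 5, // ignore tiny absolute numbers to avoid noisy alerts (assumed default)
): boolean {
  if (current < minCount) return false;
  if (hourlyViolations.length === 0) return true; // no baseline yet: alert on any sizable count
  const mean =
    hourlyViolations.reduce((sum, n) => sum + n, 0) / hourlyViolations.length;
  return current > mean * factor;
}
```

In practice you would express the same rule in your platform's own alerting configuration; the function just shows the logic.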
The Psychology of Governance
Governance is technical, but it's also psychological. How you implement it affects whether teams embrace it or circumvent it.
Governance that feels like control gets circumvented. "Management is watching over us."
Governance that feels like safety gets embraced. "We have each other's backs to prevent mistakes."
The difference is communication and transparency. When you implement a policy, explain why: "We block DROP TABLE because we've had two incidents where someone accidentally deleted production data. This protects us from that happening again." Developers understand. They agree. The policy feels protective, not restrictive.
When you implement an approval gate, explain the intent: "We require approval for production deployments because one developer's mistake can impact thousands of customers. Getting a second pair of eyes catches mistakes before they cause damage." People understand the value.
When you block an action, thank the person: "Thanks for trying to optimize the database! This optimization pattern matches one that caused outages in the past. Let's discuss alternative approaches in Slack."
Governance backed by reasoning and humanity beats governance backed by rules and fear.
The Hidden Why
Governance isn't bureaucracy—it's safety at scale. When you have one agent, governance is nice-to-have. When you have dozens of agents across multiple teams, governance is essential. It prevents accidents. It enforces standards. It creates audit trails for compliance. It gives you visibility into what's happening.
A governance layer transforms Claude Code from a powerful-but-risky tool into a production-grade system you can trust with real infrastructure. It's the difference between "we hope nothing bad happens" and "we've built systems that catch bad things before they do damage."
Think of governance as insurance. You probably won't need it today. But when something goes wrong—and something will—you'll be glad it's there. More importantly, governance shapes behavior. When agents know that dangerous actions require approval, they think twice. When audit trails are permanent, people are more careful. When policies are transparent, everyone understands the boundaries.
The best governance systems feel less like control and more like guidance. They protect team members from making mistakes, not from doing their jobs.
-iNet