March 10, 2026
Infrastructure DevOps

OpenClaw Backup and Disaster Recovery: Protecting Your Agent's Brain

You know that moment. You're three months deep into training an agent. It's learned your workflows, memorized your preferences, figured out shortcuts that save you hours. And then—hard drive failure, corrupted config, accidental deletion—it's gone. All of it. Back to square one.

I want to make sure that never happens to you.

This article is about protecting what matters: your agent's entire brain. Not just the code or the skills (those are replaceable), but the learned patterns, the memory files, the personalized configuration, the relationships it's built. That's the irreplaceable stuff.

We're going to talk about what to back up, how to automate it, and most importantly, how to recover when things go sideways. Let's start with the stakes.

Table of Contents
  1. What You're Actually Protecting
  2. The Three Backup Tiers
  3. Tier 1: Daily Snapshots (On-Machine)
  4. Tier 2: Weekly Off-Machine Backups (Cloud Storage)
  5. Tier 3: Version Control (Git-Based Workspace Versioning)
  6. Why Three Tiers?
  7. The Recovery Playbook
  8. Scenario 1: Corrupted Config (Happens Today)
  9. Scenario 2: Accidental Deletion of Workspace Files (Happened Yesterday)
  10. Scenario 3: Hard Drive Failure (Happened Last Week)
  11. Scenario 4: Ransomware or Malicious Deletion (The Nightmare)
  12. Understanding Backup Overhead vs. Recovery Cost
  13. Backup Verification: Trust, But Verify
  14. Monitoring: Know When Things Break
  15. The Complete Backup Automation Stack
  16. The Recovery Decision Tree
  17. What Not to Back Up
  18. Conclusion: Paranoia is a Feature

What You're Actually Protecting

Before we get technical, let's be clear about what we're protecting:

Critical Assets:

  1. Workspace files (~/.openclaw/workspace/)

    • SOUL.md (personality/identity)
    • AGENTS.md (subagent definitions)
    • USER.md (your preferences)
    • COMMANDS.md (custom shortcuts)
    • Workspace-specific skills
    • Current state snapshots

These are the living documents of your agent. SOUL.md is particularly precious—it contains the core identity and behavioral patterns that make your agent yours. If you lose this, you're not just regenerating files; you're rebuilding a personality from scratch. That's not a small ask. It's the difference between "I need to restore a file" and "I need to spend three weeks redefining my agent's entire personality."

I've watched people downplay this. They say "oh, I can just rewrite my agent's personality quickly." No you can't. The personality you've built through months of refinement—the way your agent learns from your feedback, the specific tone it uses, the workflows it's internalized—that's institutional knowledge. It's the accumulated result of hundreds of micro-decisions you made while training it. You can get some of it back, sure, but you'll spend weeks noticing subtle differences. Your agent will be close, but not quite right. It'll be like a familiar room that's been rearranged—technically the same space, but something feels off. The stakes here are real.

  2. Memory files (~/.openclaw/memory/)

    • Daily logs (what your agent learned)
    • Fact registry (verified information)
    • Learned patterns (behavioral insights)
    • User-specific memory
    • All of this is trained knowledge that you can't easily recreate

This is the most precious asset. Unlike the code (which you can check out from version control), memory files represent months of accumulated learning. Every interaction, every mistake corrected, every pattern refined. You literally cannot recreate this from source code. If a developer says "we can just regenerate it," they don't understand the value of what they're losing. This is where your agent becomes your agent, customized to your voice and your workflows.

  3. Configuration (~/.openclaw/openclaw.json)

    • Master configuration
    • Skill directories
    • Backup settings
    • Extension definitions
    • API endpoints and runtime settings

This file defines how your entire agent system works. It's the glue connecting all components. Corruption here cascades through everything else. A single typo in this file can bring the whole system down, making every other backup useless because you can't even start the agent to restore from them.

  4. Credentials (~/.openclaw/credentials/)
    • Encrypted API keys
    • Auth tokens
    • Database credentials
    • These are encrypted but still irreplaceable

Losing credentials means you need to regenerate them from each service. Not just tedious—potentially expensive if those services charge per key generation. And some keys can't be recovered at all. I've seen people lose access to services because the only copy of their API key was on a dead hard drive.

Replaceable Assets (can be regenerated):

  • Cached skill results (~/.openclaw/cache/)
  • Compiled skills
  • Downloaded ClawHub packages
  • Temporary files
  • Index databases that can be rebuilt

Here's the brutal truth: if you lose the first group, you're rebuilding from scratch. If you lose the second group, it's an inconvenience. So our backup strategy focuses on protecting group 1 ruthlessly. We're not wasting time and storage backing up things that are easy to regenerate.

The three-tier backup approach I'm about to outline is designed specifically around this distinction. We're not backing up everything equally; we're protecting what matters most with the appropriate redundancy. This is the professional approach: tiered backups, with the most critical assets getting the most protection.

The Three Backup Tiers

Different needs require different approaches. You don't back up a payroll system the same way you back up a photo library. For OpenClaw, we use a three-tier system that balances cost, complexity, and recovery time:

  • Tier 1: Frequent, local snapshots for quick recovery from everyday mistakes
  • Tier 2: Off-machine backups for hardware failure scenarios
  • Tier 3: Version control for understanding changes and rolling back problematic updates

Let's dive into each one. Each tier solves a different problem, and together they create defense-in-depth protection.

Tier 1: Daily Snapshots (On-Machine)

Purpose: Recover from yesterday's accidental deletion or corruption
Frequency: Daily, automated
Retention: 7 days
Recovery Time: Minutes
Cost: Minimal (disk space only)

This is your first line of defense. Every night, a copy of your entire OpenClaw workspace gets compressed and stored locally. It's the fastest recovery mechanism and catches the most common failure scenarios: you deleted something accidentally, a script corrupted a file, configuration got borked during an update.

The beauty of daily snapshots is that they're completely automated. You set them up once and forget about them. No network calls, no cloud services, just a local tar.gz file sitting on your disk. The entire system runs in the background, silently protecting you every night.

Setup:

bash
# Create a backup directory structure
mkdir -p ~/.openclaw/backup/daily
mkdir -p ~/.openclaw/backup/archive
 
# Schedule daily backup via cron (Unix/Mac)
# Add this line to your crontab (crontab -e):
0 2 * * * ~/.openclaw/bin/backup-daily.sh
 
# Or use systemd timer on Linux
# Or Task Scheduler on Windows (see below)

The timing matters. Running at 2 AM means it doesn't interfere with your actual work. The backup window is usually 2-5 minutes, so you want it when the agent isn't actively running and you're asleep.

The backup script looks like this:

bash
#!/bin/bash
# ~/.openclaw/bin/backup-daily.sh
 
BACKUP_DIR="$HOME/.openclaw/backup/daily"
ARCHIVE_DIR="$HOME/.openclaw/backup/archive"
TIMESTAMP=$(date +%Y-%m-%d)
BACKUP_FILE="$BACKUP_DIR/openclaw-$TIMESTAMP.tar.gz"
 
# Create daily backup (compress critical assets only)
tar -czf "$BACKUP_FILE" \
  -C "$HOME" \
  .openclaw/workspace \
  .openclaw/memory \
  .openclaw/openclaw.json \
  .openclaw/credentials
 
# Clean up backups older than 7 days
# This prevents disk space from growing unbounded
find "$BACKUP_DIR" -name "openclaw-*.tar.gz" -mtime +7 -delete
 
# Keep last backup in archive for easy access
cp "$BACKUP_FILE" "$ARCHIVE_DIR/latest-backup.tar.gz"
 
# Log completion (useful for monitoring)
echo "Backup completed: $BACKUP_FILE ($(du -h "$BACKUP_FILE" | cut -f1))" >> "$HOME/.openclaw/backup/backup.log"

The key insight here: we're compressing specifically the files we care about. We're excluding cache and temporary directories to keep the backup size manageable. A typical backup is 50-200MB, depending on memory size. Even with seven daily backups, that's only 350-1400MB—trivial by today's standards.

This is the hidden layer of backup thinking: you could back up everything, but you'd waste storage and slow down recovery. Instead, be intentional. Backup what matters. Let the cache regenerate. This is efficiency.
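If you'd rather archive the whole ~/.openclaw tree and carve out the regenerable directories instead, GNU tar's --exclude flags express the same intent. A sketch, assuming cache/ and tmp/ are the only regenerable directories on your install (adjust the exclude list to your layout):

```shell
# Alternative daily backup: everything under ~/.openclaw except regenerable
# data, and except the backup directory itself (don't archive old archives).
mkdir -p "$HOME/.openclaw/backup/daily"
tar -czf "$HOME/.openclaw/backup/daily/openclaw-$(date +%Y-%m-%d).tar.gz" \
  -C "$HOME" \
  --exclude='.openclaw/backup' \
  --exclude='.openclaw/cache' \
  --exclude='.openclaw/tmp' \
  .openclaw
```

The upside is that new critical files get picked up automatically; the downside is you have to remember to extend the exclude list whenever a new cache-like directory appears.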

For Windows (PowerShell):

powershell
# C:\Users\YourName\.openclaw\bin\backup-daily.ps1
 
$BackupDir = "$env:USERPROFILE\.openclaw\backup\daily"
$ArchiveDir = "$env:USERPROFILE\.openclaw\backup\archive"
$Timestamp = Get-Date -Format "yyyy-MM-dd"
$BackupFile = "$BackupDir\openclaw-$Timestamp.zip"
 
# Create backup with compression
$FilesToBackup = @(
    "$env:USERPROFILE\.openclaw\workspace",
    "$env:USERPROFILE\.openclaw\memory",
    "$env:USERPROFILE\.openclaw\openclaw.json",
    "$env:USERPROFILE\.openclaw\credentials"
)
 
# Compress-Archive is built-in, cross-platform
Compress-Archive -Path $FilesToBackup -DestinationPath $BackupFile -Force
 
# Clean up old backups (older than 7 days)
$cutoffDate = (Get-Date).AddDays(-7)
Get-ChildItem "$BackupDir\openclaw-*.zip" | Where-Object { $_.LastWriteTime -lt $cutoffDate } | Remove-Item
 
# Keep a latest copy for easy access
Copy-Item $BackupFile "$ArchiveDir\latest-backup.zip" -Force
 
Write-Host "Backup completed: $BackupFile"

Schedule this with Task Scheduler:

  1. Open Task Scheduler
  2. Create Basic Task
  3. Set trigger to Daily at 2 AM
  4. Set action to run PowerShell script: powershell.exe -ExecutionPolicy Bypass -File C:\Users\YourName\.openclaw\bin\backup-daily.ps1
  5. Check "Run with highest privileges" only if your backup paths need elevated access (backing up your own profile usually doesn't)

This setup means you always have at least one backup from the past week. If you accidentally delete something on Wednesday, you can recover from Tuesday's backup. Seven days is long enough to catch most mistakes but short enough that storage isn't a concern.

The psychology of daily backups is important: they protect you from the most common failures, which are the ones you cause. Accidental deletion, borked config edits, experiments that went wrong. These happen frequently. Daily backups mean you recover quickly, often within minutes.

Tier 2: Weekly Off-Machine Backups (Cloud Storage)

Purpose: Survive local hardware failure
Frequency: Weekly, automated
Retention: 4 weeks (one month)
Recovery Time: Hours (depends on file size and connection speed)
Cost: Depends on provider (Google Drive/Dropbox are cheap or free for small backups)

Daily backups won't help if your hard drive dies. You need copies elsewhere. This is where a second machine, cloud storage, or a NAS comes in. The idea is simple: once a week, copy your latest daily backup somewhere off-site.

Why weekly instead of daily? Because off-site transfer takes longer and costs more (bandwidth). Weekly is a good balance. If you need more frequent cloud backups, you can adjust, but weekly catches most hardware failure scenarios. The math: if you back up weekly and your drive fails on the day before backup, you lose at most 6 days of data. Most people can tolerate six days of data loss. If you need better, daily cloud backups are possible—just be aware of bandwidth costs.

Option A: Using rsync (to another machine on your network):

bash
#!/bin/bash
# ~/.openclaw/bin/backup-weekly.sh
 
BACKUP_DIR="$HOME/.openclaw/backup/daily"
REMOTE_USER="you"
REMOTE_HOST="backup-server.example.com"
REMOTE_PATH="/backups/openclaw"
 
# Get the most recent backup
LATEST_BACKUP=$(ls -t "$BACKUP_DIR"/openclaw-*.tar.gz | head -1)
 
# Sync latest backup to remote server
# -avz: archive, verbose, compress
# (note: --delete only applies when syncing whole directories, not single
#  files; old remote backups are pruned by the ssh command below instead)
rsync -avz \
  "$LATEST_BACKUP" \
  "$REMOTE_USER@$REMOTE_HOST:$REMOTE_PATH/"
 
# Keep only last 4 weekly backups on remote (saves space)
ssh "$REMOTE_USER@$REMOTE_HOST" \
  "ls -1t $REMOTE_PATH/openclaw-*.tar.gz | tail -n +5 | xargs rm -f"
 
echo "Weekly backup synced to $REMOTE_HOST at $(date)"

Add to crontab (typically Sunday at 3 AM):

0 3 * * 0 ~/.openclaw/bin/backup-weekly.sh

Rsync only transfers data the remote doesn't already have. The first run sends the whole archive; after that, each weekly run sends just the one new backup file. It's efficient and doesn't waste bandwidth.

Option B: Using rclone (to cloud storage):

rclone is magical for cloud backups. It works with Google Drive, Dropbox, AWS S3, OneDrive, and dozens of services. The advantage: you're using your existing cloud storage, probably free or cheap. Most people already have 100GB+ of unused quota on Google Drive. Why not use it?

bash
#!/bin/bash
# ~/.openclaw/bin/backup-cloud.sh
 
BACKUP_DIR="$HOME/.openclaw/backup/daily"
LATEST_BACKUP=$(ls -t "$BACKUP_DIR"/openclaw-*.tar.gz | head -1)
 
# Upload to Google Drive (or your configured rclone remote)
rclone copy "$LATEST_BACKUP" "gdrive:OpenClaw-Backups/"
 
# Clean up old backups in cloud (keeps only 4 weeks)
rclone delete "gdrive:OpenClaw-Backups" \
  --min-age 28d \
  --include "openclaw-*.tar.gz"
 
echo "Backup synced to Google Drive at $(date)"

First-time rclone setup:

bash
# Interactive configuration
rclone config
 
# You'll be prompted to:
# 1. Name the remote (e.g., "gdrive")
# 2. Choose provider (Google Drive, etc.)
# 3. Authenticate by opening a browser link
# 4. Grant access to your drive
 
# Test connection
rclone ls gdrive:
rclone mkdir gdrive:OpenClaw-Backups

The configuration process is surprisingly smooth. For Google Drive, it opens a browser window, you authorize rclone, and you're done. No API keys to manage, no OAuth headaches. It just works.

Important: Keep your rclone config encrypted. The credentials stored there are valuable—they're basically keys to your cloud storage.

bash
# Encrypt rclone config with GPG
gpg --symmetric ~/.config/rclone/rclone.conf
# Remove the unencrypted version
rm ~/.config/rclone/rclone.conf
 
Now your backup script needs to decrypt the config before use:

bash
#!/bin/bash
# Decrypt config before backup. Symmetric gpg prompts for the passphrase,
# so run this interactively or let gpg-agent cache it for cron use.
gpg --decrypt ~/.config/rclone/rclone.conf.gpg > ~/.config/rclone/rclone.conf
 
# Run rclone...
BACKUP_DIR="$HOME/.openclaw/backup/daily"
LATEST_BACKUP=$(ls -t "$BACKUP_DIR"/openclaw-*.tar.gz | head -1)
rclone copy "$LATEST_BACKUP" "gdrive:OpenClaw-Backups/"
 
# Remove decrypted config immediately
rm ~/.config/rclone/rclone.conf

This way, your rclone credentials are always encrypted at rest. The decrypted config exists on disk only during the backup window, just long enough to upload. It's a small extra step but meaningfully more secure.

The off-machine backup solves the "my hardware died" scenario. You get a new laptop, install OpenClaw, download the backup from Google Drive (or your remote server), extract it, and you're back in business with minimal data loss (at most one week of memory). This is the difference between "I need three days to recover" and "I'm back to work in three hours."

Tier 3: Version Control (Git-Based Workspace Versioning)

Purpose: Track changes, enable rollback to specific versions, investigate what changed
Frequency: On-demand (you control it, typically after major changes)
Retention: Unlimited (git history persists)
Recovery Time: Seconds to minutes
Cost: Free (if using local git) or cheap (GitHub, GitLab)

The third tier isn't really a backup in the traditional sense. It's version history. Here's the difference: backup/restore is about recovering from disaster. Version control is about understanding what changed and being able to roll back intentional changes that broke things.

Initialize your workspace as a git repository:

bash
cd ~/.openclaw/workspace
git init
git config user.name "OpenClaw-Agent"
git config user.email "agent@example.com"
 
# Create initial commit
git add .
git commit -m "Initial workspace snapshot: $(date +%Y-%m-%d)"

Now whenever you make significant changes, commit them:

bash
# After updating AGENTS.md with new subagents
git add AGENTS.md
git commit -m "Add email-processing subagent"
 
# After major SOUL.md update
git add SOUL.md
git commit -m "Refine agent personality: increase creativity, improve risk awareness"
 
# After testing new skills
git add skills/
git commit -m "Add web-search and document-analysis skills from ClawHub"

Then you have full history:

bash
# View commit history
git log --oneline
 
# See what changed in a specific commit
git show abc1234:SOUL.md
 
# See all changes between two versions
git diff HEAD~5 HEAD
 
# See who changed what (blame)
git blame AGENTS.md
 
# Revert a specific file to a previous version
git checkout HEAD~3 -- SOUL.md
 
# See what files were changed in each commit
git log --name-status --oneline

The killer feature: if you make a risky change and it breaks things, you can revert in seconds. This is where the hidden layer of backup thinking comes in again. Many people think of git as just version control, but in the context of backup, it's forensic archaeology. You can see exactly what changed, when, and potentially why (if your commit messages are good).

bash
# Something broke. What happened?
git diff
 
# That change looks bad. Let's see the full history
git log --oneline AGENTS.md
 
# Revert the agent definition to last known good state
git checkout HEAD~2 -- AGENTS.md
 
# Now your agent starts working again

Push to remote (GitHub, GitLab, Gitea, etc.):

bash
# Create repository on GitHub first, then:
git remote add origin https://github.com/yourname/openclaw-workspace.git
git branch -M main
git push -u origin main
 
# Subsequent pushes
git push
 
# Set up automatic daily pushes if desired

Keep this repository private. It contains learned patterns, possibly sensitive configuration, and details about how your agent works. You don't want that on the public internet. Think of this git repo as your agent's source code. You wouldn't make your proprietary code public.

For most people, local git is fine. You have the history on your machine. If you want extra redundancy, push to GitHub (private repo) and you get automatic backups of your git history too. This is defense in depth: your local git history is Tier 1, your GitHub backup is Tier 2, and your off-machine backups are Tier 3. You're protected multiple ways.
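The "automatic daily pushes" comment above can be made concrete with a small auto-commit script on the same cron schedule as the daily snapshots. This is a sketch, not something OpenClaw ships; the script path, commit message format, and main branch name are my assumptions:

```shell
#!/bin/bash
# Hypothetical ~/.openclaw/bin/git-autocommit.sh: daily auto-snapshot of the
# workspace repo. Safe to run from cron; does nothing when there are no changes.
if [ -d "$HOME/.openclaw/workspace/.git" ]; then
    cd "$HOME/.openclaw/workspace"
    # Commit only when something actually changed, so history stays readable
    if [ -n "$(git status --porcelain)" ]; then
        git add -A
        git commit -m "Auto-snapshot: $(date +%Y-%m-%d)"
        # Push is best-effort: skipped quietly if no remote is configured
        git push origin main 2>/dev/null || true
    fi
fi
```

Auto-commits give you a coarser history than hand-written messages, so many people run both: automatic daily snapshots plus deliberate commits after major changes.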

Why Three Tiers?

This multi-tier approach isn't overkill. Each tier solves a different problem:

  • Tier 1 (Daily snapshots): Catches accidents and everyday mistakes. Fast recovery (minutes). Requires no planning, just runs automatically.
  • Tier 2 (Weekly cloud): Catches hardware failure, theft, or fire. Slower recovery (hours) but geographic redundancy. Protects you against losing your physical machine.
  • Tier 3 (Git history): Catches intentional changes that went wrong. Lightning-fast recovery, but requires understanding what changed. Prevents you from having to search through backup files to figure out which day things were still good.

Together, they cover every realistic disaster scenario. And they're cheap—total setup time is a couple hours, and ongoing cost is essentially nothing (just disk space and maybe some cloud storage for weekly backups).

The philosophy here matters: you're protecting against different classes of failures. Each tier adds a different layer of resilience. This is how professional systems are designed—multiple, independent protection mechanisms so a single failure point doesn't wipe you out.

The Recovery Playbook

Now, the moment of truth: something goes wrong. Here's how you recover.

Scenario 1: Corrupted Config (Happens Today)

You edited openclaw.json and now your agent won't start. This is probably the most common failure.

Recovery Steps:

  1. Check the latest backup:
bash
ls -lah ~/.openclaw/backup/daily/
# You should see files like:
# openclaw-2026-03-17.tar.gz
# openclaw-2026-03-16.tar.gz
# etc.
  2. Restore the config file only (fastest recovery):
bash
# Extract just the config from yesterday's backup
# (archive paths are relative to $HOME, so extract with -C "$HOME")
tar -xzf ~/.openclaw/backup/daily/openclaw-2026-03-16.tar.gz \
  -C "$HOME" \
  .openclaw/openclaw.json
 
# Or to a temp location first for inspection
mkdir /tmp/restore
tar -xzf ~/.openclaw/backup/daily/openclaw-2026-03-16.tar.gz \
  -C /tmp/restore \
  .openclaw/openclaw.json
 
# Compare the old config to your current one
diff /tmp/restore/.openclaw/openclaw.json ~/.openclaw/openclaw.json
  3. Validate the restored config:
bash
openclaw config validate
# If this passes, you're good to restart
  4. Restart your agent:
bash
openclaw restart

If you're using git and keep a copy of openclaw.json inside the workspace repo (git can only track files inside the repository, and the live config lives one directory up, so keep a synced copy or symlink):

bash
cd ~/.openclaw/workspace
git log --follow -- openclaw.json
# See the history of config changes
git show HEAD~1:openclaw.json > /tmp/openclaw.json.good
# Compare it to current
diff /tmp/openclaw.json.good ~/.openclaw/openclaw.json

Time to recover: 2-5 minutes

This is the bread-and-butter recovery scenario. Most of the time, you just restore yesterday's config and you're back in business. The key is having that backup there, waiting.
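That openclaw config validate call assumes the CLI exposes a validate subcommand. If your build lacks one, a bare JSON parse is a reasonable stand-in; it catches the truncated-file and stray-comma corruption behind most config failures (python3 assumed available):

```shell
# Fallback syntax check: python3 -m json.tool exits non-zero when the
# JSON is malformed.
if python3 -m json.tool "$HOME/.openclaw/openclaw.json" > /dev/null 2>&1; then
    echo "openclaw.json parses cleanly"
else
    echo "openclaw.json is missing or malformed -- restore an older copy"
fi
```

Note this only validates syntax, not whether the settings inside make sense.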

Scenario 2: Accidental Deletion of Workspace Files (Happened Yesterday)

You deleted your SOUL.md and can't remember what it said. The agent's not acting right.

Recovery Steps:

  1. List available backups:
bash
ls -1 ~/.openclaw/backup/daily/*.tar.gz
# Shows you which dates are available
  2. Figure out when you deleted the file. If you can't remember, check git first:
bash
cd ~/.openclaw/workspace
git log --follow SOUL.md
# Shows you when it was last committed
# If you committed this week, git is your friend
  3. If not in git, extract from a daily backup:
bash
mkdir /tmp/restore
# Try today's backup
tar -xzf ~/.openclaw/backup/daily/openclaw-2026-03-17.tar.gz \
  -C /tmp/restore
 
# Check if the file is there
ls -la /tmp/restore/.openclaw/workspace/SOUL.md
 
# If it's already deleted in today's backup, try yesterday
tar -xzf ~/.openclaw/backup/daily/openclaw-2026-03-16.tar.gz \
  -C /tmp/restore
  4. Review the recovered file:
bash
cat /tmp/restore/.openclaw/workspace/SOUL.md
# Make sure it looks right before restoring
  5. Restore it:
bash
cp /tmp/restore/.openclaw/workspace/SOUL.md ~/.openclaw/workspace/
  6. Verify everything still works:
bash
openclaw status
openclaw test
# Run a quick sanity check

Time to recover: 5-10 minutes

The key here is not panicking. You have seven days of backups. Even if you don't notice the deletion for three days, you've got four days of backups to search through. This is the safety margin that backup redundancy provides.
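If you genuinely can't remember which day still had the file, you don't have to extract backups one by one: tar -t lists an archive's contents without unpacking it, so a short loop can find the newest backup that still contains the file. A sketch, using this scenario's SOUL.md as the target:

```shell
#!/bin/bash
# Walk the daily archives newest-first and report the first one containing
# the target. Paths inside the archives are relative to $HOME because the
# backups were created with -C "$HOME".
TARGET=".openclaw/workspace/SOUL.md"

for b in $(ls -t "$HOME"/.openclaw/backup/daily/openclaw-*.tar.gz 2>/dev/null); do
    if tar -tzf "$b" | grep -qx "$TARGET"; then
        echo "Found in: $b"
        break
    fi
done
```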

Scenario 3: Hard Drive Failure (Happened Last Week)

Your machine died. You have a new one. You need everything back.

Recovery Steps:

  1. Install OpenClaw on new machine:
bash
curl https://install.openclaw.io | bash
# or however your installation works
  2. Download latest backup from cloud:
bash
# If using rclone
rclone copy "gdrive:OpenClaw-Backups/openclaw-2026-03-16.tar.gz" ~/
 
# Or download from your backup server
scp you@backup-server.example.com:/backups/openclaw/openclaw-2026-03-16.tar.gz ~/
 
# Or manually download from Google Drive/Dropbox
  3. Extract to home directory:
bash
cd ~
tar -xzf openclaw-2026-03-16.tar.gz
# This unpacks everything to ~/.openclaw/
  4. Verify restoration:
bash
openclaw status
# Should show everything is loaded
openclaw memory stats
# Should show your accumulated memory
  5. Test a skill to make sure everything works:
bash
openclaw skill list
# Should show your skills
 
openclaw skill test web-search
# Run a quick test on a skill
  6. Check that git history is restored (if you used it):
bash
cd ~/.openclaw/workspace
git log --oneline
# You should see your full history

Time to recover: 15-30 minutes (plus download time)

This is the big one, but it's remarkably straightforward. You're literally just restoring a tar.gz file. Modern machines are fast; extraction is not the bottleneck. The bottleneck is probably downloading the file from cloud storage, which depends on your internet speed.

Pro tip: test this recovery process once a year. Actually restore a backup to a test machine to make sure everything works. I know it sounds paranoid, but untested backups are just wasted storage. Test it and you'll have confidence when you really need it. This is where the hidden layer thinking matters—testing your backups is not paranoia, it's the difference between a backup system that works and a backup system that fails when you need it most.

Scenario 4: Ransomware or Malicious Deletion (The Nightmare)

Your agent has been compromised. Multiple backup tiers give you options. This is actually the scenario where git history saves you.

Recovery Plan:

  1. Immediate: Stop the agent from running and isolate it
bash
pkill -f openclaw
# Make sure it's really stopped
ps aux | grep openclaw
  2. Assess the damage: What was actually modified?
bash
cd ~/.openclaw/workspace
git status
# Shows modified files
 
git diff
# Shows exact changes made
  3. Review git history for suspicious commits:
bash
git log --pretty=fuller --all
# See all commits with full details
 
git show <suspicious-commit>
# View the exact changes in a commit
  4. Restore from known-good version:
bash
# Find the last good commit (before the compromise)
git log --oneline
# Say the compromise happened in commit abc1234
 
# Go back to commit before that
git reset --hard <good-commit-hash>
 
# This restores all files to that point
  5. Rebuild memory from backup:
bash
# Extract memory from a backup taken before the compromise
# (archive paths are relative to $HOME, so extract there)
tar -xzf ~/.openclaw/backup/daily/openclaw-2026-03-10.tar.gz \
  -C "$HOME" \
  .openclaw/memory
  6. Rotate credentials immediately:
bash
# Your API keys might be compromised
openclaw credentials rotate all
 
# Regenerate all API keys from scratch
# This takes a bit of work but is essential
  7. Check for persistence (how did they get in?):
bash
# Look for unusual cron jobs
crontab -l
systemctl list-timers
 
# Check for unknown SSH keys
cat ~/.ssh/authorized_keys
 
# Look for unusual processes
ps auxww | sort
 
# Check logs for suspicious activity
tail -100 /var/log/auth.log # Linux
# or system logs on Mac/Windows
  8. Harden your system:
bash
# Change your passwords
# Enable two-factor authentication everywhere
# Update OpenClaw and all dependencies
openclaw self-update
 
# Review firewall rules
sudo iptables -L # Linux
# or Windows Defender settings
 
# Consider moving backups to read-only storage
# This prevents ransomware from touching them

Time to recover: 1-2 hours (depending on severity and forensic investigation)

This scenario is why I say backup paranoia is actually professional. If ransomware hits and you can recover from git history, you're in vastly better shape than someone without version control.

The key insight: git history lets you see exactly what changed, which is often more useful than just having the files back. You can investigate how the compromise happened, which helps you prevent it in the future.

Understanding Backup Overhead vs. Recovery Cost

Here's something people don't think about clearly: the cost of running backups versus the cost of not having backups. When you're thinking about whether to implement a backup system, you're tempted to calculate the overhead—extra disk space, bandwidth for cloud backups, time to maintain the scripts. But you're not calculating the full equation.

Running a three-tier backup system costs you:

  • Daily snapshots: roughly 350MB-1.4GB of disk space (seven rolling backups at 50-200MB each)
  • Weekly cloud backups: 50-200MB of bandwidth per week (one backup upload)
  • Version control: minimal overhead (git is incredibly efficient)
  • Total ongoing time: maybe 30 minutes a month for checking health and testing

Not having a backup system costs you:

  • If something goes wrong: a hard drive failure means you lose everything. Start from scratch. We're talking weeks of work to rebuild your agent's personality, memory, configuration. And that's if everything else on your system is also lost—if you have other critical files, multiply the impact.
  • If it's a ransomware attack: you're either paying the ransom or losing everything. There's no middle ground.
  • If it's accidental deletion: you spend hours or days trying to recover data from unallocated sectors, and you might not succeed.

The math is clear. The overhead of a backup system is negligible compared to the cost of losing everything. And I'm not just talking about financial cost. I'm talking about the months of accumulated learning, the relationships your agent has built with your workflows, the muscle memory you've developed using it.

This is why professional organizations don't debate whether to implement backups. They debate how many tiers, what retention period, and how to test them. The question "should we backup?" is already answered. Of course you should. The only discussion is execution details.

Think of backup overhead like insurance premiums. Yes, you're paying money now to protect against a future cost. And hopefully you never need it. But when you do need it—and someday you will—you'll be incredibly grateful you paid the premium.

Backup Verification: Trust, But Verify

Here's the uncomfortable truth: backups are only useful if they actually work. Untested backups are just wasted storage. I've heard horror stories where someone needed a backup, tried to restore it, and discovered the backups had been corrupted for months. Don't be that person.

Monthly backup test:

bash
#!/bin/bash
# ~/.openclaw/bin/test-backup.sh
 
# Pick a backup at random
BACKUP=$(ls ~/.openclaw/backup/daily/*.tar.gz | shuf | head -1)
TEST_DIR="/tmp/backup-test-$$"
 
echo "Testing backup: $BACKUP"
 
# Extract to temp directory
mkdir -p "$TEST_DIR"
 
# List contents first (quick sanity check)
tar -tzf "$BACKUP" | head -20
 
# Actually extract
tar -xzf "$BACKUP" -C "$TEST_DIR"
 
# Verify critical files exist
for file in .openclaw/workspace/SOUL.md \
            .openclaw/workspace/AGENTS.md \
            .openclaw/openclaw.json \
            .openclaw/memory/index.json; do
    if [ ! -f "$TEST_DIR/$file" ]; then
        echo "ERROR: Missing $file in backup"
        exit 1
    fi
done
 
# Verify JSON validity (config files should be valid JSON)
python3 -m json.tool "$TEST_DIR/.openclaw/openclaw.json" > /dev/null || {
    echo "ERROR: openclaw.json is corrupted"
    exit 1
}
 
# Verify tar file integrity
tar -tzf "$BACKUP" > /dev/null 2>&1 || {
    echo "ERROR: tar file is corrupted"
    exit 1
}
 
# Check file counts match (rough sanity check)
FILE_COUNT=$(tar -tzf "$BACKUP" | wc -l)
echo "Backup contains $FILE_COUNT files"
 
# Clean up
rm -rf "$TEST_DIR"
 
echo "Backup test passed! ✓"

Run this monthly:

bash
# Add to crontab (monthly, on the 1st at 3 AM; standard cron can't
# express "first Sunday of the month" directly)
0 3 1 * * ~/.openclaw/bin/test-backup.sh >> ~/.openclaw/backup/test-log.txt 2>&1

This test does several things:

  1. Picks a random backup (so you're testing different files over time)
  2. Verifies it can be extracted
  3. Checks that critical files exist
  4. Validates JSON integrity (catches corrupted configs early)
  5. Logs results so you have an audit trail

If the test fails, you know immediately. Fix it then, not when you actually need the backup. This is the difference between paranoia and professionalism. Professionals test their backup systems. Everyone else discovers their backups don't work at the worst possible time.

Monitoring: Know When Things Break

You can't recover from a backup you didn't know existed. Set up monitoring so you're alerted if backups stop working:

bash
#!/bin/bash
# ~/.openclaw/bin/backup-health-check.sh
 
BACKUP_DIR="$HOME/.openclaw/backup/daily"
LATEST_BACKUP=$(ls -t "$BACKUP_DIR"/*.tar.gz 2>/dev/null | head -1)
 
# Check if backup exists; bail out early, since every check below needs one
if [ -z "$LATEST_BACKUP" ]; then
    echo "ERROR: No backups found!"
    # Send alert
    exit 1
fi
 
# Check how old the latest backup is (BSD/macOS stat syntax first, GNU fallback)
CURRENT_TIME=$(date +%s)
LATEST_TIME=$(stat -f %m "$LATEST_BACKUP" 2>/dev/null || stat -c %Y "$LATEST_BACKUP")
HOURS_OLD=$(( ($CURRENT_TIME - $LATEST_TIME) / 3600 ))
 
# Alert if no backup in 48 hours (more than 1 day late)
if [ $HOURS_OLD -gt 48 ]; then
    echo "WARNING: Last backup is $HOURS_OLD hours old"
    # Send email, Slack notification, etc.
fi
 
# Check disk space (if backups fill the disk, they stop working)
BACKUP_SIZE=$(du -sh "$BACKUP_DIR" | cut -f1)
DISK_FREE=$(df -h "$BACKUP_DIR" | awk 'NR==2 {print $4}')
DISK_PERCENT=$(df "$BACKUP_DIR" | awk 'NR==2 {print $5}' | sed 's/%//')
 
if [ $DISK_PERCENT -gt 80 ]; then
    echo "WARNING: Backup disk is $DISK_PERCENT% full"
fi
 
echo "Backup health: Size=$BACKUP_SIZE, Free=$DISK_FREE, Age=${HOURS_OLD}h"

Add notifications if something's wrong:

bash
# Send Slack notification on failure
backup_alert() {
    local message=$1
    curl -X POST -H 'Content-type: application/json' \
        --data "{\"text\":\"OpenClaw Backup Alert: $message\"}" \
        https://hooks.slack.com/services/YOUR/WEBHOOK/URL
}
 
# Send email
backup_email_alert() {
    local subject=$1
    local message=$2
    echo "$message" | mail -s "$subject" you@example.com
}

Run health checks daily:

bash
# Add to crontab
0 4 * * * ~/.openclaw/bin/backup-health-check.sh >> ~/.openclaw/backup/health.log 2>&1

This way, if backups stop working (because your disk filled up, a cron job failed, or something else broke), you find out within a day. You can fix it before you actually need the backup. This is proactive maintenance: catching problems before they become disasters.

The Complete Backup Automation Stack

Here's everything together—a complete, production-ready backup system:

yaml
# ~/.openclaw/backup/backup-config.yaml
 
backup_tiers:
  daily:
    enabled: true
    schedule: "0 2 * * *" # Every day at 2 AM
    retention_days: 7
    location: ~/.openclaw/backup/daily
    compression: gzip
    includes:
      - ~/.openclaw/workspace
      - ~/.openclaw/memory
      - ~/.openclaw/openclaw.json
      - ~/.openclaw/credentials
    excludes:
      - ~/.openclaw/cache
      - ~/.openclaw/workspace/.git/objects # Don't double-backup git
 
  weekly:
    enabled: true
    schedule: "0 3 * * 0" # Sunday at 3 AM
    retention_weeks: 4
    location: gdrive:OpenClaw-Backups
    method: rclone
    compression: gzip
    requires: daily # Depends on daily backup existing
 
  git:
    enabled: true
    location: ~/.openclaw/workspace
    auto_commit: true
    auto_commit_schedule: "*/30 * * * *" # Every 30 minutes
    auto_push: true
    push_schedule: "30 3 * * *" # Daily at 3:30 AM
 
health_checks:
  enabled: true
  schedule: "0 4 * * *" # Daily at 4 AM
  test_schedule: "0 3 1 * *" # Monthly test, 1st of the month at 3 AM
  notifications:
    - type: slack
      webhook: "${SLACK_WEBHOOK_URL}"
    - type: email
      address: "you@example.com"
    - type: log
      location: ~/.openclaw/backup/health.log

This configuration defines your entire backup strategy. It's declarative: what gets backed up, when, where, and how to verify it works. This is the professional approach—codified, reproducible, auditable.
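If you'd rather not depend on a YAML-driven tool, the same strategy collapses to a handful of crontab lines. Here's a sketch that renders them for review before installing. Note the assumptions: `daily-backup.sh` and `weekly-backup.sh` are placeholder names for your Tier 1 and Tier 2 scripts, and the function name is mine, not part of OpenClaw.

```shell
#!/bin/bash
# Render the crontab lines implied by the backup config as plain text,
# so you can inspect them before piping anything to `crontab`.
render_backup_cron() {
    local bin="$HOME/.openclaw/bin" log="$HOME/.openclaw/backup"
    cat <<EOF
0 2 * * * $bin/daily-backup.sh >> $log/daily.log 2>&1
0 3 * * 0 $bin/weekly-backup.sh >> $log/weekly.log 2>&1
0 4 * * * $bin/backup-health-check.sh >> $log/health.log 2>&1
0 3 1 * * $bin/test-backup.sh >> $log/test-log.txt 2>&1
EOF
}
```

Merge rather than overwrite when installing: `(crontab -l 2>/dev/null; render_backup_cron) | crontab -` preserves any unrelated cron jobs you already have.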

The Recovery Decision Tree

When disaster strikes, ask yourself this:

Is your machine still running?
├─ Yes, file corruption
│  └─ Check daily backup (Tier 1)
│     └─ Restore specific file
│     └─ Recovery time: 5 min
│
├─ No, machine won't boot
│  └─ Get new hardware
│     └─ Restore from cloud (Tier 2)
│     └─ Recovery time: 30 min + download
│
├─ File deleted recently
│  └─ Check git history (Tier 3)
│     └─ Revert to known-good commit
│     └─ Recovery time: 1 min
│
└─ Suspected compromise
   └─ Restore from clean backup (Tier 2)
   └─ + Review git for malicious commits
   └─ + Rotate all credentials
   └─ Recovery time: 1-2 hours

Every scenario has an escape hatch. You just need to know where it is before you need it. This decision tree encodes the strategy. When things go wrong, consult this tree and you know exactly what to do. There's no panic, just a clear path to recovery.

The three-tier approach ensures that no matter what goes wrong, you have a path to recovery. And that path is documented, tested, and automated.
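To make the fastest branch concrete, here's a sketch of a Tier 1 single-file restore. It assumes, as the test script earlier does, that snapshots store paths relative to your home directory (e.g. `.openclaw/workspace/SOUL.md`); the `restore_file` helper name is mine, not an OpenClaw command.

```shell
#!/bin/bash
# restore_file BACKUP_TARBALL RELATIVE_PATH DEST_ROOT
# Pull one file out of a snapshot into DEST_ROOT, keeping whatever is
# currently there under a .broken.<timestamp> suffix, just in case.
restore_file() {
    local backup=$1 target=$2 dest_root=$3
    local stage
    stage=$(mktemp -d)

    # Extract only the one member we care about
    tar -xzf "$backup" -C "$stage" "$target" || { rm -rf "$stage"; return 1; }

    # Preserve the (possibly corrupted) current copy
    if [ -f "$dest_root/$target" ]; then
        cp "$dest_root/$target" "$dest_root/$target.broken.$(date +%s)"
    fi

    mkdir -p "$(dirname "$dest_root/$target")"
    mv "$stage/$target" "$dest_root/$target"
    rm -rf "$stage"
    echo "Restored $target from $(basename "$backup")"
}
```

Called as `restore_file "$(ls -t ~/.openclaw/backup/daily/*.tar.gz | head -1)" .openclaw/workspace/SOUL.md "$HOME"`, it replaces only the corrupted file and leaves the rest of your workspace untouched.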

What Not to Back Up

A quick note on what you don't need to back up:

  • Cache files (~/.openclaw/cache/): Regenerates automatically next time you use the agent
  • Downloaded skills: Can be re-downloaded from ClawHub or skill repositories
  • Compiled output: Regenerates on demand when the agent runs
  • Temporary files: By definition temporary—deleting them won't hurt
  • .git/objects in tar backups: Git's packed objects don't need to be duplicated in the tar archive; your git remote (Tier 3) already protects them

Backing these up wastes storage and slows down recovery. Focus on the irreplaceable stuff: workspace, memory, config, and credentials. This is intentional design—every byte matters when you're dealing with backups and recovery.
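In tar terms, the skip list is just a few `--exclude` flags on the snapshot command. A minimal sketch, assuming snapshots are taken relative to your home directory as elsewhere in this article (the `snapshot` helper name is mine):

```shell
#!/bin/bash
# snapshot SRC_ROOT OUT_TARBALL
# Capture .openclaw minus the replaceable stuff: cache, git packed
# objects, and temp files.
snapshot() {
    local src_root=$1 out=$2
    tar -czf "$out" \
        --exclude='.openclaw/cache' \
        --exclude='.openclaw/workspace/.git/objects' \
        --exclude='*.tmp' \
        -C "$src_root" \
        .openclaw
}
```

Excluding a directory like `.openclaw/cache` skips it and everything under it, so the archive stays small with no per-file bookkeeping.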

The distinction between what to back up and what to skip is fundamentally about replaceability. If something can be regenerated in minutes or seconds, it's not worth backup space. If it represents months of learning or configuration effort, it's essential. This is why cache is excluded but SOUL.md is included. You can regenerate a cache. You cannot regenerate the accumulated learning and personality of your agent.

This also has practical implications for recovery speed. Every file you back up is a file you have to restore. If you're backing up 500MB of data that includes 400MB of cache that will auto-regenerate anyway, you're wasting precious recovery time. The real critical assets—your workspace, memory, and config—might only be 50MB. That backs up in seconds, restores in seconds. The extra 450MB of cache? That just slows you down when you need to be fast.

The mindset here is worth internalizing: be intentional about what's valuable. Not everything is. When you make that distinction consciously and explicitly, your backup system becomes lean, fast, and efficient. That efficiency translates directly to faster recovery when disaster strikes.

Conclusion: Paranoia is a Feature

Good backup practices aren't paranoid—they're professional. You wouldn't deploy a production service without backups. Your OpenClaw agent is your production service. It's your working memory, your extended cognition, your digital assistant. Losing it is losing productivity and months of accumulated learning.

The three-tier system I've outlined takes maybe two hours to set up once, then runs silently in the background. For that investment, you get:

  • Daily protection against accidents (5 minutes to recover)
  • Weekly protection against hardware failure (30 minutes to recover)
  • Version history for investigating what changed (1 minute to revert)
  • Tested, verified backups that actually work
  • Peace of mind knowing you're protected

Setup is a couple of hours, once. The benefits compound forever. In three months, you'll be so glad you did this.

Do it today. Future you—the one staring at a blank screen after a hard drive failure or ransomware attack—will be incredibly grateful.

Your agent's brain is too valuable to lose.


-iNet
