# Docker Maintenance Guide for Local Development

## The Problem

When developing with Tilt and local Kubernetes (Kind), Docker accumulates:

- **Multiple image versions** from each code change (Tilt rebuilds)
- **Unused volumes** from previous cluster runs
- **Build cache** that grows over time

This quickly fills up disk space, causing pods to fail with "No space left on device" errors.

## Quick Fix (When You Hit Disk Issues)

```bash
# Clean up all unused Docker resources
docker system prune -a --volumes -f
```

This removes:

- All stopped containers and unused networks
- All unused images
- All unused volumes
- All build cache

**Expected recovery**: 60-100GB

## Regular Maintenance

### Option 1: Use the Cleanup Script (Recommended)

Run the maintenance script weekly:

```bash
./scripts/cleanup-docker.sh
```

Or run it automatically without confirmation:

```bash
./scripts/cleanup-docker.sh --auto
```

### Option 2: Manual Commands

```bash
# Remove unused images created more than 24 hours ago
docker image prune -af --filter "until=24h"

# Remove unused volumes
docker volume prune -f

# Remove build cache
docker builder prune -af
```

### Option 3: Set Up Automated Cleanup

Add to your crontab (runs every Sunday at 2 AM; adjust the path to your own checkout):

```bash
crontab -e

# Add this line:
0 2 * * 0 /Users/urtzialfaro/Documents/bakery-ia/scripts/cleanup-docker.sh --auto >> /tmp/docker-cleanup.log 2>&1
```

## Monitoring Disk Usage

### Check Docker disk usage:

```bash
docker system df
```

### Check Kind node disk usage:

```bash
docker exec bakery-ia-local-control-plane df -h /var
```

### Alert thresholds:

- **< 70%**: Healthy ✅
- **70-85%**: Consider cleanup soon ⚠️
- **85-95%**: Run cleanup immediately 🚨
- **> 95%**: Critical - pods will fail ❌

## Prevention Tips

1. **Run cleanup weekly** to prevent accumulation
2. **Monitor disk usage** before long dev sessions
3. **Delete old Kind clusters** when switching projects:
   ```bash
   kind delete cluster --name bakery-ia-local
   ```
4. **Increase Docker disk allocation** in Docker Desktop settings if you frequently rebuild many services

## Troubleshooting

### Pods in CrashLoopBackOff after disk issues:

1. Run cleanup (see Quick Fix above)
2. Restart failed pods:
   ```bash
   kubectl get pods -n bakery-ia | grep -E "(CrashLoopBackOff|Error)" | awk '{print $1}' | xargs kubectl delete pod -n bakery-ia
   ```

### Cleanup didn't free enough space:

If the disk is still above 90% after cleanup:

```bash
# Nuclear option - rebuild everything
kind delete cluster --name bakery-ia-local
docker system prune -a --volumes -f
# Then recreate cluster with your setup scripts
```

## What Happened Today (2026-01-12)

- **Issue**: Disk was 100% full (113GB/113GB), causing database pods to crash
- **Root cause**: 122 unused Docker images + 16 unused volumes + 6GB build cache
- **Solution**: Ran `docker system prune -a --volumes -f`
- **Result**: Freed 89GB, disk now at 22% usage (24GB/113GB)
- **All services recovered successfully**
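
As a rough illustration of the alert thresholds listed under Monitoring Disk Usage, the sketch below maps a usage percentage to one of those levels. The function names `classify_usage` and `node_usage_percent` and the level labels are hypothetical and are not part of `scripts/cleanup-docker.sh`. It assumes the Kind node container is named `bakery-ia-local-control-plane`, as in the commands above, and treats exactly 85% and 95% as belonging to the lower band.

```bash
#!/usr/bin/env bash
# Sketch only: classify a disk-usage percentage using this guide's thresholds.
# Function names and labels are illustrative assumptions.

classify_usage() {
  local pct="$1"
  if [ "$pct" -gt 95 ]; then
    echo "critical"        # > 95%: pods will fail
  elif [ "$pct" -gt 85 ]; then
    echo "cleanup-now"     # above 85%: run cleanup immediately
  elif [ "$pct" -ge 70 ]; then
    echo "cleanup-soon"    # 70-85%: consider cleanup soon
  else
    echo "healthy"         # < 70%
  fi
}

# Print the usage percentage of /var inside the Kind node, without the % sign.
# df -P forces the one-line-per-filesystem POSIX layout, so field 5 is the capacity.
node_usage_percent() {
  docker exec bakery-ia-local-control-plane df -P /var |
    awk 'NR==2 { gsub("%", "", $5); print $5 }'
}
```

With the cluster running, `classify_usage "$(node_usage_percent)"` prints the level for the node. For the 22% usage recorded after the cleanup above, `classify_usage 22` prints `healthy`.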