bakery-ia/DOCKER_MAINTENANCE.md
2026-01-12 22:15:11 +01:00

Docker Maintenance Guide for Local Development

The Problem

When developing with Tilt and local Kubernetes (Kind), Docker accumulates:

  • Multiple image versions from each code change (Tilt rebuilds)
  • Unused volumes from previous cluster runs
  • Build cache that grows over time

This quickly fills up disk space, causing pods to fail with "No space left on device" errors.

Quick Fix (When You Hit Disk Issues)

# Clean up all unused Docker resources
docker system prune -a --volumes -f

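This removes:

  • All stopped containers and unused networks
  • All unused images (not only dangling ones, because of -a)
  • All unused volumes (recent Docker releases prune only anonymous volumes here)
  • All build cache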

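Expected recovery: roughly 60-100GB, depending on how much has accumulated (the 2026-01-12 incident described below freed 89GB)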

Regular Maintenance

Option 1: Cleanup Script

Run the maintenance script weekly:

./scripts/cleanup-docker.sh

Or run it non-interactively, skipping the confirmation prompt:

./scripts/cleanup-docker.sh --auto

Option 2: Manual Commands

# Remove unused images created more than 24 hours ago
docker image prune -af --filter "until=24h"

# Remove unused volumes
docker volume prune -f

# Remove build cache
docker builder prune -af
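
Before deleting anything, it can help to preview the commands. The following is a minimal sketch, not the contents of scripts/cleanup-docker.sh (which this guide does not show). The names run_cmd, manual_cleanup, and DRY_RUN are hypothetical and only wrap the three manual commands above.

```shell
# Hypothetical wrapper around the three manual commands above.
# This is an illustration, not the repository's cleanup-docker.sh.

# run_cmd: print the command when DRY_RUN=1, otherwise execute it.
run_cmd() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "[dry-run] $*"
  else
    "$@"
  fi
}

# manual_cleanup: the same three steps, in the same order as above.
manual_cleanup() {
  run_cmd docker image prune -af --filter "until=24h"
  run_cmd docker volume prune -f
  run_cmd docker builder prune -af
}

# Usage:
#   DRY_RUN=1 manual_cleanup   # preview only, nothing is deleted
#   manual_cleanup             # actually run the three commands
```

With DRY_RUN=1 the function only prints what it would run, so it is safe to try on a machine that still has data you care about.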

Option 3: Set Up Automated Cleanup

Add to your crontab (run every Sunday at 2 AM):

crontab -e
# Add this line (replace the path with the location of your own checkout):
0 2 * * 0 /Users/urtzialfaro/Documents/bakery-ia/scripts/cleanup-docker.sh --auto >> /tmp/docker-cleanup.log 2>&1
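
The entry above hard-codes one developer's home directory. As a sketch, a small helper can build the same entry from any checkout path. The function name cron_line is hypothetical; the schedule, script path, and log file are taken from the line above.

```shell
# cron_line: print a crontab entry that runs the cleanup script
# every Sunday at 02:00 (hypothetical helper, not part of the repo).
# $1 = absolute path to the bakery-ia checkout (defaults to the current directory).
cron_line() {
  repo_dir="${1:-$PWD}"
  printf '0 2 * * 0 %s/scripts/cleanup-docker.sh --auto >> /tmp/docker-cleanup.log 2>&1\n' "$repo_dir"
}

# Preview the entry:
#   cron_line "$HOME/Documents/bakery-ia"
# Append it to the existing crontab, keeping current entries:
#   ( crontab -l 2>/dev/null; cron_line "$HOME/Documents/bakery-ia" ) | crontab -
```

This assumes the checkout path contains no spaces or % characters, since cron would need those escaped.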

Monitoring Disk Usage

Check Docker disk usage:

docker system df

Check Kind node disk usage:

docker exec bakery-ia-local-control-plane df -h /var

Alert thresholds:

  • < 70%: Healthy
  • 70-85%: Consider cleanup soon ⚠️
  • > 85%: Run cleanup immediately 🚨
  • > 95%: Critical - pods will fail
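
The thresholds above can be turned into a quick check. This is a minimal sketch: the function names classify_disk_usage and usage_percent and the label strings are hypothetical, and it assumes df inside the Kind node supports the POSIX -P output format.

```shell
# classify_disk_usage: map an integer usage percentage (no % sign)
# to the alert thresholds listed above.
classify_disk_usage() {
  pct="$1"
  if [ "$pct" -gt 95 ]; then
    echo "critical"
  elif [ "$pct" -gt 85 ]; then
    echo "cleanup-now"
  elif [ "$pct" -ge 70 ]; then
    echo "cleanup-soon"
  else
    echo "healthy"
  fi
}

# usage_percent: read `df -P` output on stdin and print the
# capacity column of the first filesystem line, without the % sign.
usage_percent() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Example against the Kind node (requires the cluster to be running):
#   pct="$(docker exec bakery-ia-local-control-plane df -P /var | usage_percent)"
#   classify_disk_usage "$pct"
```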

Prevention Tips

  1. Run cleanup weekly to prevent accumulation
  2. Monitor disk usage before long dev sessions
  3. Delete old Kind clusters when switching projects:
    kind delete cluster --name bakery-ia-local
    
  4. Increase Docker disk allocation in Docker Desktop settings if you frequently rebuild many services

Troubleshooting

Pods in CrashLoopBackOff after disk issues:

  1. Run cleanup (see Quick Fix above)
  2. Restart failed pods:
    kubectl get pods -n bakery-ia | grep -E "(CrashLoopBackOff|Error)" | awk '{print $1}' | xargs kubectl delete pod -n bakery-ia
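
The one-liner above matches the pattern anywhere in each output line. As an alternative sketch, the filter can be limited to the STATUS column, which also makes it easy to test against saved output. The function name failed_pods is hypothetical, and it assumes the default kubectl get pods column order (NAME, READY, STATUS, ...).

```shell
# failed_pods: read `kubectl get pods` output on stdin and print the names
# of pods whose STATUS column contains CrashLoopBackOff or Error.
# The header row (line 1) is skipped.
failed_pods() {
  awk 'NR > 1 && $3 ~ /CrashLoopBackOff|Error/ { print $1 }'
}

# Example (requires cluster access):
#   kubectl get pods -n bakery-ia | failed_pods | xargs kubectl delete pod -n bakery-ia
```

With GNU xargs, adding -r prevents kubectl delete from running at all when no pod matches.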
    

Cleanup didn't free enough space:

If disk usage is still above 90% after cleanup:

# Nuclear option - rebuild everything
kind delete cluster --name bakery-ia-local
docker system prune -a --volumes -f
# Then recreate cluster with your setup scripts

What Happened Today (2026-01-12)

  • Issue: Disk was 100% full (113GB/113GB), causing database pods to crash
  • Root cause: 122 unused Docker images + 16 unused volumes + 6GB build cache
  • Solution: Ran docker system prune -a --volumes -f
  • Result: Freed 89GB, disk now at 22% usage (24GB/113GB)
  • All services recovered successfully