Base Image Caching Solution for Docker Hub Rate Limiting
Overview
This solution is a simple, short-term way to reduce Docker Hub usage by pre-pulling and caching base images. It can be set up in minutes and removes most repeat pulls from builds.
Problem Addressed
- Docker Hub Rate Limiting: 100 pulls per 6 hours per IP address for anonymous users
- Build Failures: Timeouts and authentication errors during CI/CD
- Inconsistent Builds: Different base image versions causing issues
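Docker Hub reports the limit that currently applies to a client in the `ratelimit-limit` and `ratelimit-remaining` response headers of its rate-limit preview endpoint. The following is a minimal sketch for checking them, assuming `curl` and `jq` are installed; the function names are illustrative and not part of the project's scripts.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names) for inspecting the Docker Hub pull limit.

# Extract the pull count from a header value such as "100;w=21600"
# (100 pulls per window of 21600 seconds, i.e. 6 hours).
ratelimit_count() {
  printf '%s\n' "${1%%;*}"
}

# Request an anonymous token, then send a HEAD request to the preview
# repository and print the ratelimit-* headers from the response.
show_hub_ratelimit() {
  token="$(curl -fsS 'https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull' | jq -r .token)"
  curl -fsS --head -H "Authorization: Bearer $token" \
    'https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest' \
    | grep -i '^ratelimit-'
}
```

Running `show_hub_ratelimit` before and after a build gives a rough picture of how many pulls the build consumed.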
Solution Architecture
[Docker Hub] → [Pre-Pull Script] → [Local Cache/Registry] → [Service Builds]
Implementation Options
Option 1: Simple Docker Cache (Easiest)
# Just run the prepull script
./scripts/prepull-base-images.sh
How it works:
- Pulls all base images once with authentication
- Docker caches them locally
- Subsequent builds use cached images
- Reduces Docker Hub pulls by ~90%
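The contents of `scripts/prepull-base-images.sh` are not shown in this document. As a rough sketch, assuming the script simply loops over a list of images and pulls each one, its core might look like this. `prepull_images` and `PULL_CMD` are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of a pre-pull loop; not the project's actual script.

# Command used to pull one image. Override it (e.g. PULL_CMD=echo) for a dry run.
PULL_CMD="${PULL_CMD:-docker pull}"

# Pull every image given as an argument. Prints one line per image and
# returns non-zero if any pull failed.
prepull_images() {
  failed=0
  for img in "$@"; do
    # $PULL_CMD is intentionally unquoted so "docker pull" splits into two words.
    if $PULL_CMD "$img" >/dev/null; then
      echo "cached: $img"
    else
      echo "FAILED: $img" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

A call such as `prepull_images python:3.11-slim postgres:17-alpine redis:7.4-alpine` would then populate the local Docker cache once, after a `docker login`.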
Option 2: Local Registry (More Robust)
# Start local registry
docker run -d -p 5000:5000 --name bakery-registry \
  -v "$(pwd)/registry-data:/var/lib/registry" \
  registry:2
# Run prepull script with local registry enabled
USE_LOCAL_REGISTRY=true ./scripts/prepull-base-images.sh
How it works:
- Runs a local Docker registry
- Pre-pull script pushes images to local registry
- All builds pull from local registry
- Can be shared across team members
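The push step can be sketched as pull, retag, push. This is only an illustration under assumptions: the naming scheme in `local_ref` (dropping any upstream registry host and prefixing the local one) and both function names are hypothetical, and the actual script may name images differently.

```shell
#!/bin/sh
# Hypothetical sketch: mirror one upstream image into a local registry.
REGISTRY="${REGISTRY:-localhost:5000}"

# Map an upstream reference to its name in the local registry.
# If the first path component looks like a registry host (contains "." or ":"),
# it is dropped, so gcr.io/kaniko-project/executor:v1.23.0 becomes
# localhost:5000/kaniko-project/executor:v1.23.0.
local_ref() {
  ref="$1"
  case "$ref" in
    */*)
      first="${ref%%/*}"
      case "$first" in
        *.*|*:*) ref="${ref#*/}" ;;
      esac
      ;;
  esac
  printf '%s/%s\n' "$REGISTRY" "$ref"
}

# Pull from upstream, retag for the local registry, and push.
mirror_image() {
  src="$1"
  dst="$(local_ref "$src")"
  docker pull "$src" && docker tag "$src" "$dst" && docker push "$dst"
}
```

Builds would then reference the local name, for example `FROM localhost:5000/python:3.11-slim`.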
Option 3: Pull-Through Cache (Most Advanced)
# Configure the Docker daemon (/etc/docker/daemon.json on Linux), then restart it
{
  "registry-mirrors": ["http://localhost:5000"],
  "insecure-registries": ["localhost:5000"]
}
# Start registry as pull-through cache
docker run -d -p 5000:5000 --name bakery-registry \
  -v "$(pwd)/registry-data:/var/lib/registry" \
  -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
  registry:2
How it works:
- Local registry acts as transparent cache
- First request pulls from Docker Hub and caches
- Subsequent requests served from cache
- Transparent to builds, but only for Docker Hub images (images from other registries such as gcr.io bypass the mirror)
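After restarting the daemon, `docker info` lists the configured mirrors under a "Registry Mirrors:" heading. The sketch below checks that output for a given mirror URL. `mirror_configured` is a hypothetical helper, and it assumes each mirror entry is printed as a full URL containing `://`.

```shell
#!/bin/sh
# Hypothetical helper: reads `docker info` output on stdin and succeeds
# if the given URL appears in the "Registry Mirrors:" list.
mirror_configured() {
  awk -v url="$1" '
    /Registry Mirrors:/        { in_list = 1; next }
    in_list && !/:\/\//        { in_list = 0 }   # first line without a URL ends the list
    in_list && index($0, url)  { found = 1 }
    END { exit !found }
  '
}
```

For example, `docker info | mirror_configured http://localhost:5000 && echo "mirror active"`.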
Quick Start Guide
1. Simple Caching (5 minutes)
# Make script executable
chmod +x scripts/prepull-base-images.sh
# Run the script
./scripts/prepull-base-images.sh
# Verify images are cached
docker images --format '{{.Repository}}:{{.Tag}}' | grep -E "^(python:3.11-slim|postgres:17-alpine)$"
2. Local Registry (10 minutes)
# Build local registry image
cd scripts/local-registry
docker build -t bakery-registry .
# Start registry
docker run -d -p 5000:5000 --name bakery-registry \
  -v "$(pwd)/registry-data:/var/lib/registry" \
  bakery-registry
# Run prepull with local registry
USE_LOCAL_REGISTRY=true ../prepull-base-images.sh
# Verify registry contents
curl http://localhost:5000/v2/_catalog
3. CI/CD Integration
GitHub Actions Example:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up Docker
        uses: docker/setup-buildx-action@v2
      - name: Login to Docker Hub
        uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PASSWORD }}
      - name: Pre-pull base images
        run: ./scripts/prepull-base-images.sh
      - name: Cache Docker layers
        uses: actions/cache@v3
        with:
          path: /tmp/.buildx-cache
          key: ${{ runner.os }}-buildx-${{ github.sha }}
          restore-keys: |
            ${{ runner.os }}-buildx-
      # Each job starts on a fresh runner, so the build must run in the
      # same job as the checkout and the pre-pull to see the cached images.
      - name: Build services
        run: ./scripts/build-services.sh
Tekton Pipeline Example:
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: prepull-base-images
spec:
  steps:
    # Login and pull run in one step so the credentials written by
    # `docker login` are available to `docker pull`.
    # Assumes a Docker daemon is reachable from this step
    # (for example a sidecar addressed through DOCKER_HOST).
    - name: prepull-images
      image: docker:cli
      env:
        - name: DOCKER_USERNAME
          valueFrom:
            secretKeyRef:
              name: docker-creds
              key: username
        - name: DOCKER_PASSWORD
          valueFrom:
            secretKeyRef:
              name: docker-creds
              key: password
      script: |
        #!/bin/sh
        # docker:cli is Alpine-based and ships sh, not bash, so no arrays here
        set -e
        echo "$DOCKER_PASSWORD" | docker login -u "$DOCKER_USERNAME" --password-stdin
        for img in python:3.11-slim postgres:17-alpine redis:7.4-alpine; do
          echo "Pulling $img..."
          docker pull "$img"
        done
Base Images Covered
The script pre-pulls all base images used in the Bakery-IA project:
Primary Base Images
- python:3.11-slim - Main Python runtime
- postgres:17-alpine - Database init containers
- redis:7.4-alpine - Redis init containers
Utility Images
- busybox:1.36 - Lightweight utility container
- busybox:latest - Latest busybox
- curlimages/curl:latest - Curl utility
- bitnami/kubectl:1.28 - Kubernetes CLI
Build System Images
- alpine:3.18 - Lightweight base
- alpine:3.19 - Latest Alpine
- gcr.io/kaniko-project/executor:v1.23.0 - Kaniko builder
- alpine/git:2.43.0 - Git client
Benefits
Immediate Benefits
- Reduces Docker Hub pulls by roughly 90% - Each base image is pulled only once
- Avoids most rate limiting issues - Pulls are authenticated, and cached images need no pull at all
- Faster builds - Cached images are read from local disk instead of the network
- More reliable CI/CD - Fewer timeout failures caused by Docker Hub
Long-Term Benefits
- Consistent build environments - Same base images for all builds
- Easier debugging - Known image versions
- Better security - Controlled image updates
- Foundation for improvement - Can evolve to pull-through cache
Monitoring and Maintenance
Check Cache Status
# List cached images
docker images
# Check disk usage
docker system df
# Clean up dangling images (avoid `prune -a`: it also removes cached base images not used by a container)
docker image prune
Update Base Images
# Run prepull script monthly to get updates
./scripts/prepull-base-images.sh
# Or create a cron job (runs at 03:00 on the 1st of each month)
0 3 1 * * /path/to/prepull-base-images.sh
Security Considerations
Credential Management
- Store Docker Hub credentials in secrets management system
- Rotate credentials periodically
- Use least-privilege access
Image Verification
# Verify image integrity
docker trust inspect python:3.11-slim
# Scan for vulnerabilities (docker scout replaces the retired docker scan command)
docker scout cves python:3.11-slim
Comparison with Other Solutions
| Solution | Complexity | Docker Hub Usage | Implementation Time | Maintenance |
|---|---|---|---|---|
| This Solution | Low | Very Low | 5-30 minutes | Low |
| GHCR Migration | Medium | None | 1-2 days | Medium |
| Pull-Through Cache | Medium | Very Low | 1 day | Medium |
| Immutable Base Images | High | None | 1-2 weeks | High |
Migration Path
This solution can evolve over time:
Phase 1: Simple caching (Current) → Phase 2: Local registry → Phase 3: Pull-through cache → Phase 4: Immutable base images
Troubleshooting
Common Issues
Issue: Authentication fails
# Solution: Verify credentials
docker login -u your-username
echo "$DOCKER_PASSWORD" | docker login -u "$DOCKER_USERNAME" --password-stdin
Issue: Local registry not accessible
# Solution: Check registry status
docker ps | grep registry
curl http://localhost:5000/v2/
Issue: Images not found in cache
# Solution: Verify images are pulled
docker images --format '{{.Repository}}:{{.Tag}}' | grep -F python:3.11-slim
# If missing, pull manually
docker pull python:3.11-slim
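To check the whole image list at once rather than one image at a time, a small helper can compare the required images against what Docker has cached. This is a sketch only; `missing_images` is a hypothetical name, and it expects one `repository:tag` per line on stdin.

```shell
#!/bin/sh
# Hypothetical helper: print each required image (given as arguments) that is
# absent from the list of cached "repository:tag" names read from stdin.
missing_images() {
  cached="$(cat)"
  for img in "$@"; do
    # -x matches whole lines only, -F treats the image name as a literal string.
    printf '%s\n' "$cached" | grep -qxF "$img" || echo "$img"
  done
}
```

For example, `docker images --format '{{.Repository}}:{{.Tag}}' | missing_images python:3.11-slim postgres:17-alpine redis:7.4-alpine` prints only the images that still need a `docker pull`.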
Conclusion
This simple base image caching solution provides an immediate fix for Docker Hub rate limiting issues while requiring minimal changes to your existing infrastructure. It serves as both a short-term solution and a foundation for more advanced caching strategies in the future.
Recommended Next Steps:
- Implement simple caching first
- Monitor Docker Hub usage reduction
- Consider adding local registry if needed
- Plan for long-term solution (GHCR or immutable base images)