bakery-ia/ARCHITECTURE_QUICK_REFERENCE.md
2025-10-01 12:17:59 +02:00

Service Initialization - Quick Reference

The Problem You Identified

Question: "We have a migration job that runs Alembic migrations. Why should we also run migrations in the service init process?"

Answer: You shouldn't! This is architectural redundancy that should be fixed.

Current State (Redundant)

┌─────────────────────────────────────────┐
│  Kubernetes Deployment Starts           │
└─────────────────────────────────────────┘
                  ↓
┌─────────────────────────────────────────┐
│  1. Migration Job Runs                  │
│     - Command: run_migrations.py        │
│     - Calls: initialize_service_database│
│     - Runs: alembic upgrade head        │
│     - Status: Complete ✓                │
└─────────────────────────────────────────┘
                  ↓
┌─────────────────────────────────────────┐
│  2. Service Pod Starts                  │
│     - Startup: _handle_database_tables()│
│     - Calls: initialize_service_database│ ← REDUNDANT!
│     - Runs: alembic upgrade head        │ ← REDUNDANT!
│     - Status: Complete ✓                │
└─────────────────────────────────────────┘
                  ↓
         Service Ready (Slower)

Problems:

  • Same code runs twice
  • 1-2 seconds slower startup per pod
  • Confusion: who is responsible for migrations?
  • Race conditions possible with multiple replicas

Recommended State (Verify Only)

┌─────────────────────────────────────────┐
│  Kubernetes Deployment Starts           │
└─────────────────────────────────────────┘
                  ↓
┌─────────────────────────────────────────┐
│  1. Migration Job Runs                  │
│     - Command: run_migrations.py        │
│     - Runs: alembic upgrade head        │
│     - Status: Complete ✓                │
└─────────────────────────────────────────┘
                  ↓
┌─────────────────────────────────────────┐
│  2. Service Pod Starts                  │
│     - Startup: _verify_database_ready() │ ← VERIFY ONLY!
│     - Checks: Tables exist? ✓           │
│     - Checks: Alembic version? ✓        │
│     - NO migration execution            │
└─────────────────────────────────────────┘
                  ↓
         Service Ready (Faster!)

Benefits:

  • Clear separation of concerns
  • Roughly 50-60% faster service startup (about 3-5 s down to 1-2 s)
  • No race conditions
  • Easier debugging

Implementation (3 Simple Changes)

1. Add to shared/database/init_manager.py

from typing import Any, Dict  # needed for the return annotation below

class DatabaseInitManager:
    def __init__(
        self,
        # ... existing params
        verify_only: bool = False  # ← ADD THIS
    ):
        self.verify_only = verify_only

    async def initialize_database(self) -> Dict[str, Any]:
        if self.verify_only:
            # Only check DB is ready, don't run migrations
            return await self._verify_database_state()

        # Existing full initialization
        # ...
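
The body of `_verify_database_state()` is not shown here. As a rough sketch, assuming the check only needs to confirm that the expected tables exist and that `alembic_version` holds a revision, it could look like the following. The standard-library `sqlite3` module is used purely so the example runs standalone; the real service talks to PostgreSQL through its own database manager. The function name, the `expected_tables` parameter, the `weather_data` table, and the returned keys are all illustrative assumptions.

```python
import sqlite3
from typing import Any, Dict, Iterable


def verify_database_state(
    conn: sqlite3.Connection, expected_tables: Iterable[str]
) -> Dict[str, Any]:
    """Read-only readiness check: never creates tables or runs migrations."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    existing = {name for (name,) in rows}

    # Tables the service needs but the migration job has not created yet.
    missing = sorted(set(expected_tables) - existing)

    # Alembic records the applied revision in alembic_version.version_num.
    revision = None
    if "alembic_version" in existing:
        row = conn.execute("SELECT version_num FROM alembic_version").fetchone()
        revision = row[0] if row else None

    return {
        "ready": not missing and revision is not None,
        "missing_tables": missing,
        "alembic_revision": revision,
    }


def demo() -> Dict[str, Any]:
    """Simulate a database that the migration job has already prepared."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE alembic_version (version_num VARCHAR(32) NOT NULL)")
    conn.execute("INSERT INTO alembic_version VALUES ('abc123')")
    conn.execute("CREATE TABLE weather_data (id INTEGER PRIMARY KEY)")
    return verify_database_state(conn, ["weather_data"])
```

The point of the design is that this path issues only `SELECT` statements, so any number of replicas can run it concurrently without stepping on each other.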

2. Update shared/service_base.py

import os  # at module top, needed for os.getenv below

async def _handle_database_tables(self):
    skip_migrations = os.getenv("SKIP_MIGRATIONS", "false").lower() == "true"

    result = await initialize_service_database(
        database_manager=self.database_manager,
        service_name=self.service_name,
        verify_only=skip_migrations  # ← ADD THIS PARAMETER
    )

3. Add to Kubernetes Deployments

containers:
- name: external-service
  env:
  - name: SKIP_MIGRATIONS  # ← ADD THIS
    value: "true"          # Service only verifies, doesn't run migrations
  - name: ENVIRONMENT
    value: "production"    # Disable create_all fallback

Quick Decision Matrix

| Environment | SKIP_MIGRATIONS | ENVIRONMENT | Behavior |
|-------------|-----------------|-------------|----------|
| Development | false | development | Full check, allow create_all |
| Staging | true | staging | Verify only, fail if not ready |
| Production | true | production | Verify only, fail if not ready |
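
The matrix can also be read as a small pure function of the two environment variables. The sketch below is one possible encoding, not the project's code: `StartupPolicy` and `resolve_startup_policy` are hypothetical names, and the handling of combinations the matrix does not list (for example `SKIP_MIGRATIONS=true` with `ENVIRONMENT=development`) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StartupPolicy:
    verify_only: bool       # service only verifies the DB, never migrates
    allow_create_all: bool  # create_all() fallback is permitted


def resolve_startup_policy(skip_migrations: str, environment: str) -> StartupPolicy:
    """Map the raw env-var strings onto the behavior described in the matrix."""
    # Same comparison as the service_base.py snippet: only "true" enables skip.
    skip = skip_migrations.strip().lower() == "true"
    is_dev = environment.strip().lower() == "development"
    # Assumption: verify-only mode never falls back to create_all(),
    # even if ENVIRONMENT happens to be "development".
    return StartupPolicy(verify_only=skip, allow_create_all=is_dev and not skip)


# The three rows of the matrix:
development = resolve_startup_policy("false", "development")
staging = resolve_startup_policy("true", "staging")
production = resolve_startup_policy("true", "production")
```

Keeping the decision in one function makes it easy to unit-test every combination before changing any manifests.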

What Each Component Does

Migration Job (runs once on deployment)

✓ Creates tables (if first deployment)
✓ Runs pending migrations
✓ Updates alembic_version
✗ Does NOT start service

Service Startup (runs on every pod)

With SKIP_MIGRATIONS=false (current):

✓ Checks database connection
✓ Checks for pending migrations
✓ Runs alembic upgrade head ← REDUNDANT
✓ Starts service
Time: ~3-5 seconds

With SKIP_MIGRATIONS=true (recommended):

✓ Checks database connection
✓ Verifies tables exist
✓ Verifies alembic_version exists
✗ Does NOT run migrations
✓ Starts service
Time: ~1-2 seconds ← 50-60% FASTER

Testing the Change

Before (Current Behavior):

# Check service logs
kubectl logs -n bakery-ia deployment/external-service | grep -i migration

# Output shows:
[info] Running pending migrations service=external
INFO [alembic.runtime.migration] Context impl PostgresqlImpl.
[info] Migrations applied successfully service=external

After (With SKIP_MIGRATIONS=true):

# Check service logs
kubectl logs -n bakery-ia deployment/external-service | grep -i migration

# Output shows:
[info] Migration skip enabled - verifying database only
[info] Database verified successfully
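
If the log check needs to be automated, for example in a rollout script, a minimal sketch might classify the captured log text by the messages shown above. The function name and the returned labels are made up for illustration, and the matched substrings are taken from the sample output in this section, so they would need adjusting if the real log wording differs.

```python
def classify_startup_logs(log_text: str) -> str:
    """Return which startup path the logs indicate, based on known messages."""
    lowered = log_text.lower()
    # Messages from the "Before" output: the service ran migrations itself.
    if (
        "running pending migrations" in lowered
        or "migrations applied successfully" in lowered
    ):
        return "ran_migrations"
    # Message from the "After" output: verification only.
    if "verifying database only" in lowered:
        return "verify_only"
    return "unknown"


before_logs = """[info] Running pending migrations service=external
INFO [alembic.runtime.migration] Context impl PostgresqlImpl.
[info] Migrations applied successfully service=external"""

after_logs = """[info] Migration skip enabled - verifying database only
[info] Database verified successfully"""

before_result = classify_startup_logs(before_logs)
after_result = classify_startup_logs(after_logs)
```

In practice the text would come from the `kubectl logs` command shown above, for instance by piping its output into a script that reads standard input.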

Rollout Strategy

Step 1: Development (Test)

# In local development, test the change:
export SKIP_MIGRATIONS=true
# Start service - should verify DB and start fast

Step 2: Staging (Validate)

# Update staging manifests
env:
  - name: SKIP_MIGRATIONS
    value: "true"

Step 3: Production (Deploy)

# Update production manifests
env:
  - name: SKIP_MIGRATIONS
    value: "true"
  - name: ENVIRONMENT
    value: "production"

Expected Results

Performance:

  • 📊 Service startup: 3-5s → 1-2s (50-60% faster)
  • 📊 Horizontal scaling: New replicas only verify the database (no migration run on scale-up)
  • 📊 Database load: Reduced (no redundant migration queries)
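
The percentage depends on which ends of the two timing ranges are compared. As a quick worked check of the arithmetic, using only the 3-5 s and 1-2 s figures quoted above (the helper name is illustrative):

```python
def percent_faster(before_s: float, after_s: float) -> float:
    """Reduction in startup time, as a percentage of the original time."""
    return round((before_s - after_s) / before_s * 100, 1)


# Comparing like with like: fastest-to-fastest and slowest-to-slowest.
like_for_like = (percent_faster(3, 1), percent_faster(5, 2))

# Most pessimistic and most optimistic pairings of the two ranges.
extremes = (percent_faster(3, 2), percent_faster(5, 1))
```

Here `like_for_like` evaluates to `(66.7, 60.0)` and `extremes` to `(33.3, 80.0)`, so the quoted percentage is best read as an approximate figure rather than a measured one.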

Reliability:

  • 🛡️ No race conditions: Only job handles migrations
  • 🛡️ Clear errors: "DB not ready" vs "migration failed"
  • 🛡️ Fail-fast: Services won't start if DB not initialized
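
One way to get the "DB not ready" versus "migration failed" distinction is to give each component its own error type. The sketch below is an assumption about how that could be structured, not the project's code: both exception classes and both functions are hypothetical, and `verification` stands for whatever dictionary the verify-only check returns.

```python
from typing import Any, Callable, Dict


class DatabaseNotReadyError(RuntimeError):
    """Hypothetical: raised by a service pod when verification fails."""


class MigrationFailedError(RuntimeError):
    """Hypothetical: raised only by the migration job."""


def run_migration_job(upgrade: Callable[[], None]) -> None:
    """Job side: the only place where a migration error can originate."""
    try:
        upgrade()
    except Exception as exc:
        raise MigrationFailedError(f"migration failed: {exc}") from exc


def start_service(verification: Dict[str, Any]) -> str:
    """Service side: fail fast if the job has not prepared the database."""
    if not verification.get("ready", False):
        missing = ", ".join(verification.get("missing_tables", [])) or "none reported"
        raise DatabaseNotReadyError(
            f"DB not ready (missing tables: {missing}); check the migration job"
        )
    return "started"
```

With this split, an error seen in service logs points at the job's output, while an error in the job's logs points at the migration itself.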

Maintainability:

  • 📝 Clear logs: Migration job logs separate from service logs
  • 📝 Easier debugging: Check job for migration issues
  • 📝 Clean architecture: Operations separated from application

FAQs

Q: What if migrations fail in the job? A: Service pods won't start (they'll fail verification), which is correct behavior.

Q: What about development where I want fast iteration? A: Keep SKIP_MIGRATIONS=false in development. Services will still run migrations.

Q: Is this backwards compatible? A: Yes! SKIP_MIGRATIONS defaults to false, so services keep running migrations unless it is explicitly set to true.

Q: What about database schema drift? A: Services verify the schema on startup by checking alembic_version. If drift is detected, startup fails.

Q: Can I still use create_all() in development? A: Yes! Set ENVIRONMENT=development and SKIP_MIGRATIONS=false.

Summary

Your Question: Why run migrations in both job and service?

Answer: You shouldn't! This is redundant architecture.

Solution: Add SKIP_MIGRATIONS=true to service deployments.

Result: Faster, clearer, more reliable service initialization.

See Full Details: SERVICE_INITIALIZATION_ARCHITECTURE.md