252 lines
8.3 KiB
Markdown
252 lines
8.3 KiB
Markdown
|
|
# Service Initialization - Quick Reference
|
||
|
|
|
||
|
|
## The Problem You Identified
|
||
|
|
|
||
|
|
**Question**: "We have a migration job that runs Alembic migrations. Why should we also run migrations in the service init process?"
|
||
|
|
|
||
|
|
**Answer**: **You shouldn't!** This is architectural redundancy that should be fixed.
|
||
|
|
|
||
|
|
## Current State (Redundant ❌)
|
||
|
|
|
||
|
|
```
|
||
|
|
┌─────────────────────────────────────────┐
|
||
|
|
│ Kubernetes Deployment Starts │
|
||
|
|
└─────────────────────────────────────────┘
|
||
|
|
↓
|
||
|
|
┌─────────────────────────────────────────┐
|
||
|
|
│ 1. Migration Job Runs │
|
||
|
|
│ - Command: run_migrations.py │
|
||
|
|
│ - Calls: initialize_service_database│
|
||
|
|
│ - Runs: alembic upgrade head │
|
||
|
|
│ - Status: Complete ✓ │
|
||
|
|
└─────────────────────────────────────────┘
|
||
|
|
↓
|
||
|
|
┌─────────────────────────────────────────┐
|
||
|
|
│ 2. Service Pod Starts │
|
||
|
|
│ - Startup: _handle_database_tables()│
|
||
|
|
│ - Calls: initialize_service_database│ ← REDUNDANT!
|
||
|
|
│ - Runs: alembic upgrade head │ ← REDUNDANT!
|
||
|
|
│ - Status: Complete ✓ │
|
||
|
|
└─────────────────────────────────────────┘
|
||
|
|
↓
|
||
|
|
Service Ready (Slower)
|
||
|
|
```
|
||
|
|
|
||
|
|
**Problems**:
|
||
|
|
- ❌ Same code runs twice
|
||
|
|
- ❌ 1-2 seconds slower startup per pod
|
||
|
|
- ❌ Confusion: who is responsible for migrations?
|
||
|
|
- ❌ Race conditions possible with multiple replicas
|
||
|
|
|
||
|
|
## Recommended State (Efficient ✅)
|
||
|
|
|
||
|
|
```
|
||
|
|
┌─────────────────────────────────────────┐
|
||
|
|
│ Kubernetes Deployment Starts │
|
||
|
|
└─────────────────────────────────────────┘
|
||
|
|
↓
|
||
|
|
┌─────────────────────────────────────────┐
|
||
|
|
│ 1. Migration Job Runs │
|
||
|
|
│ - Command: run_migrations.py │
|
||
|
|
│ - Runs: alembic upgrade head │
|
||
|
|
│ - Status: Complete ✓ │
|
||
|
|
└─────────────────────────────────────────┘
|
||
|
|
↓
|
||
|
|
┌─────────────────────────────────────────┐
|
||
|
|
│ 2. Service Pod Starts │
|
||
|
|
│ - Startup: _verify_database_ready() │ ← VERIFY ONLY!
|
||
|
|
│ - Checks: Tables exist? ✓ │
|
||
|
|
│ - Checks: Alembic version? ✓ │
|
||
|
|
│ - NO migration execution │
|
||
|
|
└─────────────────────────────────────────┘
|
||
|
|
↓
|
||
|
|
Service Ready (Faster!)
|
||
|
|
```
|
||
|
|
|
||
|
|
**Benefits**:
|
||
|
|
- ✅ Clear separation of concerns
|
||
|
|
- ✅ 50-80% faster service startup
|
||
|
|
- ✅ No race conditions
|
||
|
|
- ✅ Easier debugging
|
||
|
|
|
||
|
|
## Implementation (3 Simple Changes)
|
||
|
|
|
||
|
|
### 1. Add to `shared/database/init_manager.py`
|
||
|
|
|
||
|
|
```python
|
||
|
|
class DatabaseInitManager:
|
||
|
|
def __init__(
|
||
|
|
self,
|
||
|
|
# ... existing params
|
||
|
|
verify_only: bool = False # ← ADD THIS
|
||
|
|
):
|
||
|
|
self.verify_only = verify_only
|
||
|
|
|
||
|
|
async def initialize_database(self) -> Dict[str, Any]:
|
||
|
|
if self.verify_only:
|
||
|
|
# Only check DB is ready, don't run migrations
|
||
|
|
return await self._verify_database_state()
|
||
|
|
|
||
|
|
# Existing full initialization
|
||
|
|
# ...
|
||
|
|
```
|
||
|
|
|
||
|
|
### 2. Update `shared/service_base.py`
|
||
|
|
|
||
|
|
```python
|
||
|
|
async def _handle_database_tables(self):
|
||
|
|
skip_migrations = os.getenv("SKIP_MIGRATIONS", "false").lower() == "true"
|
||
|
|
|
||
|
|
result = await initialize_service_database(
|
||
|
|
database_manager=self.database_manager,
|
||
|
|
service_name=self.service_name,
|
||
|
|
verify_only=skip_migrations # ← ADD THIS PARAMETER
|
||
|
|
)
|
||
|
|
```
|
||
|
|
|
||
|
|
### 3. Add to Kubernetes Deployments
|
||
|
|
|
||
|
|
```yaml
|
||
|
|
containers:
|
||
|
|
- name: external-service
|
||
|
|
env:
|
||
|
|
- name: SKIP_MIGRATIONS # ← ADD THIS
|
||
|
|
value: "true" # Service only verifies, doesn't run migrations
|
||
|
|
- name: ENVIRONMENT
|
||
|
|
value: "production" # Disable create_all fallback
|
||
|
|
```
|
||
|
|
|
||
|
|
## Quick Decision Matrix
|
||
|
|
|
||
|
|
| Environment | SKIP_MIGRATIONS | ENVIRONMENT | Behavior |
|
||
|
|
|-------------|-----------------|-------------|----------|
|
||
|
|
| **Development** | `false` | `development` | Full check, allow create_all |
|
||
|
|
| **Staging** | `true` | `staging` | Verify only, fail if not ready |
|
||
|
|
| **Production** | `true` | `production` | Verify only, fail if not ready |
|
||
|
|
|
||
|
|
## What Each Component Does
|
||
|
|
|
||
|
|
### Migration Job (runs once on deployment)
|
||
|
|
```
|
||
|
|
✓ Creates tables (if first deployment)
|
||
|
|
✓ Runs pending migrations
|
||
|
|
✓ Updates alembic_version
|
||
|
|
✗ Does NOT start service
|
||
|
|
```
|
||
|
|
|
||
|
|
### Service Startup (runs on every pod)
|
||
|
|
**With SKIP_MIGRATIONS=false** (current):
|
||
|
|
```
|
||
|
|
✓ Checks database connection
|
||
|
|
✓ Checks for migrations
|
||
|
|
✓ Runs alembic upgrade head ← REDUNDANT
|
||
|
|
✓ Starts service
|
||
|
|
Time: ~3-5 seconds
|
||
|
|
```
|
||
|
|
|
||
|
|
**With SKIP_MIGRATIONS=true** (recommended):
|
||
|
|
```
|
||
|
|
✓ Checks database connection
|
||
|
|
✓ Verifies tables exist
|
||
|
|
✓ Verifies alembic_version exists
|
||
|
|
✗ Does NOT run migrations
|
||
|
|
✓ Starts service
|
||
|
|
Time: ~1-2 seconds ← 50-60% FASTER
|
||
|
|
```
|
||
|
|
|
||
|
|
## Testing the Change
|
||
|
|
|
||
|
|
### Before (Current Behavior):
|
||
|
|
```bash
|
||
|
|
# Check service logs
|
||
|
|
kubectl logs -n bakery-ia deployment/external-service | grep -i migration
|
||
|
|
|
||
|
|
# Output shows:
|
||
|
|
[info] Running pending migrations service=external
|
||
|
|
INFO [alembic.runtime.migration] Context impl PostgresqlImpl.
|
||
|
|
[info] Migrations applied successfully service=external
|
||
|
|
```
|
||
|
|
|
||
|
|
### After (With SKIP_MIGRATIONS=true):
|
||
|
|
```bash
|
||
|
|
# Check service logs
|
||
|
|
kubectl logs -n bakery-ia deployment/external-service | grep -i migration
|
||
|
|
|
||
|
|
# Output shows:
|
||
|
|
[info] Migration skip enabled - verifying database only
|
||
|
|
[info] Database verified successfully
|
||
|
|
```
|
||
|
|
|
||
|
|
## Rollout Strategy
|
||
|
|
|
||
|
|
### Step 1: Development (Test)
|
||
|
|
```bash
|
||
|
|
# In local development, test the change:
|
||
|
|
export SKIP_MIGRATIONS=true
|
||
|
|
# Start service - should verify DB and start fast
|
||
|
|
```
|
||
|
|
|
||
|
|
### Step 2: Staging (Validate)
|
||
|
|
```yaml
|
||
|
|
# Update staging manifests
|
||
|
|
env:
|
||
|
|
- name: SKIP_MIGRATIONS
|
||
|
|
value: "true"
|
||
|
|
```
|
||
|
|
|
||
|
|
### Step 3: Production (Deploy)
|
||
|
|
```yaml
|
||
|
|
# Update production manifests
|
||
|
|
env:
|
||
|
|
- name: SKIP_MIGRATIONS
|
||
|
|
value: "true"
|
||
|
|
- name: ENVIRONMENT
|
||
|
|
value: "production"
|
||
|
|
```
|
||
|
|
|
||
|
|
## Expected Results
|
||
|
|
|
||
|
|
### Performance:
|
||
|
|
- 📊 **Service startup**: 3-5s → 1-2s (50-60% faster)
|
||
|
|
- 📊 **Horizontal scaling**: Immediate (no migration check delay)
|
||
|
|
- 📊 **Database load**: Reduced (no redundant migration queries)
|
||
|
|
|
||
|
|
### Reliability:
|
||
|
|
- 🛡️ **No race conditions**: Only job handles migrations
|
||
|
|
- 🛡️ **Clear errors**: "DB not ready" vs "migration failed"
|
||
|
|
- 🛡️ **Fail-fast**: Services won't start if DB not initialized
|
||
|
|
|
||
|
|
### Maintainability:
|
||
|
|
- 📝 **Clear logs**: Migration job logs separate from service logs
|
||
|
|
- 📝 **Easier debugging**: Check job for migration issues
|
||
|
|
- 📝 **Clean architecture**: Operations separated from application
|
||
|
|
|
||
|
|
## FAQs
|
||
|
|
|
||
|
|
**Q: What if migrations fail in the job?**
|
||
|
|
A: Service pods won't start (they'll fail verification), which is correct behavior.
|
||
|
|
|
||
|
|
**Q: What about development where I want fast iteration?**
|
||
|
|
A: Keep `SKIP_MIGRATIONS=false` in development. Services will still run migrations.
|
||
|
|
|
||
|
|
**Q: Is this backwards compatible?**
|
||
|
|
A: Yes! Default behavior is unchanged. SKIP_MIGRATIONS only activates when explicitly set.
|
||
|
|
|
||
|
|
**Q: What about database schema drift?**
|
||
|
|
A: Services verify schema on startup (check alembic_version). If drift detected, startup fails.
|
||
|
|
|
||
|
|
**Q: Can I still use create_all() in development?**
|
||
|
|
A: Yes! Set `ENVIRONMENT=development` and `SKIP_MIGRATIONS=false`.
|
||
|
|
|
||
|
|
## Summary
|
||
|
|
|
||
|
|
**Your Question**: Why run migrations in both job and service?
|
||
|
|
|
||
|
|
**Answer**: You shouldn't! This is redundant architecture.
|
||
|
|
|
||
|
|
**Solution**: Add `SKIP_MIGRATIONS=true` to service deployments.
|
||
|
|
|
||
|
|
**Result**: Faster, clearer, more reliable service initialization.
|
||
|
|
|
||
|
|
**See Full Details**: `SERVICE_INITIALIZATION_ARCHITECTURE.md`
|