REFACTOR production scheduler

This commit is contained in:
Urtzi Alfaro
2025-10-09 18:01:24 +02:00
parent 3c689b4f98
commit b420af32c5
13 changed files with 4046 additions and 6 deletions


@@ -0,0 +1,414 @@
# Production Planning Scheduler - Quick Start Guide
**For Developers & DevOps**
---
## 🚀 5-Minute Setup
### Prerequisites
```bash
# Running services (listed as comments so the block stays valid shell)
#   - PostgreSQL (production, orders, tenant databases)
#   - Redis (for forecast caching)
#   - RabbitMQ (for events and leader election)
# Environment variables
PRODUCTION_DATABASE_URL=postgresql://...
ORDERS_DATABASE_URL=postgresql://...
TENANT_DATABASE_URL=postgresql://...
REDIS_URL=redis://localhost:6379/0
RABBITMQ_URL=amqp://guest:guest@localhost:5672/
```
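Before starting the services, it can help to confirm that every variable from the list above is actually set. A minimal sketch, assuming only the variable names shown in the prerequisites; the helper name `missing_env_vars` is hypothetical and not part of the codebase:

```python
import os
from typing import List, Mapping

# Variable names copied from the prerequisites block above.
REQUIRED_VARS = (
    "PRODUCTION_DATABASE_URL",
    "ORDERS_DATABASE_URL",
    "TENANT_DATABASE_URL",
    "REDIS_URL",
    "RABBITMQ_URL",
)


def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or blank in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_env_vars()` with no arguments inspects the current process environment; an empty list means all five variables are present.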
### Run Migrations
```bash
# Add timezone to tenants table
cd services/tenant
alembic upgrade head
# Verify migration
psql $TENANT_DATABASE_URL -c "SELECT id, name, timezone FROM tenants LIMIT 5;"
```
### Start Services
```bash
# Terminal 1 - Production Service (with scheduler)
cd services/production
uvicorn app.main:app --reload --port 8001
# Terminal 2 - Orders Service (with scheduler)
cd services/orders
uvicorn app.main:app --reload --port 8002
# Terminal 3 - Forecasting Service (with caching)
cd services/forecasting
uvicorn app.main:app --reload --port 8003
```
### Test Schedulers
```bash
# Test production scheduler
curl -X POST http://localhost:8001/test/production-scheduler
# Expected output:
# {
#   "message": "Production scheduler test triggered successfully"
# }
# Test procurement scheduler
curl -X POST http://localhost:8002/test/procurement-scheduler
# Expected output:
# {
#   "message": "Procurement scheduler test triggered successfully"
# }
# Check logs
tail -f services/production/logs/production.log | grep "schedule"
tail -f services/orders/logs/orders.log | grep "plan"
```
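The same smoke test can be scripted. The sketch below assumes the endpoints return the JSON shape shown in the expected output above; `trigger_scheduler` and `triggered_ok` are hypothetical helper names, not functions from the services:

```python
import json
import urllib.request


def trigger_scheduler(url: str, timeout: float = 10.0) -> str:
    """POST to a scheduler test endpoint and return the raw response body."""
    request = urllib.request.Request(url, data=b"", method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def triggered_ok(body: str) -> bool:
    """True if `body` is a JSON object whose "message" reports a successful trigger."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    return "triggered successfully" in str(payload.get("message", ""))
```

For example, `triggered_ok(trigger_scheduler("http://localhost:8001/test/production-scheduler"))` should be `True` when the production service is up.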
---
## 📋 Configuration
### Enable Test Mode (Development)
```bash
# Run schedulers every 30 minutes instead of daily
export PRODUCTION_TEST_MODE=true
export PROCUREMENT_TEST_MODE=true
export DEBUG=true
```
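How the services read these flags is not shown in this guide. As an illustration only, a flag like `PRODUCTION_TEST_MODE` could map to a run interval as follows; the accepted truthy spellings and the function name `schedule_interval` are assumptions:

```python
from datetime import timedelta
from typing import Mapping

# Assumption: which spellings count as "enabled" is a guess, not the services' actual rule.
TRUTHY = {"1", "true", "yes", "on"}


def schedule_interval(env: Mapping[str, str], flag: str = "PRODUCTION_TEST_MODE") -> timedelta:
    """Return 30 minutes when the test-mode flag is enabled, otherwise one day."""
    raw = env.get(flag, "").strip().lower()
    return timedelta(minutes=30) if raw in TRUTHY else timedelta(days=1)
```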
### Configure Tenant Timezone
```sql
-- Update tenant timezone
UPDATE tenants SET timezone = 'America/New_York' WHERE id = '{tenant_id}';
-- Verify
SELECT id, name, timezone FROM tenants WHERE id = '{tenant_id}';
```
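A misspelled zone name in the `UPDATE` above would only surface later, when the scheduler tries to use it. A small sketch that validates the name first with the standard-library `zoneinfo` module; it assumes the `timezone` column holds IANA names such as `America/New_York`, and `is_valid_timezone` is a hypothetical helper:

```python
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def is_valid_timezone(name: str) -> bool:
    """True if `name` resolves to a zone in the locally installed tz database."""
    try:
        ZoneInfo(name)
    except (ZoneInfoNotFoundError, ValueError, OSError):
        # Unknown key, malformed key, or an unreadable tz database entry.
        return False
    return True
```

The result depends on the tz database available on the machine (system zoneinfo files or the `tzdata` package), so run the check in the same environment as the services.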
### Check Redis Cache
```bash
# Connect to Redis
redis-cli
# Check forecast cache keys (KEYS scans the whole keyspace and blocks Redis; fine locally, avoid on busy instances)
KEYS forecast:*
# Get cache stats
GET forecast:cache:stats
# Clear cache (if needed). WARNING: FLUSHDB deletes every key in the selected database, not only forecast:* keys
FLUSHDB
```
---
## 🔍 Monitoring
### View Metrics (Prometheus)
```bash
# Production scheduler metrics
curl http://localhost:8001/metrics | grep production_schedules
# Procurement scheduler metrics
curl http://localhost:8002/metrics | grep procurement_plans
# Forecast cache metrics
curl http://localhost:8003/metrics | grep forecast_cache
```
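The `curl | grep` checks above can also be done programmatically. This sketch parses Prometheus text exposition output and keeps the samples whose series name starts with a given prefix; it assumes sample lines carry no trailing timestamp, and `filter_metrics` is a hypothetical helper:

```python
from typing import Dict


def filter_metrics(exposition: str, name_prefix: str) -> Dict[str, float]:
    """Map each sample whose series starts with `name_prefix` to its numeric value."""
    samples: Dict[str, float] = {}
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # Split on the last space so label values containing spaces stay intact.
        series, _, value = line.rpartition(" ")
        if series.startswith(name_prefix):
            samples[series] = float(value)
    return samples
```

Feed it the body returned by `http://localhost:8001/metrics` with the prefix `production_schedules` to get the same lines the `grep` would show, as a dictionary.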
### Key Metrics to Watch
```promql
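# Scheduler success ratio (should be > 0.95)
# rate() of successes alone is a throughput, so divide by the rate over all statuses.
# A 1d window is used because the jobs run daily; a 5m window is usually empty.
sum(rate(production_schedules_generated_total{status="success"}[1d]))
  / sum(rate(production_schedules_generated_total[1d]))
sum(rate(procurement_plans_generated_total{status="success"}[1d]))
  / sum(rate(procurement_plans_generated_total[1d]))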
# Cache hit rate (should be > 70%)
forecast_cache_hit_rate
# 95th-percentile generation time (should be < 60s)
histogram_quantile(0.95,
  rate(production_schedule_generation_duration_seconds_bucket[5m]))
```
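The thresholds above are ratios of counters. A small worked sketch of the arithmetic, using made-up numbers; the function names are hypothetical:

```python
from typing import Optional


def ratio(part: float, total: float) -> Optional[float]:
    """Return part / total, or None when total is zero (no events observed)."""
    return None if total == 0 else part / total


def cache_hit_rate(hits: float, misses: float) -> Optional[float]:
    """Hit rate = hits / (hits + misses)."""
    return ratio(hits, hits + misses)


# Illustrative numbers only: 19 successful runs out of 20, 70 hits and 30 misses.
success_ratio = ratio(19, 20)      # 0.95, right at the 95% target
hit_rate = cache_hit_rate(70, 30)  # 0.7, right at the 70% target
```

Returning `None` for an empty window keeps "no data" distinct from "0% success", which matters for schedulers that run only once a day.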
---
## 🐛 Debugging
### Check Scheduler Status
```python
# Run in an asyncio-aware shell (e.g. `python -m asyncio`), because the snippet uses top-level await
from app.services.production_scheduler_service import ProductionSchedulerService
from app.core.config import settings
scheduler = ProductionSchedulerService(settings)
await scheduler.start()
# Check configured jobs
jobs = scheduler.scheduler.get_jobs()
for job in jobs:
    print(f"{job.name}: next run at {job.next_run_time}")
```
### View Scheduler Logs
```bash
# Production scheduler
kubectl logs -f deployment/production-service | grep -E "scheduler|schedule"
# Procurement scheduler
kubectl logs -f deployment/orders-service | grep -E "scheduler|plan"
# Look for these patterns:
# ✅ "Daily production planning completed"
# ✅ "Production schedule created successfully"
# ❌ "Error processing tenant production"
# ⚠️ "Tenant processing timed out"
```
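When scanning a large log dump, the patterns listed above can be counted automatically. A minimal sketch that classifies lines by those four messages; the level names and helper names are assumptions chosen for illustration:

```python
from typing import Dict, Iterable, Optional

# Message fragments taken from the pattern list above.
LOG_PATTERNS = (
    ("Daily production planning completed", "ok"),
    ("Production schedule created successfully", "ok"),
    ("Error processing tenant production", "error"),
    ("Tenant processing timed out", "warning"),
)


def classify_log_line(line: str) -> Optional[str]:
    """Return "ok", "error" or "warning" for a known message, else None."""
    for needle, level in LOG_PATTERNS:
        if needle in line:
            return level
    return None


def summarize(lines: Iterable[str]) -> Dict[str, int]:
    """Count how many lines fall into each level."""
    counts = {"ok": 0, "error": 0, "warning": 0}
    for line in lines:
        level = classify_log_line(line)
        if level is not None:
            counts[level] += 1
    return counts
```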
### Test Timezone Handling
```python
from shared.utils.timezone_helper import TimezoneHelper
# Get current date in different timezones
madrid_date = TimezoneHelper.get_current_date_in_timezone("Europe/Madrid")
ny_date = TimezoneHelper.get_current_date_in_timezone("America/New_York")
tokyo_date = TimezoneHelper.get_current_date_in_timezone("Asia/Tokyo")
print(f"Madrid: {madrid_date}")
print(f"NY: {ny_date}")
print(f"Tokyo: {tokyo_date}")
# Check if business hours
is_business = TimezoneHelper.is_business_hours(
    timezone_str="Europe/Madrid",
    start_hour=8,
    end_hour=20
)
print(f"Business hours: {is_business}")
```
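To see why the tenant timezone matters, consider one instant viewed from three zones. This standalone sketch does not use `TimezoneHelper`; it uses fixed UTC offsets that are correct for 9 October 2025 (a real implementation would use IANA zones to handle DST), and it assumes business hours are a half-open range `[start_hour, end_hour)`:

```python
from datetime import date, datetime, timedelta, timezone, tzinfo


def local_date(instant: datetime, tz: tzinfo) -> date:
    """Calendar date on which the timezone-aware `instant` falls in `tz`."""
    if instant.tzinfo is None:
        raise ValueError("instant must be timezone-aware")
    return instant.astimezone(tz).date()


def in_business_hours(instant: datetime, tz: tzinfo, start_hour: int = 8, end_hour: int = 20) -> bool:
    """True if the local hour lies in [start_hour, end_hour)."""
    return start_hour <= instant.astimezone(tz).hour < end_hour


# Fixed offsets in effect on 2025-10-09 (illustrative stand-ins for IANA zones).
MADRID = timezone(timedelta(hours=2))     # CEST
NEW_YORK = timezone(timedelta(hours=-4))  # EDT
TOKYO = timezone(timedelta(hours=9))      # JST

instant = datetime(2025, 10, 9, 22, 30, tzinfo=timezone.utc)
madrid_date = local_date(instant, MADRID)  # 2025-10-10 (00:30 local)
ny_date = local_date(instant, NEW_YORK)    # 2025-10-09 (18:30 local)
tokyo_date = local_date(instant, TOKYO)    # 2025-10-10 (07:30 local)
```

At that instant a "today" schedule for a Madrid tenant and one for a New York tenant refer to different calendar dates, which is the case the per-tenant timezone is meant to handle.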
### Test Forecast Cache
```python
from services.forecasting.app.services.forecast_cache import get_forecast_cache_service
from datetime import date
from uuid import UUID
cache = get_forecast_cache_service(redis_url="redis://localhost:6379/0")
# Check if available
print(f"Cache available: {cache.is_available()}")
# Get cache stats
stats = cache.get_cache_stats()
print(f"Cache stats: {stats}")
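# Test cache operation (replace the placeholders with real UUIDs; UUID() raises ValueError on these strings)
tenant_id = UUID("your-tenant-id")
product_id = UUID("your-product-id")
forecast_date = date.today()
# Try to get cached forecast (top-level await needs an asyncio-aware shell, e.g. `python -m asyncio`)
cached = await cache.get_cached_forecast(tenant_id, product_id, forecast_date)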
print(f"Cached forecast: {cached}")
```
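For experimenting without Redis, the lookup pattern can be mimicked in memory. This is a stand-in sketch, not the real `forecast_cache` service: the key layout `forecast:{tenant}:{product}:{date}` is an assumption based only on the `forecast:*` prefix used in this guide, and the class and method names are hypothetical:

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Any, Dict, Optional
from uuid import UUID, uuid4


def cache_key(tenant_id: UUID, product_id: UUID, forecast_date: date) -> str:
    """Build a key under the forecast: prefix (layout assumed for illustration)."""
    return f"forecast:{tenant_id}:{product_id}:{forecast_date.isoformat()}"


@dataclass
class InMemoryForecastCache:
    """Dictionary-backed cache that tracks hits and misses."""
    store: Dict[str, Any] = field(default_factory=dict)
    hits: int = 0
    misses: int = 0

    def get(self, tenant_id: UUID, product_id: UUID, forecast_date: date) -> Optional[Any]:
        key = cache_key(tenant_id, product_id, forecast_date)
        if key in self.store:
            self.hits += 1
            return self.store[key]
        self.misses += 1
        return None

    def set(self, tenant_id: UUID, product_id: UUID, forecast_date: date, forecast: Any) -> None:
        self.store[cache_key(tenant_id, product_id, forecast_date)] = forecast


cache = InMemoryForecastCache()
tenant, product, day = uuid4(), uuid4(), date(2025, 10, 9)
first = cache.get(tenant, product, day)   # miss: nothing stored yet
cache.set(tenant, product, day, {"predicted_demand": 42})
second = cache.get(tenant, product, day)  # hit: returns the stored forecast
```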
---
## 🧪 Testing
### Unit Tests
```bash
# Run scheduler tests
pytest services/production/tests/test_production_scheduler_service.py -v
pytest services/orders/tests/test_procurement_scheduler_service.py -v
# Run cache tests
pytest services/forecasting/tests/test_forecast_cache.py -v
# Run timezone tests
pytest shared/tests/test_timezone_helper.py -v
```
### Integration Tests
```bash
# Run full scheduler integration test
pytest tests/integration/test_scheduler_integration.py -v
# Run cache integration test
pytest tests/integration/test_cache_integration.py -v
# Run plan rejection workflow test
pytest tests/integration/test_plan_rejection_workflow.py -v
```
### Manual End-to-End Test
```bash
# 1. Clear existing schedules/plans
psql $PRODUCTION_DATABASE_URL -c "DELETE FROM production_schedules WHERE schedule_date = CURRENT_DATE;"
psql $ORDERS_DATABASE_URL -c "DELETE FROM procurement_plans WHERE plan_date = CURRENT_DATE;"
# 2. Trigger schedulers
curl -X POST http://localhost:8001/test/production-scheduler
curl -X POST http://localhost:8002/test/procurement-scheduler
# 3. Wait 30 seconds
# 4. Verify schedules/plans created
psql $PRODUCTION_DATABASE_URL -c "SELECT id, schedule_date, status FROM production_schedules WHERE schedule_date = CURRENT_DATE;"
psql $ORDERS_DATABASE_URL -c "SELECT id, plan_date, status FROM procurement_plans WHERE plan_date = CURRENT_DATE;"
# 5. Check cache hit rate
redis-cli GET forecast_cache_hits_total
redis-cli GET forecast_cache_misses_total
```
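Step 3 uses a fixed 30-second wait. When scripting this test, polling until the rows appear is usually more reliable. A generic sketch; `wait_until` is a hypothetical helper and the predicate (for example, a query that checks for today's schedule) is left to the caller:

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool], timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Call `predicate` every `interval` seconds until it is true or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The predicate is always evaluated at least once, so a condition that is already true returns immediately.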
---
## 📚 Common Commands
### Scheduler Management
```bash
# Disable scheduler (maintenance mode)
kubectl set env deployment/production-service SCHEDULER_DISABLED=true
# Re-enable scheduler
kubectl set env deployment/production-service SCHEDULER_DISABLED-
# Check scheduler health
curl http://localhost:8001/health | jq .custom_checks.scheduler_service
# Manually trigger scheduler
curl -X POST http://localhost:8001/test/production-scheduler
```
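The `jq` path in the health check above can be reproduced in Python for scripted checks. The payload shape is inferred only from that path (`custom_checks.scheduler_service`); the type of the value is not documented here, so the sketch returns it as-is. `scheduler_check` is a hypothetical name:

```python
import json
from typing import Any


def scheduler_check(health_body: str) -> Any:
    """Return the value at custom_checks.scheduler_service, or None if absent."""
    payload = json.loads(health_body)
    if not isinstance(payload, dict):
        return None
    checks = payload.get("custom_checks")
    if not isinstance(checks, dict):
        return None
    return checks.get("scheduler_service")
```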
### Cache Management
```bash
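# Replace {tenant_id} and {product_id} with real IDs before running; curl treats literal braces as URL globbing
# View cache stats
curl http://localhost:8003/api/v1/{tenant_id}/forecasting/cache/stats | jq .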
# Clear product cache
curl -X DELETE http://localhost:8003/api/v1/{tenant_id}/forecasting/cache/product/{product_id}
# Clear tenant cache
curl -X DELETE http://localhost:8003/api/v1/{tenant_id}/forecasting/cache
# View cache keys (SCAN-based, so it does not block Redis the way KEYS does)
redis-cli --scan --pattern "forecast:*" | head -20
```
### Database Queries
```sql
-- Check production schedules
SELECT id, schedule_date, status, total_batches, auto_generated
FROM production_schedules
WHERE schedule_date >= CURRENT_DATE - INTERVAL '7 days'
ORDER BY schedule_date DESC;
-- Check procurement plans
SELECT id, plan_date, status, total_requirements, total_estimated_cost
FROM procurement_plans
WHERE plan_date >= CURRENT_DATE - INTERVAL '7 days'
ORDER BY plan_date DESC;
-- Check tenant timezones
SELECT id, name, timezone, city
FROM tenants
WHERE is_active = true
ORDER BY timezone;
-- Check the approval workflow of recently cancelled plans
SELECT id, plan_number, status, approval_workflow
FROM procurement_plans
WHERE status = 'cancelled'
ORDER BY created_at DESC
LIMIT 10;
```
---
## 🔧 Troubleshooting Quick Fixes
### Scheduler Not Running
```bash
# Check if service is running
ps aux | grep uvicorn
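# Check if scheduler initialized
grep "scheduled jobs configured" services/production/logs/production.log
# Restart the production service
# Note: this pkill pattern matches every uvicorn process started as app.main:app, including orders and forecasting
pkill -f "uvicorn app.main:app"
cd services/production
uvicorn app.main:app --reload --port 8001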
```
### Cache Not Working
```bash
# Check Redis connection
redis-cli ping # Should return PONG
# Check Redis keys
redis-cli DBSIZE # Should be greater than 0 once forecasts have been cached
# Restart Redis (if needed)
redis-cli SHUTDOWN
redis-server --daemonize yes
```
### Wrong Timezone
```bash
# Check server timezone (should be UTC)
date
# Check tenant timezone
psql $TENANT_DATABASE_URL -c \
"SELECT timezone FROM tenants WHERE id = '{tenant_id}';"
# Update if wrong
psql $TENANT_DATABASE_URL -c \
"UPDATE tenants SET timezone = 'Europe/Madrid' WHERE id = '{tenant_id}';"
```
---
## 📖 Additional Resources
- **Full Documentation:** [PRODUCTION_PLANNING_SYSTEM.md](./PRODUCTION_PLANNING_SYSTEM.md)
- **Operational Runbook:** [SCHEDULER_RUNBOOK.md](./SCHEDULER_RUNBOOK.md)
- **Implementation Summary:** [IMPLEMENTATION_SUMMARY.md](./IMPLEMENTATION_SUMMARY.md)
- **Code:**
  - Production Scheduler: [`services/production/app/services/production_scheduler_service.py`](../services/production/app/services/production_scheduler_service.py)
  - Procurement Scheduler: [`services/orders/app/services/procurement_scheduler_service.py`](../services/orders/app/services/procurement_scheduler_service.py)
  - Forecast Cache: [`services/forecasting/app/services/forecast_cache.py`](../services/forecasting/app/services/forecast_cache.py)
  - Timezone Helper: [`shared/utils/timezone_helper.py`](../shared/utils/timezone_helper.py)
---
**Version:** 1.0
**Last Updated:** 2025-10-09
**Maintained By:** Backend Team