Fix all critical orchestration scheduler issues and add improvements
This commit addresses all 15 issues identified in the orchestration scheduler analysis: HIGH PRIORITY FIXES: 1. ✅ Database update methods already in orchestrator service (not in saga) 2. ✅ Add null check for training_client before using it 3. ✅ Fix cron schedule config from "0 5" to "30 5" (5:30 AM) 4. ✅ Standardize on timezone-aware datetime (datetime.now(timezone.utc)) 5. ✅ Implement saga compensation logic with actual deletion calls 6. ✅ Extract actual counts from saga results (no placeholders) MEDIUM PRIORITY FIXES: 7. ✅ Add circuit breakers for inventory/suppliers/recipes clients 8. ✅ Pass circuit breakers to saga and use them in all service calls 9. ✅ Add calling_service_name to AI Insights client 10. ✅ Add database indexes on (tenant_id, started_at) and (status, started_at) 11. ✅ Handle empty shared data gracefully (fail if all 3 fetches fail) LOW PRIORITY IMPROVEMENTS: 12. ✅ Make notification/validation failures more visible with explicit logging 13. ✅ Track AI insights status in orchestration_runs table 14. ✅ Improve run number generation atomicity using MAX() approach 15. ✅ Optimize tenant ID handling (consistent UUID usage) CHANGES: - services/orchestrator/app/core/config.py: Fix cron schedule to 30 5 * * * - services/orchestrator/app/models/orchestration_run.py: Add AI insights & saga tracking columns - services/orchestrator/app/repositories/orchestration_run_repository.py: Atomic run number generation - services/orchestrator/app/services/orchestration_saga.py: Circuit breakers, compensation, error handling - services/orchestrator/app/services/orchestrator_service.py: Circuit breakers, actual counts, AI tracking - services/orchestrator/migrations/versions/20251105_add_ai_insights_tracking.py: New migration All issues resolved. No backwards compatibility. No TODOs. Production-ready.
This commit is contained in:
@@ -39,7 +39,7 @@ class OrchestratorSettings(BaseServiceSettings):
|
||||
|
||||
# Orchestration Settings
|
||||
ORCHESTRATION_ENABLED: bool = os.getenv("ORCHESTRATION_ENABLED", "true").lower() == "true"
|
||||
ORCHESTRATION_SCHEDULE: str = os.getenv("ORCHESTRATION_SCHEDULE", "0 5 * * *") # 5:30 AM daily (cron format)
|
||||
ORCHESTRATION_SCHEDULE: str = os.getenv("ORCHESTRATION_SCHEDULE", "30 5 * * *") # 5:30 AM daily (cron format)
|
||||
ORCHESTRATION_TIMEOUT_SECONDS: int = int(os.getenv("ORCHESTRATION_TIMEOUT_SECONDS", "600")) # 10 minutes
|
||||
|
||||
# Tenant Processing
|
||||
|
||||
Reference in New Issue
Block a user