# Orchestration Service Database Schema
## Overview
This document describes the database schema for the Orchestration Service, which tracks and manages the execution of orchestration workflows across the bakery system.
## Schema Version History
### Initial Schema (001_initial_schema)
This is the consolidated initial schema that includes all tables, columns, indexes, and constraints from the original fragmented migrations.
**Consolidated from:**
- `20251029_1700_add_orchestration_runs.py` - Base orchestration_runs table
- `20251105_add_ai_insights_tracking.py` - AI insights tracking additions
## Tables
### orchestration_runs
The main audit trail table for orchestration executions. This table tracks the entire lifecycle of an orchestration run, including all workflow steps, results, and performance metrics.
#### Columns
##### Primary Identification
| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| `id` | UUID | No | Primary key, auto-generated UUID |
| `run_number` | VARCHAR(50) | No | Human-readable run identifier (unique index) |
##### Run Details
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `tenant_id` | UUID | No | - | Tenant/organization identifier (indexed) |
| `status` | ENUM (`orchestrationstatus`) | No | 'pending' | Current run status; see Enums for allowed values (indexed) |
| `run_type` | VARCHAR(50) | No | 'scheduled' | Type of run: scheduled, manual, test (indexed) |
| `priority` | VARCHAR(20) | No | 'normal' | Run priority: normal, high, critical |
##### Timing
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `started_at` | TIMESTAMP | No | now() | When the run started (indexed) |
| `completed_at` | TIMESTAMP | Yes | NULL | When the run completed (indexed) |
| `duration_seconds` | INTEGER | Yes | NULL | Total duration in seconds |
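
The relationship between the three timing columns can be illustrated with a minimal sketch. It assumes `duration_seconds` is simply the whole-second difference between `completed_at` and `started_at`; the helper name `duration_seconds()` is hypothetical and not taken from the service code.

```python
from datetime import datetime, timezone
from typing import Optional


def duration_seconds(started_at: datetime, completed_at: Optional[datetime]) -> Optional[int]:
    """Return whole seconds between start and completion, or None while the run is open."""
    if completed_at is None:
        return None  # run has not finished, so the column stays NULL
    return int((completed_at - started_at).total_seconds())


# Example values are illustrative only.
start = datetime(2025, 11, 5, 6, 0, 0, tzinfo=timezone.utc)
end = datetime(2025, 11, 5, 6, 2, 30, tzinfo=timezone.utc)

print(duration_seconds(start, end))   # 150
print(duration_seconds(start, None))  # None
```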
##### Step Tracking - Forecasting
| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| `forecasting_started_at` | TIMESTAMP | Yes | When forecasting step started |
| `forecasting_completed_at` | TIMESTAMP | Yes | When forecasting step completed |
| `forecasting_status` | VARCHAR(20) | Yes | Status: success, failed, skipped |
| `forecasting_error` | TEXT | Yes | Error message if failed |
##### Step Tracking - Production
| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| `production_started_at` | TIMESTAMP | Yes | When production step started |
| `production_completed_at` | TIMESTAMP | Yes | When production step completed |
| `production_status` | VARCHAR(20) | Yes | Status: success, failed, skipped |
| `production_error` | TEXT | Yes | Error message if failed |
##### Step Tracking - Procurement
| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| `procurement_started_at` | TIMESTAMP | Yes | When procurement step started |
| `procurement_completed_at` | TIMESTAMP | Yes | When procurement step completed |
| `procurement_status` | VARCHAR(20) | Yes | Status: success, failed, skipped |
| `procurement_error` | TEXT | Yes | Error message if failed |
##### Step Tracking - Notifications
| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| `notification_started_at` | TIMESTAMP | Yes | When notification step started |
| `notification_completed_at` | TIMESTAMP | Yes | When notification step completed |
| `notification_status` | VARCHAR(20) | Yes | Status: success, failed, skipped |
| `notification_error` | TEXT | Yes | Error message if failed |
##### Step Tracking - AI Insights
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `ai_insights_started_at` | TIMESTAMP | Yes | NULL | When AI insights step started |
| `ai_insights_completed_at` | TIMESTAMP | Yes | NULL | When AI insights step completed |
| `ai_insights_status` | VARCHAR(20) | Yes | NULL | Status: success, failed, skipped |
| `ai_insights_error` | TEXT | Yes | NULL | Error message if failed |
| `ai_insights_generated` | INTEGER | No | 0 | Number of AI insights generated |
| `ai_insights_posted` | INTEGER | No | 0 | Number of AI insights posted |
##### Results Summary
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `forecasts_generated` | INTEGER | No | 0 | Total forecasts generated |
| `production_batches_created` | INTEGER | No | 0 | Total production batches created |
| `procurement_plans_created` | INTEGER | No | 0 | Total procurement plans created |
| `purchase_orders_created` | INTEGER | No | 0 | Total purchase orders created |
| `notifications_sent` | INTEGER | No | 0 | Total notifications sent |
##### Data Storage
| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| `forecast_data` | JSONB | Yes | Forecast results for downstream services |
| `run_metadata` | JSONB | Yes | Additional run metadata |
##### Error Handling
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `retry_count` | INTEGER | No | 0 | Number of retry attempts |
| `max_retries_reached` | BOOLEAN | No | false | Whether max retries was reached |
| `error_message` | TEXT | Yes | NULL | General error message |
| `error_details` | JSONB | Yes | NULL | Detailed error information |
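
How these four columns might be kept in step on a failure is sketched below. This is an assumption-laden illustration, not the service's implementation: the `RunErrorState` class, the `record_failure()` helper, and the `MAX_RETRIES = 3` limit (borrowed from the `retry_count < 3` filter under Query Patterns) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

MAX_RETRIES = 3  # assumption: mirrors the `retry_count < 3` filter in Query Patterns


@dataclass
class RunErrorState:
    """In-memory stand-in for the error-handling columns of one run."""
    retry_count: int = 0
    max_retries_reached: bool = False
    error_message: Optional[str] = None
    error_details: Optional[dict] = None


def record_failure(state: RunErrorState, exc: Exception) -> RunErrorState:
    """Update all error columns together after a failed attempt."""
    state.retry_count += 1
    state.max_retries_reached = state.retry_count >= MAX_RETRIES
    state.error_message = str(exc)
    # Structured payload destined for the error_details JSONB column.
    state.error_details = {"type": type(exc).__name__, "attempt": state.retry_count}
    return state


state = RunErrorState()
record_failure(state, TimeoutError("forecasting service timed out"))
print(state.retry_count, state.max_retries_reached)  # 1 False
```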
##### External References
| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| `forecast_id` | UUID | Yes | Reference to forecast record |
| `production_schedule_id` | UUID | Yes | Reference to production schedule |
| `procurement_plan_id` | UUID | Yes | Reference to procurement plan |
##### Saga Tracking
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `saga_steps_total` | INTEGER | No | 0 | Total saga steps planned |
| `saga_steps_completed` | INTEGER | No | 0 | Saga steps completed |
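
One way to turn the two saga counters into a progress figure is sketched below. The `saga_progress()` helper is hypothetical; it assumes progress is the completed share of planned steps and that a run with no planned steps reports 0%.

```python
def saga_progress(steps_total: int, steps_completed: int) -> float:
    """Return saga progress as a percentage in the range 0.0 to 100.0."""
    if steps_total <= 0:
        return 0.0  # nothing planned yet (both columns default to 0)
    # Guard against a completed count that overshoots the plan.
    completed = min(max(steps_completed, 0), steps_total)
    return 100.0 * completed / steps_total


print(saga_progress(5, 2))  # 40.0
print(saga_progress(0, 0))  # 0.0
```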
##### Audit Fields
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `created_at` | TIMESTAMP | No | now() | Record creation timestamp |
| `updated_at` | TIMESTAMP | No | now() | Record last update timestamp (auto-updated) |
| `triggered_by` | VARCHAR(100) | Yes | NULL | Who/what triggered the run (indexed) |
##### Performance Metrics
| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| `fulfillment_rate` | INTEGER | Yes | Fulfillment rate percentage (0-100, indexed) |
| `on_time_delivery_rate` | INTEGER | Yes | On-time delivery rate percentage (0-100, indexed) |
| `cost_accuracy` | INTEGER | Yes | Cost accuracy percentage (0-100, indexed) |
| `quality_score` | INTEGER | Yes | Quality score (0-100, indexed) |
#### Indexes
##### Single Column Indexes
- `ix_orchestration_runs_run_number` - UNIQUE index on run_number for fast lookups
- `ix_orchestration_runs_tenant_id` - Tenant filtering
- `ix_orchestration_runs_status` - Status filtering
- `ix_orchestration_runs_started_at` - Temporal queries
- `ix_orchestration_runs_completed_at` - Temporal queries
- `ix_orchestration_runs_run_type` - Type filtering
- `ix_orchestration_runs_trigger` - Trigger source filtering (on `triggered_by`)
##### Composite Indexes (for common query patterns)
- `ix_orchestration_runs_tenant_status` - (tenant_id, status) - Tenant's runs by status
- `ix_orchestration_runs_tenant_type` - (tenant_id, run_type) - Tenant's runs by type
- `ix_orchestration_runs_tenant_started` - (tenant_id, started_at) - Tenant's runs by date
- `ix_orchestration_runs_status_started` - (status, started_at) - Global runs by status and date
##### Performance Metric Indexes
- `ix_orchestration_runs_fulfillment_rate` - Fulfillment rate queries
- `ix_orchestration_runs_on_time_delivery_rate` - Delivery performance queries
- `ix_orchestration_runs_cost_accuracy` - Cost tracking queries
- `ix_orchestration_runs_quality_score` - Quality filtering
## Enums
### orchestrationstatus
Represents the current status of an orchestration run.
**Values:**
- `pending` - Run is queued but not yet started
- `running` - Run is currently executing
- `completed` - Run completed successfully
- `partial_success` - Run completed but some steps failed
- `failed` - Run failed to complete
- `cancelled` - Run was cancelled
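
On the application side, the enum values could be mirrored as in the sketch below. The class name `OrchestrationStatus` and the `is_active()` helper are assumptions for illustration; only the six string values come from this document.

```python
from enum import Enum


class OrchestrationStatus(str, Enum):
    """String values match the `orchestrationstatus` database enum."""
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    PARTIAL_SUCCESS = "partial_success"
    FAILED = "failed"
    CANCELLED = "cancelled"


# Assumption: "active" means the run has not reached a final state yet.
ACTIVE_STATUSES = frozenset({OrchestrationStatus.PENDING, OrchestrationStatus.RUNNING})


def is_active(status: OrchestrationStatus) -> bool:
    """True for runs that are queued or currently executing."""
    return status in ACTIVE_STATUSES


print(OrchestrationStatus("partial_success").name)  # PARTIAL_SUCCESS
print(is_active(OrchestrationStatus.RUNNING))       # True
```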
## Workflow Steps
The orchestration service coordinates multiple workflow steps in sequence:
1. **Forecasting** - Generate demand forecasts
2. **Production** - Create production schedules
3. **Procurement** - Generate procurement plans and purchase orders
4. **Notifications** - Send notifications to stakeholders
5. **AI Insights** - Generate and post AI-driven insights
Each step tracks:
- Start/completion timestamps
- Status (success/failed/skipped)
- Error messages (if applicable)
- Step-specific metrics
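
The per-step tracking columns follow a regular `<prefix>_<suffix>` naming pattern, which the sketch below makes explicit. The `STEP_PREFIXES` mapping and `step_columns()` helper are hypothetical; the resulting column names are the ones listed in the step tracking tables.

```python
# Workflow step name -> column prefix used in orchestration_runs.
STEP_PREFIXES = {
    "Forecasting": "forecasting",
    "Production": "production",
    "Procurement": "procurement",
    "Notifications": "notification",  # note: the column prefix is singular
    "AI Insights": "ai_insights",
}

# Suffixes shared by every step's tracking columns.
STEP_SUFFIXES = ("started_at", "completed_at", "status", "error")


def step_columns(step: str) -> list:
    """Return the four tracking column names for a workflow step."""
    prefix = STEP_PREFIXES[step]
    return [f"{prefix}_{suffix}" for suffix in STEP_SUFFIXES]


print(step_columns("Notifications"))
```

A helper like this can be handy for building generic "mark step started / finished" updates without repeating column names per step.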
## Data Flow
```
Orchestration Run
1. Forecasting → forecast_data (JSONB)
2. Production → production_schedule_id (UUID)
3. Procurement → procurement_plan_id (UUID)
4. Notifications → notifications_sent (count)
5. AI Insights → ai_insights_posted (count)
```
## Query Patterns
### Common Queries
1. **Get active runs for a tenant:**
```sql
SELECT * FROM orchestration_runs
WHERE tenant_id = ? AND status IN ('pending', 'running')
ORDER BY started_at DESC;
```
*Uses: ix_orchestration_runs_tenant_status*
2. **Get run history for a date range:**
```sql
SELECT * FROM orchestration_runs
WHERE tenant_id = ? AND started_at BETWEEN ? AND ?
ORDER BY started_at DESC;
```
*Uses: ix_orchestration_runs_tenant_started*
3. **Get performance metrics summary:**
```sql
SELECT AVG(fulfillment_rate), AVG(on_time_delivery_rate),
AVG(cost_accuracy), AVG(quality_score)
FROM orchestration_runs
WHERE tenant_id = ? AND status = 'completed'
AND started_at > ?;
```
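*Uses: ix_orchestration_runs_tenant_started or ix_orchestration_runs_tenant_status for filtering; the metric columns are only aggregated here, so their single-column indexes do not help this query*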
4. **Find failed runs needing attention:**
```sql
SELECT * FROM orchestration_runs
WHERE status = 'failed' AND retry_count < 3
AND max_retries_reached = false
ORDER BY started_at DESC;
```
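*Uses: ix_orchestration_runs_status_started (filters on status and returns rows already ordered by started_at); ix_orchestration_runs_status also applies*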
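
To show how the `?` placeholders are bound from application code, here is a small sketch of the first query run through Python's standard `sqlite3` module. Several things are assumptions: an in-memory SQLite database stands in for the real one (the JSONB columns suggest PostgreSQL in production), only a handful of columns are created, the `RUN-00x` run numbers are invented, and `active_runs()` is a hypothetical helper.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# Reduced stand-in for orchestration_runs: only the columns this query touches.
conn.execute(
    """
    CREATE TABLE orchestration_runs (
        id TEXT PRIMARY KEY,
        run_number TEXT NOT NULL UNIQUE,
        tenant_id TEXT NOT NULL,
        status TEXT NOT NULL DEFAULT 'pending',
        started_at TEXT NOT NULL
    )
    """
)
conn.execute(
    "CREATE INDEX ix_orchestration_runs_tenant_status "
    "ON orchestration_runs (tenant_id, status)"
)

tenant = str(uuid.uuid4())
rows = [
    (str(uuid.uuid4()), "RUN-001", tenant, "completed", "2025-11-01T06:00:00"),
    (str(uuid.uuid4()), "RUN-002", tenant, "running", "2025-11-02T06:00:00"),
    (str(uuid.uuid4()), "RUN-003", tenant, "pending", "2025-11-03T06:00:00"),
]
conn.executemany(
    "INSERT INTO orchestration_runs (id, run_number, tenant_id, status, started_at) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)


def active_runs(connection: sqlite3.Connection, tenant_id: str) -> list:
    """Return run numbers of pending/running runs for a tenant, newest first."""
    cursor = connection.execute(
        "SELECT run_number FROM orchestration_runs "
        "WHERE tenant_id = ? AND status IN ('pending', 'running') "
        "ORDER BY started_at DESC",
        (tenant_id,),  # bound parameter, never string-formatted into the SQL
    )
    return [row[0] for row in cursor.fetchall()]


print(active_runs(conn, tenant))  # ['RUN-003', 'RUN-002']
```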
## Migration Notes
### Consolidation Changes
The original schema was split across two migrations:
1. Base table with most fields
2. AI insights tracking added later
This consolidation:
- ✅ Combines all fields into one initial migration
- ✅ Fixes revision ID inconsistencies
- ✅ Removes duplicate index definitions
- ✅ Organizes fields logically by category
- ✅ Adds comprehensive documentation
- ✅ Improves maintainability
### Old Migration Files
The following files are superseded by `001_initial_schema.py`:
- `20251029_1700_add_orchestration_runs.py`
- `20251105_add_ai_insights_tracking.py`
**Important:** If your database was already migrated using the old files, do not apply the consolidated migration. It is intended for fresh deployments, or for existing databases only after the migration history has been reset.
## Best Practices
1. **Always set tenant_id** - Required for multi-tenant isolation
2. **Use run_number for user-facing displays** - More readable than UUID
3. **Track all step timing** - Helps identify bottlenecks
4. **Store detailed errors** - Use error_details JSONB for structured error data
5. **Update metrics in real-time** - Keep counts and statuses current
6. **Use saga tracking** - Helps monitor overall progress
7. **Leverage indexes** - Use composite indexes for multi-column queries
## Performance Considerations
- `started_at` and `completed_at` are indexed for temporal queries; the per-step and audit timestamps are not
- Composite indexes optimize common multi-column filters
- JSONB columns (forecast_data, error_details, run_metadata) allow flexible data storage
- Performance metric indexes enable fast analytics queries
- Unique constraint on run_number prevents duplicates
## Future Enhancements
Potential schema improvements for future versions:
- Add foreign key constraints to external references (if services support it)
- Add partition by started_at for very high-volume deployments
- Add GIN indexes on JSONB columns for complex queries
- Add materialized views for common analytics queries