Orchestration Service Database Schema

Overview

This document describes the database schema for the Orchestration Service, which tracks and manages the execution of orchestration workflows across the bakery system.

Schema Version History

Initial Schema (001_initial_schema)

This is the consolidated initial schema that includes all tables, columns, indexes, and constraints from the original fragmented migrations.

Consolidated from:

  • 20251029_1700_add_orchestration_runs.py - Base orchestration_runs table
  • 20251105_add_ai_insights_tracking.py - AI insights tracking additions

Tables

orchestration_runs

The main audit trail table for orchestration executions. This table tracks the entire lifecycle of an orchestration run, including all workflow steps, results, and performance metrics.

Columns

Primary Identification

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| id | UUID | No | Primary key, auto-generated UUID |
| run_number | VARCHAR(50) | No | Unique human-readable run identifier (indexed, unique) |

Run Details

| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| tenant_id | UUID | No | - | Tenant/organization identifier (indexed) |
| status | ENUM | No | 'pending' | Current run status (indexed) |
| run_type | VARCHAR(50) | No | 'scheduled' | Type of run: scheduled, manual, test (indexed) |
| priority | VARCHAR(20) | No | 'normal' | Run priority: normal, high, critical |
Timing

| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| started_at | TIMESTAMP | No | now() | When the run started (indexed) |
| completed_at | TIMESTAMP | Yes | NULL | When the run completed (indexed) |
| duration_seconds | INTEGER | Yes | NULL | Total duration in seconds |

Step Tracking - Forecasting

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| forecasting_started_at | TIMESTAMP | Yes | When forecasting step started |
| forecasting_completed_at | TIMESTAMP | Yes | When forecasting step completed |
| forecasting_status | VARCHAR(20) | Yes | Status: success, failed, skipped |
| forecasting_error | TEXT | Yes | Error message if failed |
Step Tracking - Production

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| production_started_at | TIMESTAMP | Yes | When production step started |
| production_completed_at | TIMESTAMP | Yes | When production step completed |
| production_status | VARCHAR(20) | Yes | Status: success, failed, skipped |
| production_error | TEXT | Yes | Error message if failed |

Step Tracking - Procurement

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| procurement_started_at | TIMESTAMP | Yes | When procurement step started |
| procurement_completed_at | TIMESTAMP | Yes | When procurement step completed |
| procurement_status | VARCHAR(20) | Yes | Status: success, failed, skipped |
| procurement_error | TEXT | Yes | Error message if failed |

Step Tracking - Notifications

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| notification_started_at | TIMESTAMP | Yes | When notification step started |
| notification_completed_at | TIMESTAMP | Yes | When notification step completed |
| notification_status | VARCHAR(20) | Yes | Status: success, failed, skipped |
| notification_error | TEXT | Yes | Error message if failed |
Step Tracking - AI Insights

| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| ai_insights_started_at | TIMESTAMP | Yes | NULL | When AI insights step started |
| ai_insights_completed_at | TIMESTAMP | Yes | NULL | When AI insights step completed |
| ai_insights_status | VARCHAR(20) | Yes | NULL | Status: success, failed, skipped |
| ai_insights_error | TEXT | Yes | NULL | Error message if failed |
| ai_insights_generated | INTEGER | No | 0 | Number of AI insights generated |
| ai_insights_posted | INTEGER | No | 0 | Number of AI insights posted |

Results Summary

| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| forecasts_generated | INTEGER | No | 0 | Total forecasts generated |
| production_batches_created | INTEGER | No | 0 | Total production batches created |
| procurement_plans_created | INTEGER | No | 0 | Total procurement plans created |
| purchase_orders_created | INTEGER | No | 0 | Total purchase orders created |
| notifications_sent | INTEGER | No | 0 | Total notifications sent |
Data Storage

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| forecast_data | JSONB | Yes | Forecast results for downstream services |
| run_metadata | JSONB | Yes | Additional run metadata |

Error Handling

| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| retry_count | INTEGER | No | 0 | Number of retry attempts |
| max_retries_reached | BOOLEAN | No | false | Whether the maximum number of retries was reached |
| error_message | TEXT | Yes | NULL | General error message |
| error_details | JSONB | Yes | NULL | Detailed error information |

External References

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| forecast_id | UUID | Yes | Reference to forecast record |
| production_schedule_id | UUID | Yes | Reference to production schedule |
| procurement_plan_id | UUID | Yes | Reference to procurement plan |
Saga Tracking

| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| saga_steps_total | INTEGER | No | 0 | Total saga steps planned |
| saga_steps_completed | INTEGER | No | 0 | Saga steps completed |

Audit Fields

| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| created_at | TIMESTAMP | No | now() | Record creation timestamp |
| updated_at | TIMESTAMP | No | now() | Record last update timestamp (auto-updated) |
| triggered_by | VARCHAR(100) | Yes | NULL | Who/what triggered the run (indexed) |

Performance Metrics

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| fulfillment_rate | INTEGER | Yes | Fulfillment rate percentage (0-100, indexed) |
| on_time_delivery_rate | INTEGER | Yes | On-time delivery rate percentage (0-100, indexed) |
| cost_accuracy | INTEGER | Yes | Cost accuracy percentage (0-100, indexed) |
| quality_score | INTEGER | Yes | Quality score (0-100, indexed) |
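
Taken together, the column groups above correspond to DDL roughly like the abbreviated sketch below. This is illustrative only: the table is actually created by the Alembic migration, only a few representative columns per category are shown, the four-column step-tracking pattern repeats for every workflow step, and the UUID default is shown as PostgreSQL's gen_random_uuid() although the migration may instead generate IDs at the application level.

    -- Illustrative, abbreviated DDL (the real table is created by 001_initial_schema.py)
    CREATE TABLE orchestration_runs (
        -- primary identification
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),   -- default may be applied app-side
        run_number VARCHAR(50) NOT NULL UNIQUE,
        -- run details
        tenant_id UUID NOT NULL,
        status orchestrationstatus NOT NULL DEFAULT 'pending',  -- enum type, see Enums below
        run_type VARCHAR(50) NOT NULL DEFAULT 'scheduled',
        priority VARCHAR(20) NOT NULL DEFAULT 'normal',
        -- timing
        started_at TIMESTAMP NOT NULL DEFAULT now(),
        completed_at TIMESTAMP,
        duration_seconds INTEGER,
        -- step tracking (same pattern for production, procurement,
        -- notifications and AI insights)
        forecasting_started_at TIMESTAMP,
        forecasting_completed_at TIMESTAMP,
        forecasting_status VARCHAR(20),
        forecasting_error TEXT,
        -- results, data storage and error handling
        forecasts_generated INTEGER NOT NULL DEFAULT 0,
        forecast_data JSONB,
        run_metadata JSONB,
        retry_count INTEGER NOT NULL DEFAULT 0,
        max_retries_reached BOOLEAN NOT NULL DEFAULT false,
        error_message TEXT,
        error_details JSONB,
        -- audit
        created_at TIMESTAMP NOT NULL DEFAULT now(),
        updated_at TIMESTAMP NOT NULL DEFAULT now()
    );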

Indexes

Single Column Indexes
  • ix_orchestration_runs_run_number - UNIQUE index on run_number for fast lookups
  • ix_orchestration_runs_tenant_id - Tenant filtering
  • ix_orchestration_runs_status - Status filtering
  • ix_orchestration_runs_started_at - Temporal queries
  • ix_orchestration_runs_completed_at - Temporal queries
  • ix_orchestration_runs_run_type - Type filtering
  • ix_orchestration_runs_trigger - Trigger source filtering
Composite Indexes (for common query patterns)
  • ix_orchestration_runs_tenant_status - (tenant_id, status) - Tenant's runs by status
  • ix_orchestration_runs_tenant_type - (tenant_id, run_type) - Tenant's runs by type
  • ix_orchestration_runs_tenant_started - (tenant_id, started_at) - Tenant's runs by date
  • ix_orchestration_runs_status_started - (status, started_at) - Global runs by status and date
Performance Metric Indexes
  • ix_orchestration_runs_fulfillment_rate - Fulfillment rate queries
  • ix_orchestration_runs_on_time_delivery_rate - Delivery performance queries
  • ix_orchestration_runs_cost_accuracy - Cost tracking queries
  • ix_orchestration_runs_quality_score - Quality filtering
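
For reference, two of the indexes above expressed as equivalent SQL (the migration creates them via Alembic; this is shown only to illustrate the naming and column order):

    -- Unique lookup index on the human-readable run identifier
    CREATE UNIQUE INDEX ix_orchestration_runs_run_number
        ON orchestration_runs (run_number);

    -- Composite index backing "tenant's runs by status" queries
    CREATE INDEX ix_orchestration_runs_tenant_status
        ON orchestration_runs (tenant_id, status);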

Enums

orchestrationstatus

Represents the current status of an orchestration run.

Values:

  • pending - Run is queued but not yet started
  • running - Run is currently executing
  • completed - Run completed successfully
  • partial_success - Run completed but some steps failed
  • failed - Run failed to complete
  • cancelled - Run was cancelled
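
The equivalent PostgreSQL type definition (the migration creates this type through SQLAlchemy; the SQL below is shown for reference):

    CREATE TYPE orchestrationstatus AS ENUM (
        'pending',
        'running',
        'completed',
        'partial_success',
        'failed',
        'cancelled'
    );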

Workflow Steps

The orchestration service coordinates multiple workflow steps in sequence:

  1. Forecasting - Generate demand forecasts
  2. Production - Create production schedules
  3. Procurement - Generate procurement plans and purchase orders
  4. Notifications - Send notifications to stakeholders
  5. AI Insights - Generate and post AI-driven insights

Each step tracks:

  • Start/completion timestamps
  • Status (success/failed/skipped)
  • Error messages (if applicable)
  • Step-specific metrics
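
For example, a forecasting step that finishes successfully might be recorded with an update along these lines (illustrative only; the service may apply these changes through its ORM rather than raw SQL, and the count shown is a placeholder):

    -- Mark the forecasting step as completed and record its result
    UPDATE orchestration_runs
    SET forecasting_completed_at = now(),
        forecasting_status = 'success',
        forecasts_generated = 42,          -- placeholder value
        updated_at = now()
    WHERE id = ?;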

Data Flow

Orchestration Run
    ↓
1. Forecasting → forecast_data (JSONB)
    ↓
2. Production → production_schedule_id (UUID)
    ↓
3. Procurement → procurement_plan_id (UUID)
    ↓
4. Notifications → notifications_sent (count)
    ↓
5. AI Insights → ai_insights_posted (count)

Query Patterns

Common Queries

  1. Get active runs for a tenant:

    SELECT * FROM orchestration_runs
    WHERE tenant_id = ? AND status IN ('pending', 'running')
    ORDER BY started_at DESC;
    

    Uses: ix_orchestration_runs_tenant_status

  2. Get run history for a date range:

    SELECT * FROM orchestration_runs
    WHERE tenant_id = ? AND started_at BETWEEN ? AND ?
    ORDER BY started_at DESC;
    

    Uses: ix_orchestration_runs_tenant_started

  3. Get performance metrics summary:

    SELECT AVG(fulfillment_rate), AVG(on_time_delivery_rate),
           AVG(cost_accuracy), AVG(quality_score)
    FROM orchestration_runs
    WHERE tenant_id = ? AND status = 'completed'
      AND started_at > ?;
    

    Uses: ix_orchestration_runs_tenant_started + metric indexes

  4. Find failed runs needing attention:

    SELECT * FROM orchestration_runs
    WHERE status = 'failed' AND retry_count < 3
      AND max_retries_reached = false
    ORDER BY started_at DESC;
    

    Uses: ix_orchestration_runs_status

Migration Notes

Consolidation Changes

The original schema was split across two migrations:

  1. Base table with most fields
  2. AI insights tracking added later

This consolidation:

  • Combines all fields into one initial migration
  • Fixes revision ID inconsistencies
  • Removes duplicate index definitions
  • Organizes fields logically by category
  • Adds comprehensive documentation
  • Improves maintainability

Old Migration Files

The following files are superseded by 001_initial_schema.py:

  • 20251029_1700_add_orchestration_runs.py
  • 20251105_add_ai_insights_tracking.py

Important: If your database was already migrated using the old files, do not apply the new consolidated migration. The consolidated migration is intended for fresh deployments, or for existing databases after the migration history has been reset.

Best Practices

  1. Always set tenant_id - Required for multi-tenant isolation
  2. Use run_number for user-facing displays - More readable than UUID
  3. Track all step timing - Helps identify bottlenecks
  4. Store detailed errors - Use error_details JSONB for structured error data (see the example after this list)
  5. Update metrics in real-time - Keep counts and statuses current
  6. Use saga tracking - Helps monitor overall progress
  7. Leverage indexes - Use composite indexes for multi-column queries
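
Illustrating practices 4 and 5, a failed procurement call might be recorded like this (all values are purely illustrative):

    -- Record a failure with both a human-readable message and structured details
    UPDATE orchestration_runs
    SET status = 'failed',
        procurement_status = 'failed',
        procurement_error = 'Procurement service timeout',
        error_message = 'Procurement service timeout',
        error_details = '{"step": "procurement", "http_status": 504, "attempt": 2}'::jsonb,
        retry_count = retry_count + 1,
        updated_at = now()
    WHERE id = ?;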

Performance Considerations

  • The started_at and completed_at timestamps are indexed for temporal queries
  • Composite indexes optimize common multi-column filters
  • JSONB columns (forecast_data, error_details, run_metadata) allow flexible data storage
  • Performance metric indexes enable fast analytics queries
  • Unique constraint on run_number prevents duplicates

Future Enhancements

Potential schema improvements for future versions:

  • Add foreign key constraints to external references (if services support it)
  • Add partitioning by started_at for very high-volume deployments
  • Add GIN indexes on JSONB columns for complex queries (sketched below)
  • Add materialized views for common analytics queries
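
A GIN index of the kind suggested above might look like the following (hypothetical; not part of the current migration, and the JSON key used in the query is illustrative):

    -- Hypothetical future index to speed up containment queries on run_metadata
    CREATE INDEX ix_orchestration_runs_run_metadata_gin
        ON orchestration_runs USING gin (run_metadata);

    -- Example of a query such an index would accelerate
    SELECT * FROM orchestration_runs
    WHERE run_metadata @> '{"trigger_source": "scheduler"}';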