Orchestration Service Database Schema

Overview

This document describes the database schema for the Orchestration Service, which tracks and manages the execution of orchestration workflows across the bakery system.

Schema Version History

Initial Schema (001_initial_schema)

This is the consolidated initial schema that includes all tables, columns, indexes, and constraints from the original fragmented migrations.

Consolidated from:

  • 20251029_1700_add_orchestration_runs.py - Base orchestration_runs table
  • 20251105_add_ai_insights_tracking.py - AI insights tracking additions

Tables

orchestration_runs

The main audit trail table for orchestration executions. This table tracks the entire lifecycle of an orchestration run, including all workflow steps, results, and performance metrics.

Columns

Primary Identification

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| id | UUID | No | Primary key, auto-generated UUID |
| run_number | VARCHAR(50) | No | Unique human-readable run identifier (indexed, unique) |

Run Details

| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| tenant_id | UUID | No | - | Tenant/organization identifier (indexed) |
| status | ENUM | No | 'pending' | Current run status (indexed) |
| run_type | VARCHAR(50) | No | 'scheduled' | Type of run: scheduled, manual, test (indexed) |
| priority | VARCHAR(20) | No | 'normal' | Run priority: normal, high, critical |
Timing

| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| started_at | TIMESTAMP | No | now() | When the run started (indexed) |
| completed_at | TIMESTAMP | Yes | NULL | When the run completed (indexed) |
| duration_seconds | INTEGER | Yes | NULL | Total duration in seconds |

Step Tracking - Forecasting

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| forecasting_started_at | TIMESTAMP | Yes | When forecasting step started |
| forecasting_completed_at | TIMESTAMP | Yes | When forecasting step completed |
| forecasting_status | VARCHAR(20) | Yes | Status: success, failed, skipped |
| forecasting_error | TEXT | Yes | Error message if failed |
Step Tracking - Production

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| production_started_at | TIMESTAMP | Yes | When production step started |
| production_completed_at | TIMESTAMP | Yes | When production step completed |
| production_status | VARCHAR(20) | Yes | Status: success, failed, skipped |
| production_error | TEXT | Yes | Error message if failed |

Step Tracking - Procurement

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| procurement_started_at | TIMESTAMP | Yes | When procurement step started |
| procurement_completed_at | TIMESTAMP | Yes | When procurement step completed |
| procurement_status | VARCHAR(20) | Yes | Status: success, failed, skipped |
| procurement_error | TEXT | Yes | Error message if failed |

Step Tracking - Notifications

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| notification_started_at | TIMESTAMP | Yes | When notification step started |
| notification_completed_at | TIMESTAMP | Yes | When notification step completed |
| notification_status | VARCHAR(20) | Yes | Status: success, failed, skipped |
| notification_error | TEXT | Yes | Error message if failed |
Step Tracking - AI Insights

| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| ai_insights_started_at | TIMESTAMP | Yes | NULL | When AI insights step started |
| ai_insights_completed_at | TIMESTAMP | Yes | NULL | When AI insights step completed |
| ai_insights_status | VARCHAR(20) | Yes | NULL | Status: success, failed, skipped |
| ai_insights_error | TEXT | Yes | NULL | Error message if failed |
| ai_insights_generated | INTEGER | No | 0 | Number of AI insights generated |
| ai_insights_posted | INTEGER | No | 0 | Number of AI insights posted |

Results Summary

| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| forecasts_generated | INTEGER | No | 0 | Total forecasts generated |
| production_batches_created | INTEGER | No | 0 | Total production batches created |
| procurement_plans_created | INTEGER | No | 0 | Total procurement plans created |
| purchase_orders_created | INTEGER | No | 0 | Total purchase orders created |
| notifications_sent | INTEGER | No | 0 | Total notifications sent |
Data Storage

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| forecast_data | JSONB | Yes | Forecast results for downstream services |
| run_metadata | JSONB | Yes | Additional run metadata |

Error Handling

| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| retry_count | INTEGER | No | 0 | Number of retry attempts |
| max_retries_reached | BOOLEAN | No | false | Whether the maximum number of retries was reached |
| error_message | TEXT | Yes | NULL | General error message |
| error_details | JSONB | Yes | NULL | Detailed error information |

External References

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| forecast_id | UUID | Yes | Reference to forecast record |
| production_schedule_id | UUID | Yes | Reference to production schedule |
| procurement_plan_id | UUID | Yes | Reference to procurement plan |
Saga Tracking

| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| saga_steps_total | INTEGER | No | 0 | Total saga steps planned |
| saga_steps_completed | INTEGER | No | 0 | Saga steps completed |

Audit Fields

| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| created_at | TIMESTAMP | No | now() | Record creation timestamp |
| updated_at | TIMESTAMP | No | now() | Record last update timestamp (auto-updated) |
| triggered_by | VARCHAR(100) | Yes | NULL | Who/what triggered the run (indexed) |

Performance Metrics

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| fulfillment_rate | INTEGER | Yes | Fulfillment rate percentage (0-100, indexed) |
| on_time_delivery_rate | INTEGER | Yes | On-time delivery rate percentage (0-100, indexed) |
| cost_accuracy | INTEGER | Yes | Cost accuracy percentage (0-100, indexed) |
| quality_score | INTEGER | Yes | Quality score (0-100, indexed) |
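
Taken together, the column groups above correspond to DDL roughly like the abbreviated sketch below. This is illustrative only: the table is actually created by the Alembic migration, only a few representative columns per category are shown, the four-column step-tracking pattern repeats for every workflow step, and the UUID default is shown as PostgreSQL's gen_random_uuid() although the migration may instead generate IDs at the application level.

    -- Illustrative, abbreviated DDL (the real table is created by 001_initial_schema.py)
    CREATE TABLE orchestration_runs (
        -- primary identification
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),   -- default may be applied app-side
        run_number VARCHAR(50) NOT NULL UNIQUE,
        -- run details
        tenant_id UUID NOT NULL,
        status orchestrationstatus NOT NULL DEFAULT 'pending',  -- enum type, see Enums below
        run_type VARCHAR(50) NOT NULL DEFAULT 'scheduled',
        priority VARCHAR(20) NOT NULL DEFAULT 'normal',
        -- timing
        started_at TIMESTAMP NOT NULL DEFAULT now(),
        completed_at TIMESTAMP,
        duration_seconds INTEGER,
        -- step tracking (same pattern for production, procurement,
        -- notifications and AI insights)
        forecasting_started_at TIMESTAMP,
        forecasting_completed_at TIMESTAMP,
        forecasting_status VARCHAR(20),
        forecasting_error TEXT,
        -- results, data storage and error handling
        forecasts_generated INTEGER NOT NULL DEFAULT 0,
        forecast_data JSONB,
        run_metadata JSONB,
        retry_count INTEGER NOT NULL DEFAULT 0,
        max_retries_reached BOOLEAN NOT NULL DEFAULT false,
        error_message TEXT,
        error_details JSONB,
        -- audit
        created_at TIMESTAMP NOT NULL DEFAULT now(),
        updated_at TIMESTAMP NOT NULL DEFAULT now()
    );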

Indexes

Single Column Indexes
  • ix_orchestration_runs_run_number - UNIQUE index on run_number for fast lookups
  • ix_orchestration_runs_tenant_id - Tenant filtering
  • ix_orchestration_runs_status - Status filtering
  • ix_orchestration_runs_started_at - Temporal queries
  • ix_orchestration_runs_completed_at - Temporal queries
  • ix_orchestration_runs_run_type - Type filtering
  • ix_orchestration_runs_trigger - Trigger source filtering
Composite Indexes (for common query patterns)
  • ix_orchestration_runs_tenant_status - (tenant_id, status) - Tenant's runs by status
  • ix_orchestration_runs_tenant_type - (tenant_id, run_type) - Tenant's runs by type
  • ix_orchestration_runs_tenant_started - (tenant_id, started_at) - Tenant's runs by date
  • ix_orchestration_runs_status_started - (status, started_at) - Global runs by status and date
Performance Metric Indexes
  • ix_orchestration_runs_fulfillment_rate - Fulfillment rate queries
  • ix_orchestration_runs_on_time_delivery_rate - Delivery performance queries
  • ix_orchestration_runs_cost_accuracy - Cost tracking queries
  • ix_orchestration_runs_quality_score - Quality filtering
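
For reference, two of the indexes above expressed as equivalent SQL (the migration creates them via Alembic; this is shown only to illustrate the naming and column order):

    -- Unique lookup index on the human-readable run identifier
    CREATE UNIQUE INDEX ix_orchestration_runs_run_number
        ON orchestration_runs (run_number);

    -- Composite index backing "tenant's runs by status" queries
    CREATE INDEX ix_orchestration_runs_tenant_status
        ON orchestration_runs (tenant_id, status);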

Enums

orchestrationstatus

Represents the current status of an orchestration run.

Values:

  • pending - Run is queued but not yet started
  • running - Run is currently executing
  • completed - Run completed successfully
  • partial_success - Run completed but some steps failed
  • failed - Run failed to complete
  • cancelled - Run was cancelled
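
The equivalent PostgreSQL type definition (the migration creates this type through SQLAlchemy; the SQL below is shown for reference):

    CREATE TYPE orchestrationstatus AS ENUM (
        'pending',
        'running',
        'completed',
        'partial_success',
        'failed',
        'cancelled'
    );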

Workflow Steps

The orchestration service coordinates multiple workflow steps in sequence:

  1. Forecasting - Generate demand forecasts
  2. Production - Create production schedules
  3. Procurement - Generate procurement plans and purchase orders
  4. Notifications - Send notifications to stakeholders
  5. AI Insights - Generate and post AI-driven insights

Each step tracks:

  • Start/completion timestamps
  • Status (success/failed/skipped)
  • Error messages (if applicable)
  • Step-specific metrics
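
For example, a forecasting step that finishes successfully might be recorded with an update along these lines (illustrative only; the service may apply these changes through its ORM rather than raw SQL, and the count shown is a placeholder):

    -- Mark the forecasting step as completed and record its result
    UPDATE orchestration_runs
    SET forecasting_completed_at = now(),
        forecasting_status = 'success',
        forecasts_generated = 42,          -- placeholder value
        updated_at = now()
    WHERE id = ?;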

Data Flow

Orchestration Run
    ↓
1. Forecasting → forecast_data (JSONB)
    ↓
2. Production → production_schedule_id (UUID)
    ↓
3. Procurement → procurement_plan_id (UUID)
    ↓
4. Notifications → notifications_sent (count)
    ↓
5. AI Insights → ai_insights_posted (count)

Query Patterns

Common Queries

  1. Get active runs for a tenant:

    SELECT * FROM orchestration_runs
    WHERE tenant_id = ? AND status IN ('pending', 'running')
    ORDER BY started_at DESC;
    

    Uses: ix_orchestration_runs_tenant_status

  2. Get run history for a date range:

    SELECT * FROM orchestration_runs
    WHERE tenant_id = ? AND started_at BETWEEN ? AND ?
    ORDER BY started_at DESC;
    

    Uses: ix_orchestration_runs_tenant_started

  3. Get performance metrics summary:

    SELECT AVG(fulfillment_rate), AVG(on_time_delivery_rate),
           AVG(cost_accuracy), AVG(quality_score)
    FROM orchestration_runs
    WHERE tenant_id = ? AND status = 'completed'
      AND started_at > ?;
    

    Uses: ix_orchestration_runs_tenant_started + metric indexes

  4. Find failed runs needing attention:

    SELECT * FROM orchestration_runs
    WHERE status = 'failed' AND retry_count < 3
      AND max_retries_reached = false
    ORDER BY started_at DESC;
    

    Uses: ix_orchestration_runs_status

Migration Notes

Consolidation Changes

The original schema was split across two migrations:

  1. Base table with most fields
  2. AI insights tracking added later

This consolidation:

  • Combines all fields into one initial migration
  • Fixes revision ID inconsistencies
  • Removes duplicate index definitions
  • Organizes fields logically by category
  • Adds comprehensive documentation
  • Improves maintainability

Old Migration Files

The following files are superseded by 001_initial_schema.py:

  • 20251029_1700_add_orchestration_runs.py
  • 20251105_add_ai_insights_tracking.py

Important: If your database was already migrated using the old files, do not apply the new consolidated migration. The consolidated migration is intended for fresh deployments, or for existing databases after the migration history has been reset.

Best Practices

  1. Always set tenant_id - Required for multi-tenant isolation
  2. Use run_number for user-facing displays - More readable than UUID
  3. Track all step timing - Helps identify bottlenecks
  4. Store detailed errors - Use error_details JSONB for structured error data (see the example after this list)
  5. Update metrics in real-time - Keep counts and statuses current
  6. Use saga tracking - Helps monitor overall progress
  7. Leverage indexes - Use composite indexes for multi-column queries
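
Illustrating practices 4 and 5, a failed procurement call might be recorded like this (all values are purely illustrative):

    -- Record a failure with both a human-readable message and structured details
    UPDATE orchestration_runs
    SET status = 'failed',
        procurement_status = 'failed',
        procurement_error = 'Procurement service timeout',
        error_message = 'Procurement service timeout',
        error_details = '{"step": "procurement", "http_status": 504, "attempt": 2}'::jsonb,
        retry_count = retry_count + 1,
        updated_at = now()
    WHERE id = ?;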

Performance Considerations

  • The started_at and completed_at timestamps are indexed for temporal queries
  • Composite indexes optimize common multi-column filters
  • JSONB columns (forecast_data, error_details, run_metadata) allow flexible data storage
  • Performance metric indexes enable fast analytics queries
  • Unique constraint on run_number prevents duplicates

Future Enhancements

Potential schema improvements for future versions:

  • Add foreign key constraints to external references (if services support it)
  • Add partitioning by started_at for very high-volume deployments
  • Add GIN indexes on JSONB columns for complex queries (sketched below)
  • Add materialized views for common analytics queries
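
A GIN index of the kind suggested above might look like the following (hypothetical; not part of the current migration, and the JSON key used in the query is illustrative):

    -- Hypothetical future index to speed up containment queries on run_metadata
    CREATE INDEX ix_orchestration_runs_run_metadata_gin
        ON orchestration_runs USING gin (run_metadata);

    -- Example of a query such an index would accelerate
    SELECT * FROM orchestration_runs
    WHERE run_metadata @> '{"trigger_source": "scheduler"}';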