bakery-ia/services/forecasting/DYNAMIC_RULES_ENGINE.md
2025-11-05 13:34:56 +01:00

Dynamic Business Rules Engine

Overview

The Dynamic Business Rules Engine replaces hardcoded forecasting multipliers with learned values from historical data. Instead of assuming "rain = -15% impact" for all products, it learns the actual impact per product from real sales data.

Problem Statement

Current Hardcoded Approach

The forecasting service currently uses hardcoded business rules:

# Hardcoded weather adjustments
weather_adjustments = {
    'rain': 0.85,        # -15% impact
    'snow': 0.75,        # -25% impact
    'extreme_heat': 0.90 # -10% impact
}

# Hardcoded holiday adjustment
holiday_multiplier = 1.5  # +50% for all holidays

# Hardcoded event adjustment
event_multiplier = 1.3  # +30% for all events

Problems with Hardcoded Rules

  1. One-size-fits-all: Bread sales might drop 5% in rain, but pastry sales might increase 10%
  2. No adaptation: Rules never update as customer behavior changes
  3. Missing nuances: Christmas and Easter have different impacts, but both get +50%
  4. No confidence scoring: Can't tell if a rule is based on 10 observations or 1,000
  5. Manual maintenance: Requires developer to change code to update rules

Solution: Dynamic Learning

The Dynamic Rules Engine:

  1. Learns from data: Calculates actual impact from historical sales
  2. Product-specific: Each product gets its own learned rules
  3. Statistical validation: Uses t-tests to check whether each learned effect is statistically significant
  4. Confidence scoring: Provides confidence levels (0-100) for each rule
  5. Automatic insights: Generates insights when learned rules differ from hardcoded assumptions
  6. Continuous improvement: Can be re-run with new data to update rules
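
As a rough illustration of items 1 and 3 above, the sketch below compares sales on days with a condition against baseline days, derives a multiplier, and computes a Welch t-statistic using only the standard library. The helper name `learn_multiplier` and the sample numbers are hypothetical and are not taken from the engine's code. A real implementation would normally obtain the p-value from a statistics library rather than stopping at the t-statistic.

```python
from math import sqrt
from statistics import mean, variance


def learn_multiplier(condition_sales, baseline_sales):
    """Return (multiplier, t_statistic) for condition days vs. baseline days.

    Hypothetical helper for illustration; not the engine's actual API.
    """
    cond_avg = mean(condition_sales)
    base_avg = mean(baseline_sales)
    multiplier = cond_avg / base_avg

    # Welch's t-statistic: difference in means divided by the combined
    # standard error, without assuming equal variances in the two groups.
    standard_error = sqrt(
        variance(condition_sales) / len(condition_sales)
        + variance(baseline_sales) / len(baseline_sales)
    )
    t_stat = (cond_avg - base_avg) / standard_error
    return multiplier, t_stat


# Illustrative numbers only
rainy_days = [90, 94, 88, 96, 92]
clear_days = [104, 108, 100, 110, 103]
multiplier, t_stat = learn_multiplier(rainy_days, clear_days)
print(f"multiplier={multiplier:.2f}, t={t_stat:.2f}")
```

A large negative t-statistic here indicates that rainy-day sales sit well below the clear-day baseline relative to the day-to-day noise, which is the kind of evidence the significance check relies on.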

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    Dynamic Rules Engine                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Historical Sales Data + External Data (Weather/Holidays)       │
│                         ↓                                        │
│                  Statistical Analysis                            │
│              (T-tests, Effect Sizes, p-values)                  │
│                         ↓                                        │
│              ┌───────────────────────┐                          │
│              │  Learned Rules        │                          │
│              ├───────────────────────┤                          │
│              │ • Weather impacts     │                          │
│              │ • Holiday multipliers │                          │
│              │ • Event impacts       │                          │
│              │ • Day-of-week patterns│                          │
│              │ • Monthly seasonality │                          │
│              └───────────────────────┘                          │
│                         ↓                                        │
│              ┌──────────────────────┐                           │
│              │  Generated Insights  │                           │
│              ├──────────────────────┤                           │
│              │ • Rule mismatches    │                           │
│              │ • Strong patterns    │                           │
│              │ • Recommendations    │                           │
│              └──────────────────────┘                           │
│                         ↓                                        │
│            Posted to AI Insights Service                        │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Usage

Basic Usage

from app.ml.dynamic_rules_engine import DynamicRulesEngine
import pandas as pd

# Initialize engine
engine = DynamicRulesEngine()

# Prepare data
sales_data = pd.DataFrame({
    'date': [...],
    'quantity': [...]
})

external_data = pd.DataFrame({
    'date': [...],
    'weather_condition': ['rain', 'clear', 'snow', ...],
    'temperature': [15.2, 18.5, 3.1, ...],
    'is_holiday': [False, False, True, ...],
    'holiday_name': [None, None, 'Christmas', ...],
    'holiday_type': [None, None, 'religious', ...]
})

# Learn all rules
results = await engine.learn_all_rules(
    tenant_id='tenant-123',
    inventory_product_id='product-456',
    sales_data=sales_data,
    external_data=external_data,
    min_samples=10
)

# Results contain learned rules and insights
print(f"Learned {len(results['rules'])} rule categories")
print(f"Generated {len(results['insights'])} insights")

Orchestrator Usage

from app.ml.rules_orchestrator import RulesOrchestrator

# Initialize orchestrator
orchestrator = RulesOrchestrator(
    ai_insights_base_url="http://ai-insights-service:8000"
)

# Learn rules and automatically post insights
results = await orchestrator.learn_and_post_rules(
    tenant_id='tenant-123',
    inventory_product_id='product-456',
    sales_data=sales_data,
    external_data=external_data
)

print(f"Insights posted: {results['insights_posted']}")
print(f"Insights failed: {results['insights_failed']}")

# Get learned rules for forecasting
rules = await orchestrator.get_learned_rules_for_forecasting('product-456')

# Get specific multiplier with fallback
rain_multiplier = orchestrator.get_rule_multiplier(
    inventory_product_id='product-456',
    rule_type='weather',
    key='rain',
    default=0.85  # Fallback to hardcoded if not learned
)

Learned Rules Structure

Weather Rules

{
    "weather": {
        "baseline_avg": 105.3,  # Average sales on clear days
        "conditions": {
            "rain": {
                "learned_multiplier": 0.88,  # Actual impact: -12%
                "learned_impact_pct": -12.0,
                "sample_size": 37,
                "avg_quantity": 92.7,
                "p_value": 0.003,
                "significant": True
            },
            "snow": {
                "learned_multiplier": 0.73,  # Actual impact: -27%
                "learned_impact_pct": -27.0,
                "sample_size": 12,
                "avg_quantity": 76.9,
                "p_value": 0.001,
                "significant": True
            }
        }
    }
}
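
The fields in this structure are related by simple arithmetic: the multiplier is the condition average divided by the baseline average, and the impact percentage is that ratio expressed relative to 1. The sketch below reproduces the numbers shown above. The helper name `summarize_condition` and the `alpha=0.05` significance cutoff are assumptions made for illustration, not confirmed details of the engine.

```python
def summarize_condition(avg_quantity, baseline_avg, p_value, alpha=0.05):
    """Derive the rule fields for one weather condition.

    Hypothetical helper; alpha=0.05 is an assumed significance cutoff.
    """
    ratio = avg_quantity / baseline_avg
    return {
        "learned_multiplier": round(ratio, 2),
        "learned_impact_pct": round((ratio - 1) * 100, 1),
        "significant": p_value < alpha,
    }


# Values taken from the weather rules example above
rain = summarize_condition(avg_quantity=92.7, baseline_avg=105.3, p_value=0.003)
snow = summarize_condition(avg_quantity=76.9, baseline_avg=105.3, p_value=0.001)
print(rain)
print(snow)
```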

Holiday Rules

{
    "holidays": {
        "baseline_avg": 100.0,  # Non-holiday average
        "hardcoded_multiplier": 1.5,  # Current +50%
        "holiday_types": {
            "religious": {
                "learned_multiplier": 1.68,  # Actual: +68%
                "learned_impact_pct": 68.0,
                "sample_size": 8,
                "avg_quantity": 168.0,
                "p_value": 0.002,
                "significant": True
            },
            "national": {
                "learned_multiplier": 1.25,  # Actual: +25%
                "learned_impact_pct": 25.0,
                "sample_size": 5,
                "avg_quantity": 125.0,
                "p_value": 0.045,
                "significant": True
            }
        },
        "overall_learned_multiplier": 1.52
    }
}

Day-of-Week Rules

{
    "day_of_week": {
        "overall_avg": 100.0,
        "days": {
            "Monday": {
                "day_of_week": 0,
                "learned_multiplier": 0.85,
                "impact_pct": -15.0,
                "avg_quantity": 85.0,
                "std_quantity": 12.3,
                "sample_size": 52,
                "coefficient_of_variation": 0.145
            },
            "Saturday": {
                "day_of_week": 5,
                "learned_multiplier": 1.32,
                "impact_pct": 32.0,
                "avg_quantity": 132.0,
                "std_quantity": 18.7,
                "sample_size": 52,
                "coefficient_of_variation": 0.142
            }
        }
    }
}
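
To make the day-of-week fields concrete, here is a minimal sketch that groups daily sales by weekday and computes the same statistics with the standard library. The function name `day_of_week_rules` and the two weeks of sample data are hypothetical; the engine itself works on pandas DataFrames and may compute these values differently.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean, stdev


def day_of_week_rules(daily_sales):
    """Compute per-weekday statistics from (date, quantity) pairs.

    Hypothetical helper for illustration; not the engine's actual API.
    """
    overall_avg = mean(q for _, q in daily_sales)
    by_day = defaultdict(list)
    for day, quantity in daily_sales:
        by_day[day.weekday()].append(quantity)  # Monday == 0, Saturday == 5

    rules = {}
    for dow, quantities in sorted(by_day.items()):
        avg = mean(quantities)
        std = stdev(quantities) if len(quantities) > 1 else 0.0
        rules[dow] = {
            "learned_multiplier": round(avg / overall_avg, 2),
            "impact_pct": round((avg / overall_avg - 1) * 100, 1),
            "avg_quantity": round(avg, 1),
            "std_quantity": round(std, 1),
            "sample_size": len(quantities),
            "coefficient_of_variation": round(std / avg, 3) if avg else None,
        }
    return rules


# Two illustrative weeks starting on a Monday
start = date(2025, 1, 6)
quantities = [80, 100, 100, 100, 100, 130, 90, 90, 100, 100, 100, 100, 134, 76]
sales = [(start + timedelta(days=i), q) for i, q in enumerate(quantities)]
rules = day_of_week_rules(sales)
print(rules[0]["learned_multiplier"], rules[5]["learned_multiplier"])
```

The coefficient of variation (standard deviation divided by the mean) gives a scale-free measure of how stable each weekday's sales are, which is why it appears alongside the multiplier.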

Generated Insights Examples

Weather Rule Mismatch

{
    "type": "optimization",
    "priority": "high",
    "category": "forecasting",
    "title": "Weather Rule Mismatch: Rain",
    "description": "Learned rain impact is -12.0% vs hardcoded -15.0%. Updating rule could improve forecast accuracy by 3.0%.",
    "impact_type": "forecast_improvement",
    "impact_value": 3.0,
    "impact_unit": "percentage_points",
    "confidence": 85,
    "metrics_json": {
        "weather_condition": "rain",
        "learned_impact_pct": -12.0,
        "hardcoded_impact_pct": -15.0,
        "difference_pct": 3.0,
        "baseline_avg": 105.3,
        "condition_avg": 92.7,
        "sample_size": 37,
        "p_value": 0.003
    },
    "actionable": true,
    "recommendation_actions": [
        {
            "label": "Update Weather Rule",
            "action": "update_weather_multiplier",
            "params": {
                "condition": "rain",
                "new_multiplier": 0.88
            }
        }
    ]
}
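
An insight like this one can be thought of as the result of comparing a learned multiplier against its hardcoded counterpart. The sketch below shows one way to build such a payload. The function name `weather_mismatch_insight` and the `min_difference_pct` threshold of 2.0 are assumptions for illustration; the document does not state how large a gap must be before an insight is generated, nor how priority and confidence are assigned, so those fields are left out.

```python
def weather_mismatch_insight(condition, learned_multiplier, hardcoded_multiplier,
                             sample_size, p_value, min_difference_pct=2.0):
    """Return an insight dict, or None if the rules are close enough.

    Hypothetical helper; the 2.0-point threshold is an assumed value.
    """
    learned_pct = round((learned_multiplier - 1) * 100, 1)
    hardcoded_pct = round((hardcoded_multiplier - 1) * 100, 1)
    difference_pct = round(abs(learned_pct - hardcoded_pct), 1)
    if difference_pct < min_difference_pct:
        return None  # Learned value is close to the hardcoded rule

    return {
        "type": "optimization",
        "category": "forecasting",
        "title": f"Weather Rule Mismatch: {condition.capitalize()}",
        "description": (
            f"Learned {condition} impact is {learned_pct:+.1f}% "
            f"vs hardcoded {hardcoded_pct:+.1f}%."
        ),
        "metrics_json": {
            "weather_condition": condition,
            "learned_impact_pct": learned_pct,
            "hardcoded_impact_pct": hardcoded_pct,
            "difference_pct": difference_pct,
            "sample_size": sample_size,
            "p_value": p_value,
        },
        "actionable": True,
        "recommendation_actions": [
            {
                "label": "Update Weather Rule",
                "action": "update_weather_multiplier",
                "params": {"condition": condition, "new_multiplier": learned_multiplier},
            }
        ],
    }


insight = weather_mismatch_insight("rain", 0.88, 0.85, sample_size=37, p_value=0.003)
no_insight = weather_mismatch_insight("rain", 0.86, 0.85, sample_size=37, p_value=0.003)
print(insight["title"], insight["metrics_json"]["difference_pct"], no_insight)
```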

Holiday Optimization

{
    "type": "recommendation",
    "priority": "high",
    "category": "forecasting",
    "title": "Holiday Rule Optimization: religious",
    "description": "religious shows 68.0% impact vs hardcoded +50%. Using learned multiplier 1.68x could improve forecast accuracy.",
    "impact_type": "forecast_improvement",
    "impact_value": 18.0,
    "confidence": 82,
    "metrics_json": {
        "holiday_type": "religious",
        "learned_multiplier": 1.68,
        "hardcoded_multiplier": 1.5,
        "learned_impact_pct": 68.0,
        "hardcoded_impact_pct": 50.0,
        "sample_size": 8
    },
    "actionable": true,
    "recommendation_actions": [
        {
            "label": "Update Holiday Rule",
            "action": "update_holiday_multiplier",
            "params": {
                "holiday_type": "religious",
                "new_multiplier": 1.68
            }
        }
    ]
}

Strong Day-of-Week Pattern

{
    "type": "insight",
    "priority": "medium",
    "category": "forecasting",
    "title": "Saturday Pattern: 32% Higher",
    "description": "Saturday sales average 132.0 units (+32.0% vs weekly average 100.0). Consider this pattern in production planning.",
    "impact_type": "operational_insight",
    "impact_value": 32.0,
    "confidence": 88,
    "metrics_json": {
        "day_of_week": "Saturday",
        "day_multiplier": 1.32,
        "impact_pct": 32.0,
        "day_avg": 132.0,
        "overall_avg": 100.0,
        "sample_size": 52
    },
    "actionable": true,
    "recommendation_actions": [
        {
            "label": "Adjust Production Schedule",
            "action": "adjust_weekly_production",
            "params": {
                "day": "Saturday",
                "multiplier": 1.32
            }
        }
    ]
}

Confidence Scoring

Confidence (0-100) is calculated based on:

  1. Sample Size (10-50 points):

    • 100+ samples: 50 points
    • 50-99 samples: 40 points
    • 30-49 samples: 30 points
    • 20-29 samples: 20 points
    • <20 samples: 10 points

  2. Statistical Significance (10-50 points):

    • p < 0.001: 50 points
    • p < 0.01: 45 points
    • p < 0.05: 35 points
    • p < 0.1: 20 points
    • p >= 0.1: 10 points

The two scores are summed and capped at 100:

confidence = min(100, sample_score + significance_score)

Examples:

  • 150 samples, p<0.001 → 100 confidence
  • 50 samples, p=0.03 → 75 confidence
  • 15 samples, p=0.12 → 20 confidence (low)
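
The scoring tables above translate directly into a small function. This is a sketch that follows the thresholds as written in this section; the function name `confidence_score` is hypothetical and the engine's own implementation may differ in detail.

```python
def confidence_score(sample_size, p_value):
    """Combine sample-size and significance scores into a 0-100 confidence."""
    # Sample size component (10-50 points)
    if sample_size >= 100:
        sample_score = 50
    elif sample_size >= 50:
        sample_score = 40
    elif sample_size >= 30:
        sample_score = 30
    elif sample_size >= 20:
        sample_score = 20
    else:
        sample_score = 10

    # Statistical significance component (10-50 points), strict thresholds
    if p_value < 0.001:
        significance_score = 50
    elif p_value < 0.01:
        significance_score = 45
    elif p_value < 0.05:
        significance_score = 35
    elif p_value < 0.1:
        significance_score = 20
    else:
        significance_score = 10

    return min(100, sample_score + significance_score)


print(confidence_score(50, 0.03))
print(confidence_score(15, 0.12))
```

Because the comparisons are strict, a p-value exactly equal to a threshold falls into the next lower band; for instance, p = 0.001 scores 45 rather than 50.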

Integration with Forecasting

Option 1: Replace Hardcoded Values

# Before (hardcoded)
if weather == 'rain':
    forecast *= 0.85

# After (learned)
rain_multiplier = orchestrator.get_rule_multiplier(
    inventory_product_id=product_id,
    rule_type='weather',
    key='rain',
    default=0.85  # Fallback to hardcoded if not learned
)

if weather == 'rain':
    forecast *= rain_multiplier
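
The fallback lookup used in Option 1 could behave roughly like the sketch below, which reads a multiplier out of the nested rule structures shown earlier and returns the default when no usable rule exists. The function `lookup_multiplier` is hypothetical, and two details are assumptions: that `rule_type` matches the top-level keys of the learned rules, and that rules flagged as not significant should fall back to the default.

```python
# Sub-key that holds the per-item rules for each rule type,
# following the structures shown in "Learned Rules Structure".
GROUP_KEYS = {
    "weather": "conditions",
    "holidays": "holiday_types",
    "day_of_week": "days",
}


def lookup_multiplier(rules, rule_type, key, default):
    """Return the learned multiplier for (rule_type, key), or `default`.

    Hypothetical helper; not the orchestrator's actual implementation.
    """
    group = rules.get(rule_type, {}).get(GROUP_KEYS.get(rule_type, ""), {})
    entry = group.get(key)
    if entry is None:
        return default  # Nothing learned for this key
    # Assumption: a rule explicitly marked non-significant is not trusted.
    # Day-of-week entries carry no flag, so they are accepted as-is.
    if entry.get("significant", True) is False:
        return default
    return entry.get("learned_multiplier", default)


learned = {
    "weather": {
        "conditions": {
            "rain": {"learned_multiplier": 0.88, "significant": True},
        }
    }
}
rain = lookup_multiplier(learned, "weather", "rain", default=0.85)
snow = lookup_multiplier(learned, "weather", "snow", default=0.75)
forecast = 100.0 * rain
print(rain, snow, forecast)
```

Passing the default explicitly, instead of relying on `or 0.85`, avoids treating a legitimately learned multiplier of 0 as missing.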

Option 2: Prophet Regressor Integration

from prophet import Prophet

# Export learned rules
rules = await orchestrator.get_learned_rules_for_forecasting(product_id)

model = Prophet()

# Apply as Prophet regressors (register them before calling model.fit(df))
for condition, rule in rules['weather']['conditions'].items():
    # Create binary regressor for each condition
    df[f'is_{condition}'] = (df['weather_condition'] == condition).astype(int)
    # Weight by learned multiplier
    df[f'{condition}_weighted'] = df[f'is_{condition}'] * rule['learned_multiplier']

    # Add to Prophet
    model.add_regressor(f'{condition}_weighted')

Periodic Updates

Rules should be re-learned periodically as new data accumulates:

# Weekly or monthly update
results = await orchestrator.update_rules_periodically(
    tenant_id='tenant-123',
    inventory_product_id='product-456',
    sales_data=updated_sales_data,
    external_data=updated_external_data
)

# New insights will be posted if rules have changed significantly
print(f"Rules updated, {results['insights_posted']} new insights")
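
One way to decide whether rules "have changed significantly" between runs is to compare old and new multipliers against a threshold. The sketch below is only an illustration of that idea: the function name `changed_rules` and the `min_change=0.05` threshold are assumptions, and the document does not describe the criterion the orchestrator actually applies.

```python
def changed_rules(old, new, min_change=0.05):
    """Return {key: (old_multiplier, new_multiplier)} for notable changes.

    Hypothetical helper; min_change=0.05 is an assumed absolute threshold.
    """
    changed = {}
    for key, new_mult in new.items():
        old_mult = old.get(key)
        # Newly learned rules and rules that moved by min_change or more
        if old_mult is None or abs(new_mult - old_mult) >= min_change:
            changed[key] = (old_mult, new_mult)
    return changed


# Illustrative multipliers from two consecutive learning runs
previous = {"rain": 0.88, "snow": 0.73}
current = {"rain": 0.87, "snow": 0.80, "extreme_heat": 0.93}
result = changed_rules(previous, current)
print(result)
```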

API Integration

The Rules Orchestrator automatically posts insights to the AI Insights Service:

# POST to /api/v1/ai-insights/tenants/{tenant_id}/insights
{
    "tenant_id": "tenant-123",
    "type": "optimization",
    "priority": "high",
    "category": "forecasting",
    "title": "Weather Rule Mismatch: Rain",
    "description": "...",
    "confidence": 85,
    "metrics_json": {...},
    "actionable": true,
    "recommendation_actions": [...]
}

Insights can then be:

  1. Viewed in the AI Insights frontend page
  2. Retrieved by orchestration service for automated application
  3. Tracked for feedback and learning
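
For reference, the POST request described above can be assembled with the standard library as shown in this sketch. The endpoint path and base URL come from this document; the helper name `build_insight_request` is hypothetical, and `urllib` is used here only as a stand-in because the document does not say which HTTP client the orchestrator uses. The request is built but not sent.

```python
import json
from urllib.request import Request


def build_insight_request(base_url, tenant_id, insight):
    """Build (but do not send) the POST request for one insight.

    Hypothetical helper for illustration.
    """
    url = f"{base_url.rstrip('/')}/api/v1/ai-insights/tenants/{tenant_id}/insights"
    body = json.dumps({"tenant_id": tenant_id, **insight}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_insight_request(
    "http://ai-insights-service:8000",
    "tenant-123",
    {
        "type": "optimization",
        "priority": "high",
        "category": "forecasting",
        "title": "Weather Rule Mismatch: Rain",
        "confidence": 85,
        "actionable": True,
    },
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```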

Testing

Run comprehensive tests:

cd services/forecasting
pytest tests/test_dynamic_rules_engine.py -v

Tests cover:

  • Weather rules learning
  • Holiday rules learning
  • Day-of-week patterns
  • Monthly seasonality
  • Insight generation
  • Confidence calculation
  • Insufficient sample handling

Performance

Learning Time: ~1-2 seconds for 1 year of daily data (365 observations)

Memory: ~50 MB for rules storage per 1,000 products

Accuracy Improvement: Expected 5-15% MAPE reduction by using learned rules vs hardcoded

Minimum Data Requirements

| Rule Type | Minimum Samples | Recommended |
|---|---|---|
| Weather (per condition) | 10 days | 30+ days |
| Holiday (per type) | 5 occurrences | 10+ occurrences |
| Event (per type) | 10 events | 20+ events |
| Day-of-week | 10 weeks | 26+ weeks |
| Monthly | 2 months | 12+ months |

Overall: 3-6 months of historical data is recommended for most rules to be reliable; monthly seasonality needs 12+ months, as the table shows.
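
The requirements above can be encoded as a simple pre-check before learning a rule. This is a sketch using the thresholds from the table; the names `MIN_SAMPLES`, `RECOMMENDED_SAMPLES`, and `data_sufficiency`, the rule-type keys, and the three return labels are all hypothetical.

```python
# Thresholds copied from the minimum data requirements table
MIN_SAMPLES = {"weather": 10, "holiday": 5, "event": 10, "day_of_week": 10, "monthly": 2}
RECOMMENDED_SAMPLES = {"weather": 30, "holiday": 10, "event": 20, "day_of_week": 26, "monthly": 12}


def data_sufficiency(rule_type, sample_count):
    """Classify a sample count as 'insufficient', 'minimum', or 'recommended'."""
    if sample_count < MIN_SAMPLES[rule_type]:
        return "insufficient"  # Too few observations to learn a rule
    if sample_count < RECOMMENDED_SAMPLES[rule_type]:
        return "minimum"  # Learnable, but below the recommended level
    return "recommended"


print(data_sufficiency("weather", 37))
print(data_sufficiency("holiday", 8))
print(data_sufficiency("weather", 9))
```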

Limitations

  1. Cold Start: New products need 60-90 days before reliable rules can be learned
  2. Rare Events: Conditions observed fewer times than the minimum sample count (for example, fewer than 10 days for a weather condition) are unlikely to yield statistically significant rules
  3. Distribution Shift: Rules assume future behavior similar to historical patterns
  4. External Factors: Can't learn from factors not tracked in external_data

Future Enhancements

  1. Transfer Learning: Use rules from similar products for cold start
  2. Bayesian Updates: Incrementally update rules as new data arrives
  3. Hierarchical Rules: Learn category-level rules when product-level data insufficient
  4. Interaction Effects: Learn combined impacts (e.g., "rainy Saturday" vs "rainy Monday")
  5. Drift Detection: Alert when learned rules become invalid due to behavior changes

Summary

The Dynamic Business Rules Engine transforms hardcoded assumptions into data-driven, product-specific, continuously-improving forecasting rules. By learning from actual historical patterns and automatically generating insights, it enables the forecasting service to adapt to real customer behavior and improve accuracy over time.

Key Benefits:

  • Expected 5-15% MAPE reduction versus hardcoded rules
  • Product-specific customization
  • Automatic insight generation
  • Statistical validation
  • Continuous improvement
  • Zero manual rule maintenance