
Enhanced Inter-Service Communication System

This directory contains the inter-service communication layer, built to work with the repository pattern architecture. It adds circuit breakers, caching, monitoring, and event tracking to all service-to-service calls.

Architecture Overview

Base Components

  1. BaseServiceClient - Foundation class providing authentication, retries, and basic HTTP operations
  2. EnhancedServiceClient - Adds circuit breaker, caching, and monitoring capabilities
  3. ServiceRegistry - Central registry for managing all enhanced service clients

Enhanced Service Clients

Each service has a specialized enhanced client:

  • SalesServiceClient - Sales data, products, data import with optimized caching
  • ExternalServiceClient - Weather and traffic data collection with external API integration
  • EnhancedAuthServiceClient - Authentication, user management, permissions with security focus
  • EnhancedTrainingServiceClient - ML training, model management, deployment with pipeline monitoring
  • EnhancedForecastingServiceClient - Forecasting, predictions, scenarios with analytics
  • EnhancedTenantServiceClient - Tenant management, memberships, organization features
  • EnhancedNotificationServiceClient - Notifications, templates, delivery tracking

Key Features

Circuit Breaker Pattern

  • States: Closed (normal), Open (failing), Half-Open (testing recovery)
  • Configuration: Failure threshold, recovery timeout, success threshold
  • Monitoring: State changes tracked and logged

Intelligent Caching

  • TTL-based: Different cache durations for different data types
  • Invalidation: Pattern-based cache invalidation on updates
  • Statistics: Hit/miss ratios and performance metrics
  • Manual Control: Clear specific cache patterns when needed
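The caching behaviour listed above can be illustrated with a minimal in-memory sketch. It is not the EnhancedServiceClient implementation: the class name `TTLCache`, the glob-style key patterns, and the injectable clock are assumptions made for illustration only.

```python
import fnmatch
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical in-memory cache; not the EnhancedServiceClient implementation."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expiry, value)
        self.hits = 0
        self.misses = 0

    def set(self, key: str, value: Any, ttl: float) -> None:
        """Store a value that expires ttl seconds from now."""
        self._entries[key] = (self._clock() + ttl, value)

    def get(self, key: str) -> Optional[Any]:
        """Return the cached value, or None on a miss or an expired entry."""
        entry = self._entries.get(key)
        if entry is None or entry[0] <= self._clock():
            self._entries.pop(key, None)  # drop the expired entry, if any
            self.misses += 1
            return None
        self.hits += 1
        return entry[1]

    def invalidate(self, pattern: str) -> int:
        """Remove every key matching a glob pattern; return the count removed."""
        doomed = [k for k in self._entries if fnmatch.fnmatchcase(k, pattern)]
        for k in doomed:
            del self._entries[k]
        return len(doomed)

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

With keys shaped like `sales:tenant-123:2024-01` (a hypothetical naming scheme), an upload for one tenant could clear its entries with `cache.invalidate("sales:tenant-123:*")` while leaving other tenants' data cached.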

Event Integration

  • Repository Events: Entity created/updated/deleted events
  • Correlation IDs: Track operations across services
  • Metadata: Rich event metadata for debugging and monitoring

Monitoring & Metrics

  • Request Metrics: Success/failure rates, latencies
  • Cache Metrics: Hit rates, entry counts
  • Circuit Breaker Metrics: State changes, failure counts
  • Health Checks: Per-service and aggregate health status

Usage Examples

Basic Usage with Service Registry

from shared.clients.enhanced_service_client import ServiceRegistry
from shared.config.base import BaseServiceSettings

# Initialize registry
config = BaseServiceSettings()
registry = ServiceRegistry(config, calling_service="forecasting")

# Get enhanced clients
data_client = registry.get_data_client()
auth_client = registry.get_auth_client()
training_client = registry.get_training_client()

# Use with full features
sales_data = await data_client.get_all_sales_data_with_monitoring(
    tenant_id="tenant-123",
    start_date="2024-01-01",
    end_date="2024-12-31",
    correlation_id="forecast-job-456"
)

Data Service Operations

# Get sales data with intelligent caching
sales_data = await data_client.get_sales_data_cached(
    tenant_id="tenant-123",
    start_date="2024-01-01",
    end_date="2024-01-31",
    aggregation="daily"
)

# Upload sales data with cache invalidation and events
result = await data_client.upload_sales_data_with_events(
    tenant_id="tenant-123",
    sales_data=sales_records,
    correlation_id="data-import-789"
)

# Get weather data with caching (30 min TTL)
weather_data = await data_client.get_weather_historical_cached(
    tenant_id="tenant-123",
    start_date="2024-01-01",
    end_date="2024-01-31"
)

Authentication & User Management

# Authenticate with security monitoring
auth_result = await auth_client.authenticate_user_cached(
    email="user@example.com",
    password="password"
)

# Check permissions with caching
has_access = await auth_client.check_user_permissions_cached(
    user_id="user-123",
    tenant_id="tenant-456",
    resource="sales_data",
    action="read"
)

# Create user with events
user = await auth_client.create_user_with_events(
    user_data={
        "email": "new@example.com",
        "name": "New User",
        "role": "analyst"
    },
    tenant_id="tenant-123",
    correlation_id="user-creation-789"
)

Training & ML Operations

# Create training job with monitoring
job = await training_client.create_training_job_with_monitoring(
    tenant_id="tenant-123",
    include_weather=True,
    include_traffic=False,
    min_data_points=30,
    correlation_id="training-pipeline-456"
)

# Get active model with caching
model = await training_client.get_active_model_for_product_cached(
    tenant_id="tenant-123",
    product_name="croissants"
)

# Deploy model with events
deployment = await training_client.deploy_model_with_events(
    tenant_id="tenant-123",
    model_id="model-789",
    correlation_id="deployment-123"
)

# Get pipeline status
status = await training_client.get_training_pipeline_status("tenant-123")

Forecasting & Predictions

# forecasting_client is assumed to be obtained from the ServiceRegistry, like the clients above
# Create forecast with monitoring
forecast = await forecasting_client.create_forecast_with_monitoring(
    tenant_id="tenant-123",
    model_id="model-456",
    start_date="2024-02-01",
    end_date="2024-02-29",
    correlation_id="forecast-creation-789"
)

# Get predictions with caching
predictions = await forecasting_client.get_predictions_cached(
    tenant_id="tenant-123",
    forecast_id="forecast-456",
    start_date="2024-02-01",
    end_date="2024-02-07"
)

# Real-time prediction with monitoring
prediction = await forecasting_client.create_realtime_prediction_with_monitoring(
    tenant_id="tenant-123",
    model_id="model-456",
    target_date="2024-02-01",
    features={"temperature": 20, "day_of_week": 1},
    correlation_id="realtime-pred-123"
)

# Get forecasting dashboard
dashboard = await forecasting_client.get_forecasting_dashboard("tenant-123")

Tenant Management

# tenant_client is assumed to be obtained from the ServiceRegistry, like the clients above
# Create tenant with monitoring
tenant = await tenant_client.create_tenant_with_monitoring(
    name="New Bakery Chain",
    owner_id="user-123",
    description="Multi-location bakery chain",
    correlation_id="tenant-creation-456"
)

# Add member with events
membership = await tenant_client.add_tenant_member_with_events(
    tenant_id="tenant-123",
    user_id="user-456",
    role="manager",
    correlation_id="member-add-789"
)

# Get tenant analytics
analytics = await tenant_client.get_tenant_analytics("tenant-123")

Notification Management

# notification_client is assumed to be obtained from the ServiceRegistry, like the clients above
# Send notification with monitoring
notification = await notification_client.send_notification_with_monitoring(
    recipient_id="user-123",
    notification_type="forecast_ready",
    title="Forecast Complete",
    message="Your weekly forecast is ready for review",
    tenant_id="tenant-456",
    priority="high",
    channels=["email", "in_app"],
    correlation_id="forecast-notification-789"
)

# Send bulk notification
bulk_result = await notification_client.send_bulk_notification_with_monitoring(
    recipients=["user-123", "user-456", "user-789"],
    notification_type="system_update",
    title="System Maintenance",
    message="Scheduled maintenance tonight at 2 AM",
    priority="normal",
    correlation_id="maintenance-notification-123"
)

# Get delivery analytics
analytics = await notification_client.get_delivery_analytics(
    tenant_id="tenant-123",
    start_date="2024-01-01",
    end_date="2024-01-31"
)

Health Monitoring

Individual Service Health

# Get specific service health
data_health = data_client.get_data_service_health()
auth_health = auth_client.get_auth_service_health()
training_health = training_client.get_training_service_health()

# Health includes:
# - Circuit breaker status
# - Cache statistics and configuration
# - Service-specific features
# - Supported endpoints

Registry-Level Health

# Get all service health status
all_health = registry.get_all_health_status()

# Get aggregate metrics
metrics = registry.get_aggregate_metrics()
# Returns:
# - Total cache hits/misses and hit rate
# - Circuit breaker states for all services  
# - Count of healthy vs total services
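As a rough illustration of how such aggregate figures could be derived from per-service statistics, consider the sketch below. The input shape, the key names, and the rule that a service counts as healthy when its circuit breaker is closed are all assumptions; the real `get_aggregate_metrics()` may differ.

```python
from typing import Dict


def aggregate_metrics(per_service: Dict[str, dict]) -> dict:
    """Combine hypothetical per-service stats into registry-level totals."""
    hits = sum(s["cache_hits"] for s in per_service.values())
    misses = sum(s["cache_misses"] for s in per_service.values())
    lookups = hits + misses
    states = {name: s["circuit_state"] for name, s in per_service.items()}
    return {
        "total_cache_hits": hits,
        "total_cache_misses": misses,
        "cache_hit_rate": hits / lookups if lookups else 0.0,
        "circuit_breaker_states": states,
        # Assumption: a closed breaker is treated as "healthy".
        "healthy_services": sum(1 for st in states.values() if st == "closed"),
        "total_services": len(per_service),
    }
```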

Configuration

Cache TTL Configuration

Each enhanced client sets cache TTLs (in seconds) per data type:

# Data Service
sales_cache_ttl = 600      # 10 minutes
weather_cache_ttl = 1800   # 30 minutes  
traffic_cache_ttl = 3600   # 1 hour
product_cache_ttl = 300    # 5 minutes

# Auth Service  
user_cache_ttl = 300       # 5 minutes
token_cache_ttl = 60       # 1 minute
permission_cache_ttl = 900 # 15 minutes

# Training Service
job_cache_ttl = 180        # 3 minutes
model_cache_ttl = 600      # 10 minutes
metrics_cache_ttl = 300    # 5 minutes

# And so on...

Circuit Breaker Configuration

CircuitBreakerConfig(
    failure_threshold=5,     # Failures before opening
    recovery_timeout=60,     # Seconds before testing recovery
    success_threshold=2,     # Successes needed to close
    timeout=30,              # Request timeout in seconds
)
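To show how these thresholds could drive the Closed, Open, and Half-Open states, here is a minimal state-machine sketch. `BreakerSettings` and `CircuitBreakerSketch` are stand-ins, not the real `CircuitBreakerConfig` or its breaker; counting consecutive failures and reopening on any half-open failure are assumptions.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class BreakerSettings:
    # Field names mirror the CircuitBreakerConfig arguments shown above;
    # the class itself is a stand-in, not the real one.
    failure_threshold: int = 5
    recovery_timeout: float = 60.0
    success_threshold: int = 2


class CircuitBreakerSketch:
    def __init__(self, settings: BreakerSettings,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.settings = settings
        self._clock = clock
        self.state = "closed"
        self._failures = 0
        self._successes = 0
        self._opened_at = 0.0

    def allow_request(self) -> bool:
        """Return True if a call may proceed; move open -> half_open after the timeout."""
        if self.state == "open":
            if self._clock() - self._opened_at >= self.settings.recovery_timeout:
                self.state = "half_open"
                self._successes = 0
                return True
            return False
        return True

    def record_success(self) -> None:
        if self.state == "half_open":
            self._successes += 1
            if self._successes >= self.settings.success_threshold:
                self.state = "closed"
                self._failures = 0
        else:
            self._failures = 0  # a success while closed resets the failure count

    def record_failure(self) -> None:
        if self.state == "half_open":
            self._trip()  # assumption: any failure during recovery reopens the circuit
            return
        self._failures += 1
        if self._failures >= self.settings.failure_threshold:
            self._trip()

    def _trip(self) -> None:
        self.state = "open"
        self._opened_at = self._clock()
        self._failures = 0
```

A caller would check `allow_request()` before each remote call and report the outcome with `record_success()` or `record_failure()`.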

Event System Integration

All enhanced clients integrate with the enhanced event system:

Event Types

  • EntityCreatedEvent - When entities are created
  • EntityUpdatedEvent - When entities are modified
  • EntityDeletedEvent - When entities are removed

Event Metadata

  • correlation_id - Track operations across services
  • source_service - Service that generated the event
  • destination_service - Target service
  • tenant_id - Tenant context
  • user_id - User context
  • tags - Additional metadata
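The metadata fields above could be modelled roughly as follows. This is a sketch only: the class names carry a `Sketch` suffix to mark them as hypothetical, and the field types, the defaults, and the `entity_type`/`entity_id` fields are assumptions rather than the actual event classes.

```python
import uuid
from dataclasses import asdict, dataclass, field
from typing import Dict, Optional


@dataclass
class EventMetadataSketch:
    source_service: str
    destination_service: str
    tenant_id: Optional[str] = None
    user_id: Optional[str] = None
    # Assumption: a fresh UUID is generated when the caller supplies no correlation ID.
    correlation_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    tags: Dict[str, str] = field(default_factory=dict)


@dataclass
class EntityCreatedEventSketch:
    entity_type: str
    entity_id: str
    metadata: EventMetadataSketch

    def to_dict(self) -> dict:
        """Flatten the event, including nested metadata, into a plain dict."""
        return {"event_type": "entity_created", **asdict(self)}
```

Passing the same `correlation_id` to every call in a workflow, as the usage examples do, is what lets an operation be traced across services.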

Usage in Enhanced Clients

Events are automatically published for:

  • Data uploads and modifications
  • User creation/updates/deletion
  • Training job lifecycle
  • Model deployments
  • Forecast creation
  • Tenant management operations
  • Notification delivery

Error Handling & Resilience

Circuit Breaker Protection

  • Automatically stops requests when services are failing
  • Provides fallback to cached data when available
  • Gradually tests service recovery

Retry Logic

  • Exponential backoff for transient failures
  • Configurable retry counts and delays
  • Authentication token refresh on 401 errors
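Exponential backoff, as listed above, can be sketched in a few lines. The function name, the `TransientError` marker, and the default counts and delays are hypothetical; the retry logic in BaseServiceClient may be structured differently, and token refresh on 401 is not shown.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 5xx responses)."""


async def retry_with_backoff(
    operation: Callable[[], Awaitable[T]],
    max_retries: int = 3,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Run operation, retrying transient failures with doubling delays."""
    attempt = 0
    while True:
        try:
            return await operation()
        except TransientError:
            if attempt >= max_retries:
                raise  # retries exhausted; surface the last error
            # Delay doubles each attempt: base, 2*base, 4*base, ... capped at max_delay.
            await sleep(min(base_delay * (2 ** attempt), max_delay))
            attempt += 1
```

The `sleep` parameter exists only so the delays can be observed or skipped in tests; production code would leave it at `asyncio.sleep`.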

Cache Fallbacks

  • Returns cached data when services are unavailable
  • Graceful degradation with stale data warnings
  • Manual cache invalidation for data consistency
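A stale-data fallback of the kind described above might look like the following sketch. The function name, the plain dict used as the last-known-value store, and the `(value, is_stale)` return shape are assumptions for illustration.

```python
import asyncio
import logging
from typing import Any, Awaitable, Callable, Dict, Tuple

logger = logging.getLogger("cache_fallback_sketch")


async def fetch_with_fallback(
    key: str,
    fetch: Callable[[], Awaitable[Any]],
    last_known: Dict[str, Any],
) -> Tuple[Any, bool]:
    """Return (value, is_stale). Falls back to the last known value if fetch fails."""
    try:
        value = await fetch()
    except Exception as exc:
        if key not in last_known:
            raise  # nothing cached, so the failure must reach the caller
        logger.warning("serving stale data for %s: %s", key, exc)
        return last_known[key], True
    last_known[key] = value  # remember the fresh value for future fallbacks
    return value, False
```

Returning the stale flag alongside the value lets callers decide whether degraded data is acceptable for their use case.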

Integration with Repository Pattern

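The enhanced clients are designed to be used alongside repositories in the service layer: clients handle calls to other services, and repositories handle database operations.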

Service Layer Integration

from datetime import datetime, timezone


class ForecastingService:
    def __init__(self,
                 forecast_repository: ForecastRepository,
                 service_registry: ServiceRegistry):
        self.forecast_repository = forecast_repository
        self.data_client = service_registry.get_data_client()
        self.training_client = service_registry.get_training_client()

    async def create_forecast(self, tenant_id: str, model_id: str):
        # Get data through enhanced client
        sales_data = await self.data_client.get_all_sales_data_with_monitoring(
            tenant_id=tenant_id,
            # Timezone-aware UTC timestamp; datetime.utcnow() is deprecated since Python 3.12
            correlation_id=f"forecast_data_{datetime.now(timezone.utc).isoformat()}"
        )
        
        # Use repository for database operations
        forecast = await self.forecast_repository.create({
            "tenant_id": tenant_id,
            "model_id": model_id,
            "status": "pending"
        })
        
        return forecast

Together, these clients give every service interaction the same resilience, monitoring, and event tracking, and they plug into the repository pattern architecture through the service layer.