REFACTOR - Database logic
390
shared/clients/README.md
Normal file
@@ -0,0 +1,390 @@
# Enhanced Inter-Service Communication System

This directory contains the enhanced inter-service communication system that integrates with the new repository pattern architecture. The system provides circuit breakers, caching, monitoring, and event tracking for all service-to-service communications.

## Architecture Overview

### Base Components

1. **BaseServiceClient** - Foundation class providing authentication, retries, and basic HTTP operations
2. **EnhancedServiceClient** - Adds circuit breaker, caching, and monitoring capabilities
3. **ServiceRegistry** - Central registry for managing all enhanced service clients

### Enhanced Service Clients

Each service has a specialized enhanced client:

- **EnhancedDataServiceClient** - Sales data, weather, traffic, products with optimized caching
- **EnhancedAuthServiceClient** - Authentication, user management, permissions with security focus
- **EnhancedTrainingServiceClient** - ML training, model management, deployment with pipeline monitoring
- **EnhancedForecastingServiceClient** - Forecasting, predictions, scenarios with analytics
- **EnhancedTenantServiceClient** - Tenant management, memberships, organization features
- **EnhancedNotificationServiceClient** - Notifications, templates, delivery tracking

## Key Features

### Circuit Breaker Pattern
- **States**: Closed (normal), Open (failing), Half-Open (testing recovery)
- **Configuration**: Failure threshold, recovery timeout, success threshold
- **Monitoring**: State changes tracked and logged

### Intelligent Caching
- **TTL-based**: Different cache durations for different data types
- **Invalidation**: Pattern-based cache invalidation on updates
- **Statistics**: Hit/miss ratios and performance metrics
- **Manual Control**: Clear specific cache patterns when needed
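
The caching behaviour listed above can be illustrated with a small in-memory sketch. This is not the code used by the enhanced clients: the `TTLCache` class, its method names, the `prefix:tenant:period` key layout, and the glob-style pattern syntax are all assumptions made for illustration.

```python
import fnmatch
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical cache with per-entry TTL, glob invalidation, and hit/miss counters."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def set(self, key: str, value: Any, ttl: float) -> None:
        """Store a value that expires `ttl` seconds from now."""
        self._entries[key] = (self._clock() + ttl, value)

    def get(self, key: str) -> Optional[Any]:
        """Return the cached value, or None if it is missing or expired."""
        entry = self._entries.get(key)
        if entry is None or entry[0] <= self._clock():
            self._entries.pop(key, None)  # drop expired entries lazily
            self.misses += 1
            return None
        self.hits += 1
        return entry[1]

    def invalidate(self, pattern: str) -> int:
        """Remove every key matching a glob pattern; return how many were removed."""
        doomed = [k for k in self._entries if fnmatch.fnmatchcase(k, pattern)]
        for key in doomed:
            del self._entries[key]
        return len(doomed)

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = TTLCache()
cache.set("sales:tenant-123:2024-01", [120, 95], ttl=600)     # 10 minutes
cache.set("weather:tenant-123:2024-01", [11.5], ttl=1800)     # 30 minutes
cache.invalidate("sales:tenant-123:*")  # e.g. after a sales upload for this tenant
```

Keying entries by data type and tenant is what makes pattern-based invalidation cheap: an upload only has to clear the keys under one prefix and leaves unrelated entries warm.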

### Event Integration
- **Repository Events**: Entity created/updated/deleted events
- **Correlation IDs**: Track operations across services
- **Metadata**: Rich event metadata for debugging and monitoring

### Monitoring & Metrics
- **Request Metrics**: Success/failure rates, latencies
- **Cache Metrics**: Hit rates, entry counts
- **Circuit Breaker Metrics**: State changes, failure counts
- **Health Checks**: Per-service and aggregate health status
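
As a rough illustration of the request metrics above, the following sketch tracks success and failure counts together with latencies. The `RequestMetrics` class and its field names are hypothetical and only show the kind of bookkeeping involved; the real clients may collect these figures differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RequestMetrics:
    """Hypothetical per-client counters for request outcomes and latencies."""

    successes: int = 0
    failures: int = 0
    latencies: List[float] = field(default_factory=list)

    def record(self, ok: bool, latency_s: float) -> None:
        """Register one finished request and how long it took, in seconds."""
        if ok:
            self.successes += 1
        else:
            self.failures += 1
        self.latencies.append(latency_s)

    @property
    def success_rate(self) -> float:
        total = self.successes + self.failures
        return self.successes / total if total else 0.0

    @property
    def avg_latency(self) -> float:
        return sum(self.latencies) / len(self.latencies) if self.latencies else 0.0

    def snapshot(self) -> Dict[str, float]:
        """Return a plain dict suitable for a health-check payload."""
        return {
            "total_requests": self.successes + self.failures,
            "success_rate": self.success_rate,
            "avg_latency_s": self.avg_latency,
        }


metrics = RequestMetrics()
metrics.record(ok=True, latency_s=0.12)
metrics.record(ok=False, latency_s=0.30)
report = metrics.snapshot()
```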

## Usage Examples

### Basic Usage with Service Registry

```python
from shared.clients.enhanced_service_client import ServiceRegistry
from shared.config.base import BaseServiceSettings

# Initialize registry
config = BaseServiceSettings()
registry = ServiceRegistry(config, calling_service="forecasting")

# Get enhanced clients
data_client = registry.get_data_client()
auth_client = registry.get_auth_client()
training_client = registry.get_training_client()

# Use with full features
# (awaited calls must run inside a coroutine, e.g. one started with asyncio.run)
sales_data = await data_client.get_all_sales_data_with_monitoring(
    tenant_id="tenant-123",
    start_date="2024-01-01",
    end_date="2024-12-31",
    correlation_id="forecast-job-456"
)
```

### Data Service Operations

```python
# Get sales data with intelligent caching
sales_data = await data_client.get_sales_data_cached(
    tenant_id="tenant-123",
    start_date="2024-01-01",
    end_date="2024-01-31",
    aggregation="daily"
)

# Upload sales data with cache invalidation and events
result = await data_client.upload_sales_data_with_events(
    tenant_id="tenant-123",
    sales_data=sales_records,
    correlation_id="data-import-789"
)

# Get weather data with caching (30 min TTL)
weather_data = await data_client.get_weather_historical_cached(
    tenant_id="tenant-123",
    start_date="2024-01-01",
    end_date="2024-01-31"
)
```

### Authentication & User Management

```python
# Authenticate with security monitoring
auth_result = await auth_client.authenticate_user_cached(
    email="user@example.com",
    password="password"
)

# Check permissions with caching
has_access = await auth_client.check_user_permissions_cached(
    user_id="user-123",
    tenant_id="tenant-456",
    resource="sales_data",
    action="read"
)

# Create user with events
user = await auth_client.create_user_with_events(
    user_data={
        "email": "new@example.com",
        "name": "New User",
        "role": "analyst"
    },
    tenant_id="tenant-123",
    correlation_id="user-creation-789"
)
```

### Training & ML Operations

```python
# Create training job with monitoring
job = await training_client.create_training_job_with_monitoring(
    tenant_id="tenant-123",
    include_weather=True,
    include_traffic=False,
    min_data_points=30,
    correlation_id="training-pipeline-456"
)

# Get active model with caching
model = await training_client.get_active_model_for_product_cached(
    tenant_id="tenant-123",
    product_name="croissants"
)

# Deploy model with events
deployment = await training_client.deploy_model_with_events(
    tenant_id="tenant-123",
    model_id="model-789",
    correlation_id="deployment-123"
)

# Get pipeline status
status = await training_client.get_training_pipeline_status("tenant-123")
```

### Forecasting & Predictions

```python
# forecasting_client is an EnhancedForecastingServiceClient instance

# Create forecast with monitoring
forecast = await forecasting_client.create_forecast_with_monitoring(
    tenant_id="tenant-123",
    model_id="model-456",
    start_date="2024-02-01",
    end_date="2024-02-29",
    correlation_id="forecast-creation-789"
)

# Get predictions with caching
predictions = await forecasting_client.get_predictions_cached(
    tenant_id="tenant-123",
    forecast_id="forecast-456",
    start_date="2024-02-01",
    end_date="2024-02-07"
)

# Real-time prediction with monitoring
prediction = await forecasting_client.create_realtime_prediction_with_monitoring(
    tenant_id="tenant-123",
    model_id="model-456",
    target_date="2024-02-01",
    features={"temperature": 20, "day_of_week": 1},
    correlation_id="realtime-pred-123"
)

# Get forecasting dashboard
dashboard = await forecasting_client.get_forecasting_dashboard("tenant-123")
```

### Tenant Management

```python
# tenant_client is an EnhancedTenantServiceClient instance

# Create tenant with monitoring
tenant = await tenant_client.create_tenant_with_monitoring(
    name="New Bakery Chain",
    owner_id="user-123",
    description="Multi-location bakery chain",
    correlation_id="tenant-creation-456"
)

# Add member with events
membership = await tenant_client.add_tenant_member_with_events(
    tenant_id="tenant-123",
    user_id="user-456",
    role="manager",
    correlation_id="member-add-789"
)

# Get tenant analytics
analytics = await tenant_client.get_tenant_analytics("tenant-123")
```

### Notification Management

```python
# notification_client is an EnhancedNotificationServiceClient instance

# Send notification with monitoring
notification = await notification_client.send_notification_with_monitoring(
    recipient_id="user-123",
    notification_type="forecast_ready",
    title="Forecast Complete",
    message="Your weekly forecast is ready for review",
    tenant_id="tenant-456",
    priority="high",
    channels=["email", "in_app"],
    correlation_id="forecast-notification-789"
)

# Send bulk notification
bulk_result = await notification_client.send_bulk_notification_with_monitoring(
    recipients=["user-123", "user-456", "user-789"],
    notification_type="system_update",
    title="System Maintenance",
    message="Scheduled maintenance tonight at 2 AM",
    priority="normal",
    correlation_id="maintenance-notification-123"
)

# Get delivery analytics
analytics = await notification_client.get_delivery_analytics(
    tenant_id="tenant-123",
    start_date="2024-01-01",
    end_date="2024-01-31"
)
```

## Health Monitoring

### Individual Service Health

```python
# Get specific service health
data_health = data_client.get_data_service_health()
auth_health = auth_client.get_auth_service_health()
training_health = training_client.get_training_service_health()

# Health includes:
# - Circuit breaker status
# - Cache statistics and configuration
# - Service-specific features
# - Supported endpoints
```

### Registry-Level Health

```python
# Get all service health status
all_health = registry.get_all_health_status()

# Get aggregate metrics
metrics = registry.get_aggregate_metrics()
# Returns:
# - Total cache hits/misses and hit rate
# - Circuit breaker states for all services
# - Count of healthy vs total services
```

## Configuration

### Cache TTL Configuration

Each enhanced client has optimized cache TTL values:

```python
# Data Service
sales_cache_ttl = 600       # 10 minutes
weather_cache_ttl = 1800    # 30 minutes
traffic_cache_ttl = 3600    # 1 hour
product_cache_ttl = 300     # 5 minutes

# Auth Service
user_cache_ttl = 300        # 5 minutes
token_cache_ttl = 60        # 1 minute
permission_cache_ttl = 900  # 15 minutes

# Training Service
job_cache_ttl = 180         # 3 minutes
model_cache_ttl = 600       # 10 minutes
metrics_cache_ttl = 300     # 5 minutes

# And so on...
```

### Circuit Breaker Configuration

```python
CircuitBreakerConfig(
    failure_threshold=5,   # Failures before opening
    recovery_timeout=60,   # Seconds before testing recovery
    success_threshold=2,   # Successes needed to close
    timeout=30             # Request timeout in seconds
)
```
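
To make the role of these parameters concrete, here is a minimal sketch of the Closed, Open, and Half-Open state machine they control. It is an illustration under assumptions, not the implementation behind `CircuitBreakerConfig`: the `CircuitBreaker` and `State` names and their methods are hypothetical, and the per-request `timeout` is left out.

```python
import time
from enum import Enum
from typing import Callable


class State(Enum):
    CLOSED = "closed"        # normal operation
    OPEN = "open"            # failing: requests are rejected
    HALF_OPEN = "half_open"  # testing recovery with trial requests


class CircuitBreaker:
    """Hypothetical breaker driven by the three thresholds shown above."""

    def __init__(
        self,
        failure_threshold: int = 5,
        recovery_timeout: float = 60.0,
        success_threshold: int = 2,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.success_threshold = success_threshold
        self._clock = clock  # injectable for tests
        self.state = State.CLOSED
        self._failures = 0
        self._successes = 0
        self._opened_at = 0.0

    def allow_request(self) -> bool:
        """Return True if a call may go through; move Open -> Half-Open after the timeout."""
        if self.state is State.OPEN:
            if self._clock() - self._opened_at >= self.recovery_timeout:
                self.state = State.HALF_OPEN
                self._successes = 0
                return True
            return False
        return True

    def record_success(self) -> None:
        if self.state is State.HALF_OPEN:
            self._successes += 1
            if self._successes >= self.success_threshold:
                self.state = State.CLOSED
                self._failures = 0
        else:
            self._failures = 0  # a success in Closed state resets the failure streak

    def record_failure(self) -> None:
        if self.state is State.HALF_OPEN:
            self._trip()  # any failure during recovery testing reopens the breaker
            return
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._trip()

    def _trip(self) -> None:
        self.state = State.OPEN
        self._opened_at = self._clock()
        self._failures = 0
```

A caller would check `allow_request()` before each outbound call and report the outcome with `record_success()` or `record_failure()`. Counting consecutive failures, as this sketch does, is only one possible policy; a sliding-window failure rate is a common alternative.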

## Event System Integration

All enhanced clients integrate with the enhanced event system:

### Event Types
- **EntityCreatedEvent** - When entities are created
- **EntityUpdatedEvent** - When entities are modified
- **EntityDeletedEvent** - When entities are removed

### Event Metadata
- **correlation_id** - Track operations across services
- **source_service** - Service that generated the event
- **destination_service** - Target service
- **tenant_id** - Tenant context
- **user_id** - User context
- **tags** - Additional metadata
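
One way to picture how the metadata fields above travel with an event is the dataclass sketch below. The field names come from the list above, but the class layout, the defaults, the `entity_type` and `entity_id` fields, and the `to_dict` helper are assumptions; the actual event classes may be structured differently.

```python
import uuid
from dataclasses import asdict, dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class EventMetadata:
    """Hypothetical container for the metadata fields listed above."""

    source_service: str
    destination_service: str
    tenant_id: Optional[str] = None
    user_id: Optional[str] = None
    # Assumption: generate a fresh correlation ID when the caller does not pass one.
    correlation_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    tags: Dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class EntityCreatedEvent:
    """Hypothetical shape of a 'created' event; the real class may differ."""

    entity_type: str
    entity_id: str
    metadata: EventMetadata

    def to_dict(self) -> dict:
        """Flatten the event (including nested metadata) for logging or publishing."""
        return {"event_type": type(self).__name__, **asdict(self)}


event = EntityCreatedEvent(
    entity_type="forecast",
    entity_id="forecast-456",
    metadata=EventMetadata(
        source_service="forecasting",
        destination_service="data",
        tenant_id="tenant-123",
        correlation_id="forecast-job-456",
    ),
)
payload = event.to_dict()
```

Reusing the caller's `correlation_id` rather than generating a new one at each hop is what allows a single operation to be traced across services.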

### Usage in Enhanced Clients
Events are automatically published for:
- Data uploads and modifications
- User creation/updates/deletion
- Training job lifecycle
- Model deployments
- Forecast creation
- Tenant management operations
- Notification delivery

## Error Handling & Resilience

### Circuit Breaker Protection
- Automatically stops requests when services are failing
- Provides fallback to cached data when available
- Gradually tests service recovery

### Retry Logic
- Exponential backoff for transient failures
- Configurable retry counts and delays
- Authentication token refresh on 401 errors
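
The exponential backoff described above can be sketched as follows. The `retry_with_backoff` helper, its parameter names, and its default values are hypothetical, and token refresh on 401 responses is not modelled here.

```python
import asyncio
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def retry_with_backoff(
    operation: Callable[[], Awaitable[T]],
    max_retries: int = 3,
    base_delay: float = 0.5,
    max_delay: float = 10.0,
    retry_on: Tuple[Type[Exception], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Call `operation`, retrying transient failures with exponentially growing delays."""
    attempt = 0
    while True:
        try:
            return await operation()
        except retry_on:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error to the caller
            # The delay doubles on each retry (0.5s, 1s, 2s, ...), capped at max_delay.
            await sleep(min(max_delay, base_delay * (2 ** attempt)))
            attempt += 1


async def _demo() -> str:
    calls = {"n": 0}

    async def flaky() -> str:
        # Fails twice with a transient error, then succeeds.
        calls["n"] += 1
        if calls["n"] < 3:
            raise ConnectionError("transient failure")
        return "ok"

    return await retry_with_backoff(flaky, base_delay=0.01)


demo_result = asyncio.run(_demo())
```

Only exceptions listed in `retry_on` are retried, so non-transient errors such as validation failures propagate immediately. Production retry loops often add random jitter to each delay so that many clients do not retry in lockstep.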

### Cache Fallbacks
- Returns cached data when services are unavailable
- Graceful degradation with stale data warnings
- Manual cache invalidation for data consistency
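
A minimal sketch of the fallback behaviour above, assuming a plain dict of last known values and a hypothetical `ServiceUnavailable` exception; the enhanced clients may detect outages and flag stale data in a different way.

```python
import asyncio
import logging
from typing import Any, Awaitable, Callable, Dict, Tuple

logger = logging.getLogger(__name__)


class ServiceUnavailable(Exception):
    """Hypothetical error raised when the upstream service cannot be reached."""


async def fetch_with_fallback(
    key: str,
    fetch: Callable[[], Awaitable[Any]],
    last_known: Dict[str, Any],
) -> Tuple[Any, bool]:
    """Return (value, is_stale). Falls back to the last known value on outage."""
    try:
        value = await fetch()
    except ServiceUnavailable:
        if key in last_known:
            # Graceful degradation: serve stale data, but say so in the logs.
            logger.warning("Serving stale data for %s: upstream unavailable", key)
            return last_known[key], True
        raise  # nothing cached, so the caller has to handle the outage
    last_known[key] = value  # refresh the fallback copy on every success
    return value, False


async def _demo() -> Tuple[Any, bool]:
    store: Dict[str, Any] = {}

    async def healthy() -> dict:
        return {"total": 10}

    async def down() -> dict:
        raise ServiceUnavailable("data service down")

    await fetch_with_fallback("sales:tenant-123", healthy, store)
    return await fetch_with_fallback("sales:tenant-123", down, store)


stale_result = asyncio.run(_demo())
```

Returning the staleness flag alongside the value lets the caller decide whether stale data is acceptable for its use case instead of hiding the degradation.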

## Integration with Repository Pattern

The enhanced clients integrate with the new repository pattern:

### Service Layer Integration
```python
from datetime import datetime, timezone


class ForecastingService:
    def __init__(self,
                 forecast_repository: ForecastRepository,
                 service_registry: ServiceRegistry):
        self.forecast_repository = forecast_repository
        self.data_client = service_registry.get_data_client()
        self.training_client = service_registry.get_training_client()

    async def create_forecast(self, tenant_id: str, model_id: str):
        # Get data through enhanced client
        # (timezone-aware UTC timestamp; datetime.utcnow() is deprecated since Python 3.12)
        sales_data = await self.data_client.get_all_sales_data_with_monitoring(
            tenant_id=tenant_id,
            correlation_id=f"forecast_data_{datetime.now(timezone.utc).isoformat()}"
        )

        # Use repository for database operations
        forecast = await self.forecast_repository.create({
            "tenant_id": tenant_id,
            "model_id": model_id,
            "status": "pending"
        })

        return forecast
```

Together, these components form an inter-service communication layer that integrates with the new repository pattern architecture and adds resilience, monitoring, and event tracking to all service interactions.