diff --git a/HEALTH_CHECKS.md b/HEALTH_CHECKS.md
deleted file mode 100644
index 60f9a946..00000000
--- a/HEALTH_CHECKS.md
+++ /dev/null
@@ -1,281 +0,0 @@
-# Unified Health Check System
-
-This document describes the unified health check system implemented across all microservices in the bakery-ia platform.
-
-## Overview
-
-The unified health check system provides standardized health monitoring endpoints across all services, with comprehensive database verification, Kubernetes integration, and detailed health reporting.
-
-## Key Features
-
-- **Standardized Endpoints**: All services now provide the same health check endpoints
-- **Database Verification**: Comprehensive database health checks including table existence verification
-- **Kubernetes Integration**: Proper separation of liveness and readiness probes
-- **Detailed Reporting**: Rich health status information for debugging and monitoring
-- **App State Integration**: Health checks automatically detect service ready state
-
-## Health Check Endpoints
-
-### `/health` - Basic Health Check
-- **Purpose**: Basic service health status
-- **Use Case**: General health monitoring, API gateways
-- **Response**: Service name, version, status, and timestamp
-- **Status Codes**: 200 (healthy/starting)
-
-### `/health/ready` - Kubernetes Readiness Probe
-- **Purpose**: Indicates if service is ready to receive traffic
-- **Use Case**: Kubernetes readiness probe, load balancer health checks
-- **Checks**: Application state, database connectivity, table verification, custom checks
-- **Status Codes**: 200 (ready), 503 (not ready)
-
-### `/health/live` - Kubernetes Liveness Probe
-- **Purpose**: Indicates if service is alive and should not be restarted
-- **Use Case**: Kubernetes liveness probe
-- **Response**: Simple alive status
-- **Status Codes**: 200 (alive)
-
-### `/health/database` - Detailed Database Health
-- **Purpose**: Comprehensive database health information for debugging
-- **Use Case**: Database monitoring, troubleshooting
-- **Checks**: Connectivity, table existence, connection pool status, response times
-- **Status Codes**: 200 (healthy), 503 (unhealthy)
-
-## Implementation
-
-### Services Updated
-
-The following services have been updated to use the unified health check system:
-
-1. **Training Service** (`training-service`)
-   - Full implementation with database manager integration
-   - Table verification for ML training tables
-   - Expected tables: `model_training_logs`, `trained_models`, `model_performance_metrics`, `training_job_queue`, `model_artifacts`
-
-2. **Orders Service** (`orders-service`)
-   - Legacy database integration with custom health checks
-   - Expected tables: `customers`, `customer_contacts`, `customer_orders`, `order_items`, `order_status_history`, `procurement_plans`, `procurement_requirements`
-
-3. **Inventory Service** (`inventory-service`)
-   - Full database manager integration
-   - Food safety and inventory table verification
-   - Expected tables: `ingredients`, `stock`, `stock_movements`, `product_transformations`, `stock_alerts`, `food_safety_compliance`, `temperature_logs`, `food_safety_alerts`
-
-### Code Integration
-
-#### Basic Setup
-```python
-from shared.monitoring.health_checks import setup_fastapi_health_checks
-
-# Setup unified health checks
-health_manager = setup_fastapi_health_checks(
-    app=app,
-    service_name="my-service",
-    version="1.0.0",
-    database_manager=database_manager,
-    expected_tables=['table1', 'table2'],
-    custom_checks={"custom_check": custom_check_function}
-)
-```
-
-#### With Custom Checks
-```python
-async def custom_health_check():
-    """Custom health check function"""
-    return await some_service_check()
-
-health_manager = setup_fastapi_health_checks(
-    app=app,
-    service_name="my-service",
-    version="1.0.0",
-    database_manager=database_manager,
-    expected_tables=['table1', 'table2'],
-    custom_checks={"external_service": custom_health_check}
-)
-```
-
-#### Service Ready State
-```python
-# In your lifespan function
-async def lifespan(app: FastAPI):
-    # Startup logic
-    await initialize_service()
-
-    # Mark service as ready
-    app.state.ready = True
-
-    yield
-
-    # Shutdown logic
-```
-
-## Kubernetes Configuration
-
-### Updated Probe Configuration
-
-The microservice template and specific service configurations have been updated to use the new endpoints:
-
-```yaml
-livenessProbe:
-  httpGet:
-    path: /health/live
-    port: 8000
-  initialDelaySeconds: 30
-  timeoutSeconds: 5
-  periodSeconds: 10
-  failureThreshold: 3
-
-readinessProbe:
-  httpGet:
-    path: /health/ready
-    port: 8000
-  initialDelaySeconds: 15
-  timeoutSeconds: 3
-  periodSeconds: 5
-  failureThreshold: 5
-```
-
-### Key Changes from Previous Configuration
-
-1. **Liveness Probe**: Now uses `/health/live` instead of `/health`
-2. **Readiness Probe**: Now uses `/health/ready` instead of `/health`
-3. **Improved Timing**: Adjusted timeouts and failure thresholds for better reliability
-4. **Separate Concerns**: Liveness and readiness are now properly separated
-
-## Health Check Response Examples
-
-### Basic Health Check Response
-```json
-{
-  "status": "healthy",
-  "service": "training-service",
-  "version": "1.0.0",
-  "timestamp": "2025-01-27T10:30:00Z"
-}
-```
-
-### Readiness Check Response (Ready)
-```json
-{
-  "status": "ready",
-  "checks": {
-    "application": true,
-    "database_connectivity": true,
-    "database_tables": true
-  },
-  "database": {
-    "status": "healthy",
-    "tables_verified": ["model_training_logs", "trained_models"],
-    "missing_tables": [],
-    "errors": []
-  }
-}
-```
-
-### Database Health Response
-```json
-{
-  "status": "healthy",
-  "connectivity": true,
-  "tables_exist": true,
-  "tables_verified": ["model_training_logs", "trained_models"],
-  "missing_tables": [],
-  "errors": [],
-  "connection_info": {
-    "service_name": "training-service",
-    "database_type": "postgresql",
-    "pool_size": 20,
-    "current_checked_out": 2
-  },
-  "response_time_ms": 15.23
-}
-```
-
-## Testing
-
-### Manual Testing
-```bash
-# Test all endpoints for a running service
-curl http://localhost:8000/health
-curl http://localhost:8000/health/ready
-curl http://localhost:8000/health/live
-curl http://localhost:8000/health/database
-```
-
-### Automated Testing
-Use the provided test script:
-```bash
-python test_unified_health_checks.py
-```
-
-## Migration Guide
-
-### For Existing Services
-
-1. **Add Health Check Import**:
-   ```python
-   from shared.monitoring.health_checks import setup_fastapi_health_checks
-   ```
-
-2. **Add Database Manager Import** (if using shared database):
-   ```python
-   from app.core.database import database_manager
-   ```
-
-3. **Setup Health Checks** (after app creation, before router inclusion):
-   ```python
-   health_manager = setup_fastapi_health_checks(
-       app=app,
-       service_name="your-service-name",
-       version=settings.VERSION,
-       database_manager=database_manager,
-       expected_tables=["table1", "table2"]
-   )
-   ```
-
-4. **Remove Old Health Endpoints**:
-   Remove any existing `@app.get("/health")` endpoints
-
-5. **Add Ready State Management**:
-   ```python
-   # In lifespan function after successful startup
-   app.state.ready = True
-   ```
-
-6. **Update Kubernetes Configuration**:
-   Update deployment YAML to use new probe endpoints
-
-### For Services Using Legacy Database
-
-If your service doesn't use the shared database manager:
-
-```python
-async def legacy_database_check():
-    """Custom health check for legacy database"""
-    return await your_db_health_check()
-
-health_manager = setup_fastapi_health_checks(
-    app=app,
-    service_name="your-service",
-    version=settings.VERSION,
-    database_manager=None,
-    expected_tables=None,
-    custom_checks={"legacy_database": legacy_database_check}
-)
-```
-
-## Benefits
-
-1. **Consistency**: All services now provide the same health check interface
-2. **Better Kubernetes Integration**: Proper separation of liveness and readiness concerns
-3. **Enhanced Debugging**: Detailed health information for troubleshooting
-4. **Database Verification**: Comprehensive database health checks including table verification
-5. **Monitoring Ready**: Rich health status information for monitoring systems
-6. **Maintainability**: Centralized health check logic reduces code duplication
-
-## Future Enhancements
-
-1. **Metrics Integration**: Add Prometheus metrics for health check performance
-2. **Circuit Breaker**: Implement circuit breaker pattern for external service checks
-3. **Health Check Dependencies**: Add dependency health checks between services
-4. **Performance Thresholds**: Add configurable performance thresholds for health checks
-5. **Health Check Scheduling**: Add scheduled background health checks
\ No newline at end of file
diff --git a/scripts/seed_orders_test_data.sh b/scripts/seed_orders_test_data.sh
deleted file mode 100755
index f6058a46..00000000
--- a/scripts/seed_orders_test_data.sh
+++ /dev/null
@@ -1,27 +0,0 @@
-#!/bin/bash
-
-# Script to seed the orders database with test data
-set -e
-
-echo "🌱 Seeding Orders Database with Test Data"
-echo "========================================="
-
-# Change to the orders service directory
-cd services/orders
-
-# Make sure we're in a virtual environment or have the dependencies
-echo "📦 Setting up environment..."
-
-# Run the seeding script
-echo "🚀 Running seeding script..."
-python scripts/seed_test_data.py
-
-echo "✅ Database seeding completed!"
-echo ""
-echo "🎯 Test data created:"
-echo " - 6 customers (including VIP, wholesale, and inactive)"
-echo " - 25 orders in various statuses"
-echo " - Order items with different products"
-echo " - Order status history"
-echo ""
-echo "📋 You can now test the frontend with real data!"
\ No newline at end of file
diff --git a/test_alert_quick.sh b/tests/test_alert_quick.sh
similarity index 100%
rename from test_alert_quick.sh
rename to tests/test_alert_quick.sh
diff --git a/test_alert_working.py b/tests/test_alert_working.py
similarity index 100%
rename from test_alert_working.py
rename to tests/test_alert_working.py
diff --git a/test_out_of_stock_alert.py b/tests/test_out_of_stock_alert.py
similarity index 100%
rename from test_out_of_stock_alert.py
rename to tests/test_out_of_stock_alert.py
diff --git a/test_recommendation_alert.py b/tests/test_recommendation_alert.py
similarity index 100%
rename from test_recommendation_alert.py
rename to tests/test_recommendation_alert.py