API Gateway Service
Overview
The API Gateway serves as the centralized entry point for all client requests to the Bakery-IA platform. It provides a unified interface for 18+ microservices, handling authentication, rate limiting, request routing, and real-time event streaming. This service is critical for security, performance, and operational visibility across the entire system.
Key Features
Core Capabilities
- Centralized API Routing - Single entry point for all microservice endpoints, simplifying client integration
- JWT Authentication & Authorization - Token-based security with cached validation for performance
- Rate Limiting - 300 requests per minute per client to prevent abuse and ensure fair resource allocation
- Request ID Tracing - Distributed tracing with unique request IDs for debugging and observability
- Demo Mode Support - Special handling for demo accounts with isolated environments
- Subscription Management - Validates tenant subscription status before allowing operations
- Read-Only Mode Enforcement - Tenant-level write protection for billing or administrative purposes
- CORS Handling - Configurable cross-origin resource sharing for web clients
Real-Time Communication
- Server-Sent Events (SSE) - Real-time alert streaming to frontend dashboards
- WebSocket Proxy - Bidirectional communication for ML training progress updates
- Redis Pub/Sub Integration - Event broadcasting for multi-instance deployments
Observability & Monitoring
- Comprehensive Logging - Structured JSON logging with request/response details
- Prometheus Metrics - Request counters, duration histograms, error rates
- Health Check Aggregation - Monitors health of all downstream services
- Performance Tracking - Per-route performance metrics
External Integrations
- Nominatim Geocoding Proxy - OpenStreetMap geocoding for address validation
- Multi-Channel Notification Routing - Routes alerts to email, WhatsApp, and SSE channels
Technical Capabilities
Authentication Flow
- JWT Token Validation - Verifies access tokens with cached public key
- Token Refresh - Automatic refresh token handling
- User Context Injection - Attaches user and tenant information to requests
- Demo Account Detection - Identifies and isolates demo sessions
- Subscription Data Extraction - Extracts subscription tier from JWT payload (eliminates per-request HTTP calls)
Request Processing Pipeline
Client Request
↓
CORS Middleware
↓
Request ID Generation
↓
Logging Middleware (Pre-processing)
↓
Rate Limiting Check
↓
Authentication Middleware
↓
Subscription Validation
↓
Read-Only Mode Check
↓
Service Router (Proxy to Microservice)
↓
Response Logging (Post-processing)
↓
Client Response
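The ordering above can be read as a chain of steps in which any step may short-circuit with an error response before the request reaches a microservice. The following is a minimal sketch of that idea in plain Python, not the gateway's actual middleware code; the function names, the dict-based request, and the over_limit and token fields are hypothetical stand-ins.

```python
from typing import Callable, Optional

Request = dict
Response = dict
Step = Callable[[Request], Optional[Response]]


def run_pipeline(request: Request, steps: list[Step],
                 proxy: Callable[[Request], Response]) -> Response:
    """Run steps in order; the first step returning a response ends the chain."""
    for step in steps:
        early = step(request)
        if early is not None:
            return early
    return proxy(request)


def rate_limit_check(request: Request) -> Optional[Response]:
    # Hypothetical flag; a real check would consult a rate limiter.
    return {"status": 429} if request.get("over_limit") else None


def authenticate(request: Request) -> Optional[Response]:
    if not request.get("token"):
        return {"status": 401}
    # Placeholder for user context taken from verified JWT claims.
    request["user_id"] = "user-from-token"
    return None


def proxy_to_service(request: Request) -> Response:
    # Stand-in for forwarding the request to a downstream service.
    return {"status": 200, "user_id": request["user_id"]}


STEPS = [rate_limit_check, authenticate]
```

Placing the rate limiting check before authentication, as in the diagram, means an over-limit client is rejected without spending any work on token validation.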
Caching Strategy
- Token Validation Cache - 15-minute TTL for validated tokens (Redis)
- User Information Cache - Reduces auth service calls
- Health Check Cache - 30-second TTL for service health status
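To illustrate the TTL behaviour listed above, here is a small in-memory stand-in for the Redis-backed caches. It is a sketch only: the TTLCache class is hypothetical, and in the gateway itself expiry would presumably be handled by Redis key TTLs rather than by application code.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict[str, tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop the expired entry
            return None
        return value


# 15-minute TTL for validated tokens, 30-second TTL for health status
token_cache = TTLCache(ttl_seconds=15 * 60)
health_cache = TTLCache(ttl_seconds=30)
```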
Real-Time Event Streaming
- SSE Connection Management - Persistent connections for alert streaming
- Redis Pub/Sub - Scales SSE across multiple gateway instances
- Tenant-Isolated Channels - Each tenant receives only their alerts
- Reconnection Support - Clients can resume streams after disconnection
Business Value
For Bakery Owners
- Single API Endpoint - Simplifies integration with POS systems and external tools
- Real-Time Alerts - Instant notifications for low stock, quality issues, and production problems
- Secure Access - Enterprise-grade security protects sensitive business data
- Reliable Performance - Rate limiting and caching ensure consistent response times
- Faster Response Times - JWT-embedded subscription data eliminates 520ms overhead per request
Performance Impact
Before JWT Subscription Embedding:
- 5 synchronous HTTP calls per request to tenant-service
- 2,500ms notification endpoint latency
- 5,500ms subscription endpoint latency
- ~520ms overhead on EVERY tenant-scoped request
After JWT Subscription Embedding:
- Zero HTTP calls for subscription validation
- <1ms subscription check latency (JWT extraction only)
- ~200ms notification endpoint latency (92% improvement)
- ~100ms subscription endpoint latency (98% improvement)
- 100% reduction in tenant-service load for subscription checks
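The improvement percentages above follow directly from the before and after latencies. A short check of the arithmetic, treating the approximate "after" figures as exact (the helper name is illustrative only):

```python
def improvement_percent(before_ms: float, after_ms: float) -> float:
    """Relative latency reduction, as a percentage of the original latency."""
    return (before_ms - after_ms) * 100 / before_ms


notification = improvement_percent(2500, 200)  # 92.0
subscription = improvement_percent(5500, 100)  # about 98.2
```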
For Platform Operations
- Cost Efficiency - Caching reduces backend load by 60-70%
- Scalability - Horizontal scaling with stateless design
- Security - Centralized authentication reduces attack surface
- Observability - Complete request tracing for debugging and optimization
For Developers
- Simplified Integration - Single endpoint instead of 18+ service URLs
- Consistent Error Handling - Standardized error responses across all services
- API Documentation - Centralized OpenAPI/Swagger documentation
- Request Tracing - Easy debugging with request ID correlation
Technology Stack
- Framework: FastAPI (Python 3.11+) - Async web framework with automatic OpenAPI docs
- HTTP Client: HTTPX - Async HTTP client for service-to-service communication
- Caching: Redis 7.4 - Token cache, SSE pub/sub, rate limiting, token freshness tracking
- Logging: Structlog - Structured JSON logging for observability
- Metrics: Prometheus Client - Custom metrics for monitoring
- Authentication: JWT (JSON Web Tokens) - Token-based authentication with embedded subscription data
- WebSockets: FastAPI WebSocket support - Real-time training updates
JWT Subscription Architecture
Overview
The gateway implements a JWT-embedded subscription data architecture that eliminates runtime HTTP calls to the tenant-service for subscription validation. This provides significant performance improvements while maintaining security.
JWT Payload Structure
{
  "user_id": "uuid",
  "email": "user@example.com",
  "tenant_id": "uuid",
  "tenant_role": "owner",
  "subscription": {
    "tier": "professional",
    "status": "active",
    "valid_until": "2025-12-31T23:59:59Z"
  },
  "tenant_access": [
    {"id": "tenant-uuid", "role": "admin", "tier": "starter"}
  ],
  "exp": 1735689599,
  "iat": 1735687799,
  "iss": "bakery-auth"
}
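Given a payload of this shape, reading the subscription requires no network call. The sketch below assumes the claims dict has already passed signature verification; the Subscription dataclass and extract_subscription function are hypothetical names, not part of the gateway's documented API.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Subscription:
    tier: str
    status: str
    valid_until: datetime


def extract_subscription(claims: dict) -> Subscription:
    """Read subscription data from claims that were ALREADY signature-verified."""
    sub = claims["subscription"]
    # Normalise the trailing "Z" so fromisoformat accepts it on older Pythons.
    valid_until = datetime.fromisoformat(sub["valid_until"].replace("Z", "+00:00"))
    return Subscription(tier=sub["tier"], status=sub["status"], valid_until=valid_until)


claims = {
    "tenant_id": "uuid",
    "subscription": {
        "tier": "professional",
        "status": "active",
        "valid_until": "2025-12-31T23:59:59Z",
    },
}
subscription = extract_subscription(claims)
```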
Security Layers
The architecture implements defense-in-depth with multiple validation layers:
- Layer 1: JWT Signature Verification - Gateway validates JWT signature
- Layer 2: Subscription Data Extraction - Extracts subscription from verified JWT
- Layer 3: Token Freshness Check - Detects stale tokens after subscription changes
- Layer 4: Database Verification - For critical operations (optional)
- Layer 5: Audit Logging - Comprehensive logging for anomaly detection
Token Freshness Mechanism
- When a subscription changes, the gateway sets tenant:{tenant_id}:subscription_changed_at in Redis
- Gateway checks if token was issued before subscription change
- Stale tokens are rejected, forcing re-authentication
- Ensures users get fresh subscription data within token expiry window (15-30 min)
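The freshness comparison itself is small. The sketch below assumes the Redis value is stored as a Unix timestamp and that the caller has already fetched it (None meaning the key is absent); both helper names are hypothetical.

```python
from typing import Optional


def freshness_key(tenant_id: str) -> str:
    """Redis key holding the time of the tenant's last subscription change."""
    return f"tenant:{tenant_id}:subscription_changed_at"


def is_token_stale(token_iat: int, changed_at: Optional[float]) -> bool:
    """A token is stale if it was issued before the last subscription change.

    changed_at is the value read from Redis, assumed here to be a Unix
    timestamp; None means no change has been recorded for the tenant.
    """
    if changed_at is None:
        return False
    return token_iat < changed_at
```

A stale result would lead to the token being rejected, so the client re-authenticates and receives a token carrying the updated subscription.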
Multi-Tenant Support
- JWT contains tenant_access array with all accessible tenants
- Each tenant entry includes role and subscription tier
- Gateway validates access to requested tenant
- Supports hierarchical tenant access patterns
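A minimal sketch of the tenant access check, using the claim names from the payload structure. How the gateway actually combines the primary tenant with the tenant_access array is not specified here, so treating the primary tenant as always accessible is an assumption, as is the function name.

```python
from typing import Optional


def resolve_tenant_access(claims: dict, requested_tenant_id: str) -> Optional[dict]:
    """Return the role and tier for the requested tenant, or None if not allowed."""
    # Assumption: the token's primary tenant is always accessible.
    if claims.get("tenant_id") == requested_tenant_id:
        return {
            "id": requested_tenant_id,
            "role": claims.get("tenant_role"),
            "tier": claims.get("subscription", {}).get("tier"),
        }
    for entry in claims.get("tenant_access", []):
        if entry.get("id") == requested_tenant_id:
            return entry
    return None
```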
API Endpoints (Key Routes)
Authentication Routes
- POST /api/v1/auth/login - User login (returns access + refresh tokens)
- POST /api/v1/auth/register - User registration
- POST /api/v1/auth/refresh - Refresh access token
- POST /api/v1/auth/logout - User logout
Service Proxies (Protected Routes)
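All routes under /api/v1/ other than the authentication routes above (login and registration cannot require an existing token) are protected by JWT authentication: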
- /api/v1/sales/** → Sales Service
- /api/v1/forecasting/** → Forecasting Service
- /api/v1/training/** → Training Service
- /api/v1/inventory/** → Inventory Service
- /api/v1/production/** → Production Service
- /api/v1/recipes/** → Recipes Service
- /api/v1/orders/** → Orders Service
- /api/v1/suppliers/** → Suppliers Service
- /api/v1/procurement/** → Procurement Service
- /api/v1/pos/** → POS Service
- /api/v1/external/** → External Service
- /api/v1/notifications/** → Notification Service
- /api/v1/ai-insights/** → AI Insights Service
- /api/v1/orchestrator/** → Orchestrator Service
- /api/v1/tenants/** → Tenant Service
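The prefix-to-service mapping can be sketched as a lookup from the first path segment after /api/v1/ to the matching service URL variable. This is an illustration, not the gateway's router: resolve_upstream is a hypothetical helper, and forwarding the full path unchanged to the upstream is an assumption.

```python
from typing import Mapping, Optional

# First path segment after /api/v1/ -> environment variable with the service URL
ROUTES = {
    "sales": "SALES_SERVICE_URL",
    "forecasting": "FORECASTING_SERVICE_URL",
    "training": "TRAINING_SERVICE_URL",
    "inventory": "INVENTORY_SERVICE_URL",
    "production": "PRODUCTION_SERVICE_URL",
    "recipes": "RECIPES_SERVICE_URL",
    "orders": "ORDERS_SERVICE_URL",
    "suppliers": "SUPPLIERS_SERVICE_URL",
    "procurement": "PROCUREMENT_SERVICE_URL",
    "pos": "POS_SERVICE_URL",
    "external": "EXTERNAL_SERVICE_URL",
    "notifications": "NOTIFICATION_SERVICE_URL",
    "ai-insights": "AI_INSIGHTS_SERVICE_URL",
    "orchestrator": "ORCHESTRATOR_SERVICE_URL",
    "tenants": "TENANT_SERVICE_URL",
}

API_PREFIX = "/api/v1/"


def resolve_upstream(path: str, env: Mapping[str, str]) -> Optional[str]:
    """Map a gateway path to an upstream URL, or None if no service matches."""
    if not path.startswith(API_PREFIX):
        return None
    segment = path[len(API_PREFIX):].partition("/")[0]
    variable = ROUTES.get(segment)
    if variable is None or variable not in env:
        return None
    # Assumption: the path is forwarded to the upstream service unchanged.
    return env[variable].rstrip("/") + path
```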
Real-Time Routes
- GET /api/v1/alerts/stream - SSE alert stream (requires authentication)
- WS /api/v1/training/ws - WebSocket for training progress
Utility Routes
- GET /health - Gateway health check
- GET /api/v1/health - All services health status
- POST /api/v1/geocode - Nominatim geocoding proxy
Middleware Components
1. CORS Middleware
- Configurable allowed origins
- Credentials support
- Pre-flight request handling
2. Request ID Middleware
- Generates unique UUIDs for each request
- Propagates request IDs to downstream services
- Included in all log messages
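As a rough illustration of request ID generation and propagation, the helper below attaches a fresh UUID to the headers forwarded downstream. The header name X-Request-ID is an assumption (the source does not name the header), and with_request_id is a hypothetical function.

```python
import uuid

REQUEST_ID_HEADER = "X-Request-ID"  # assumed header name, not confirmed by the source


def with_request_id(headers: dict) -> dict:
    """Return a copy of the headers with a newly generated request ID attached."""
    forwarded = dict(headers)
    forwarded[REQUEST_ID_HEADER] = str(uuid.uuid4())
    return forwarded
```

The same value would then be bound into the logging context so that gateway and downstream log lines can be correlated.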
3. Logging Middleware
- Pre-request logging (method, path, headers)
- Post-request logging (status code, duration)
- Error logging with stack traces
4. Authentication Middleware
- JWT token extraction from Authorization header
- Token validation with cached results
- User/tenant context injection
- Demo account detection
- Subscription tier extraction from JWT - Eliminates 5 synchronous HTTP calls per request to tenant-service
- Token freshness verification - Detects stale tokens after subscription changes
5. Rate Limiting Middleware
- Token bucket algorithm
- 300 requests per minute per IP/user
- 429 Too Many Requests response on limit exceeded
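A token bucket with the documented limits (300 requests per 60 seconds) can be sketched as follows. This is a single-process illustration with an injectable clock; the gateway keeps rate limiting state in Redis, which this sketch does not model, and the class name is hypothetical.

```python
import time
from typing import Callable


class TokenBucket:
    """Token bucket: holds up to `capacity` tokens, refilled continuously."""

    def __init__(self, capacity: int = 300, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_rate = capacity / window_seconds  # tokens per second
        self._clock = clock
        self._tokens = float(capacity)
        self._updated = clock()

    def allow(self) -> bool:
        """Consume one token if available; False maps to a 429 response."""
        now = self._clock()
        elapsed = now - self._updated
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        self._updated = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False
```

One bucket would be kept per client key (IP or user). Because refill is continuous, a client that exhausts its bucket regains capacity gradually at five requests per second rather than all at once when a window resets.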
6. Subscription Middleware
- JWT-based subscription validation - Uses subscription data embedded in JWT tokens
- Zero HTTP calls for subscription checks - Subscription tier extracted from verified JWT
- Checks subscription expiry
- Allows grace period for expired subscriptions
- Defense-in-depth verification - Database verification for critical operations
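The expiry check with a grace period might look like the sketch below. The length of the grace period is not stated in this document, so the three-day default is purely a placeholder, and treating any status other than "active" as blocked is also an assumption.

```python
from datetime import datetime, timedelta, timezone


def subscription_allows(status: str, valid_until: datetime, now: datetime,
                        grace: timedelta = timedelta(days=3)) -> bool:
    """True if the subscription is active and within valid_until plus grace.

    The 3-day grace default is a placeholder; the real value is not documented.
    """
    if status != "active":
        return False
    return now <= valid_until + grace


valid_until = datetime(2025, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
```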
7. Read-Only Middleware
- Enforces tenant-level write restrictions
- Blocks POST/PUT/PATCH/DELETE when read-only mode enabled
- Used for billing holds or maintenance
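The read-only rule reduces to a method check. A minimal sketch, with a hypothetical helper name; how the tenant's read-only flag is stored and looked up is not covered here.

```python
WRITE_METHODS = frozenset({"POST", "PUT", "PATCH", "DELETE"})


def is_write_blocked(method: str, tenant_read_only: bool) -> bool:
    """True when a write request targets a tenant in read-only mode."""
    return tenant_read_only and method.upper() in WRITE_METHODS
```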
Metrics & Monitoring
Custom Prometheus Metrics
Request Metrics:
- gateway_requests_total - Counter (method, path, status_code)
- gateway_request_duration_seconds - Histogram (method, path)
- gateway_request_size_bytes - Histogram
- gateway_response_size_bytes - Histogram
Authentication Metrics:
- gateway_auth_attempts_total - Counter (status: success/failure)
- gateway_auth_cache_hits_total - Counter
- gateway_auth_cache_misses_total - Counter
Rate Limiting Metrics:
- gateway_rate_limit_exceeded_total - Counter (endpoint)
Service Health Metrics:
- gateway_service_health - Gauge (service_name, status: healthy/unhealthy)
Health Check Endpoint
GET /health returns:
{
  "status": "healthy",
  "version": "1.0.0",
  "services": {
    "auth": "healthy",
    "sales": "healthy",
    "forecasting": "healthy",
    ...
  },
  "redis": "connected",
  "timestamp": "2025-11-06T10:30:00Z"
}
Configuration
Environment Variables
Service Configuration:
- PORT - Gateway listening port (default: 8000)
- HOST - Gateway bind address (default: 0.0.0.0)
- ENVIRONMENT - Environment name (dev/staging/prod)
- LOG_LEVEL - Logging level (DEBUG/INFO/WARNING/ERROR)
Service URLs:
- AUTH_SERVICE_URL - Auth service internal URL
- SALES_SERVICE_URL - Sales service internal URL
- FORECASTING_SERVICE_URL - Forecasting service internal URL
- TRAINING_SERVICE_URL - Training service internal URL
- INVENTORY_SERVICE_URL - Inventory service internal URL
- PRODUCTION_SERVICE_URL - Production service internal URL
- RECIPES_SERVICE_URL - Recipes service internal URL
- ORDERS_SERVICE_URL - Orders service internal URL
- SUPPLIERS_SERVICE_URL - Suppliers service internal URL
- PROCUREMENT_SERVICE_URL - Procurement service internal URL
- POS_SERVICE_URL - POS service internal URL
- EXTERNAL_SERVICE_URL - External service internal URL
- NOTIFICATION_SERVICE_URL - Notification service internal URL
- AI_INSIGHTS_SERVICE_URL - AI Insights service internal URL
- ORCHESTRATOR_SERVICE_URL - Orchestrator service internal URL
- TENANT_SERVICE_URL - Tenant service internal URL
Redis Configuration:
- REDIS_HOST - Redis server host
- REDIS_PORT - Redis server port (default: 6379)
- REDIS_DB - Redis database number (default: 0)
- REDIS_PASSWORD - Redis authentication password (optional)
Security Configuration:
- JWT_PUBLIC_KEY - RSA public key for JWT verification
- JWT_ALGORITHM - JWT algorithm (default: RS256)
- RATE_LIMIT_REQUESTS - Max requests per window (default: 300)
- RATE_LIMIT_WINDOW_SECONDS - Rate limit window (default: 60)
CORS Configuration:
- CORS_ORIGINS - Comma-separated allowed origins
- CORS_ALLOW_CREDENTIALS - Allow credentials (default: true)
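A sketch of how these variables and their documented defaults could be loaded into a typed settings object. GatewaySettings and load_settings are hypothetical names; the gateway may well use a settings library instead. REDIS_HOST is treated as required here only because no default is documented.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class GatewaySettings:
    host: str
    port: int
    redis_host: str
    redis_port: int
    redis_db: int
    jwt_algorithm: str
    rate_limit_requests: int
    rate_limit_window_seconds: int
    cors_origins: tuple
    cors_allow_credentials: bool


def load_settings(env: Mapping[str, str]) -> GatewaySettings:
    """Build settings from an environment mapping, applying documented defaults."""
    origins = tuple(
        origin.strip()
        for origin in env.get("CORS_ORIGINS", "").split(",")
        if origin.strip()
    )
    return GatewaySettings(
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8000")),
        redis_host=env["REDIS_HOST"],  # no documented default, so required here
        redis_port=int(env.get("REDIS_PORT", "6379")),
        redis_db=int(env.get("REDIS_DB", "0")),
        jwt_algorithm=env.get("JWT_ALGORITHM", "RS256"),
        rate_limit_requests=int(env.get("RATE_LIMIT_REQUESTS", "300")),
        rate_limit_window_seconds=int(env.get("RATE_LIMIT_WINDOW_SECONDS", "60")),
        cors_origins=origins,
        cors_allow_credentials=env.get("CORS_ALLOW_CREDENTIALS", "true").lower() == "true",
    )
```

In practice the function would be called with os.environ; passing a plain mapping keeps it easy to test.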
Events & Messaging
Consumed Events (Redis Pub/Sub)
- Channel: alerts:tenant:{tenant_id}
- Event: Alert notifications for SSE streaming
- Format: JSON with alert_id, severity, message, timestamp
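To show how a consumed alert could be relayed to a browser, the sketch below builds the tenant channel name and serializes an alert as a Server-Sent Events frame. The event name "alert" and the use of alert_id as the SSE id field are assumptions; both helper names are hypothetical.

```python
import json


def alert_channel(tenant_id: str) -> str:
    """Redis pub/sub channel carrying alerts for a single tenant."""
    return f"alerts:tenant:{tenant_id}"


def to_sse_frame(alert: dict) -> str:
    """Serialize an alert as one SSE frame (id, event, data, blank line)."""
    # json.dumps escapes newlines inside strings, so the payload stays on one line.
    data = json.dumps(alert, separators=(",", ":"))
    return f"id: {alert['alert_id']}\nevent: alert\ndata: {data}\n\n"


alert = {
    "alert_id": "a1",
    "severity": "high",
    "message": "Low stock",
    "timestamp": "2025-11-06T10:30:00Z",
}
frame = to_sse_frame(alert)
```

Setting the id field lets a reconnecting browser send Last-Event-ID, which is one way the gateway could support resuming a stream after a disconnect.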
Published Events
The gateway does not publish events directly but forwards events from downstream services.
Development Setup
Prerequisites
- Python 3.11+
- Redis 7.4+
- Access to all microservices (locally or via network)
Local Development
# Install dependencies
cd gateway
pip install -r requirements.txt
# Set environment variables
export AUTH_SERVICE_URL=http://localhost:8001
export SALES_SERVICE_URL=http://localhost:8002
export REDIS_HOST=localhost
export JWT_PUBLIC_KEY="$(cat ../keys/jwt_public.pem)"
# Run the gateway
python main.py
Docker Development
# Build image
docker build -t bakery-ia-gateway .
# Run container
docker run -p 8000:8000 \
-e AUTH_SERVICE_URL=http://auth:8001 \
-e REDIS_HOST=redis \
bakery-ia-gateway
Testing
# Unit tests
pytest tests/unit/
# Integration tests
pytest tests/integration/
# Load testing
locust -f tests/load/locustfile.py
Integration Points
Dependencies (Services Called)
- Auth Service - User authentication and token validation
- All Microservices - Proxies requests to 18+ downstream services
- Redis - Caching, rate limiting, SSE pub/sub
- Nominatim - External geocoding service
Dependents (Services That Call This)
- Frontend Dashboard - All API calls go through the gateway
- Mobile Apps (future) - Will use gateway as single endpoint
- External Integrations - Third-party systems use gateway API
- Monitoring Tools - Prometheus scrapes /metrics endpoint
Security Measures
Authentication & Authorization
- JWT Token Validation - RSA-based signature verification
- Token Expiry Checks - Rejects expired tokens
- Refresh Token Rotation - Secure token refresh flow
- Demo Account Isolation - Separate demo environments
Attack Prevention
- Rate Limiting - Mitigates brute-force attempts and request floods from individual clients
- Input Validation - Pydantic schema validation on all inputs
- CORS Restrictions - Only allowed origins can access API
- Request Size Limits - Prevents payload-based attacks
- SQL Injection Prevention - All downstream services use parameterized queries
- XSS Prevention - Response sanitization
Data Protection
- HTTPS Only (Production) - Encrypted in transit
- Tenant Isolation - Requests scoped to authenticated tenant
- Read-Only Mode - Prevents unauthorized data modifications
- Audit Logging - All requests logged for security audits
Performance Optimization
Caching Strategy
- Token Validation Cache - 95%+ cache hit rate reduces auth service load
- User Info Cache - Reduces database queries by 80%
- Service Health Cache - Prevents health check storms
Connection Pooling
- HTTPX Connection Pool - Reuses HTTP connections to services
- Redis Connection Pool - Efficient Redis connection management
Async I/O
- FastAPI Async - Non-blocking request handling
- Concurrent Service Calls - Multiple microservice requests in parallel
- Async Middleware - Non-blocking middleware chain
Compliance & Standards
GDPR Compliance
- Request Logging - Can be anonymized or deleted per user request
- Data Minimization - Only essential data logged
- Right to Access - Logs can be exported for data subject access requests
API Standards
- RESTful API Design - Standard HTTP methods and status codes
- OpenAPI 3.0 - Automatic API documentation via FastAPI
- JSON API - Consistent JSON request/response format
- Error Handling - RFC 7807 Problem Details for HTTP APIs
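For reference, an RFC 7807 error body carries the standard members type, title, status, and detail, and is served with the application/problem+json media type. The helper below is a sketch of building such a body; the function name and the request_id extension member are assumptions, not the gateway's confirmed error schema.

```python
from typing import Optional

PROBLEM_CONTENT_TYPE = "application/problem+json"


def problem_details(status: int, title: str, detail: str,
                    request_id: Optional[str] = None) -> dict:
    """Build an RFC 7807 problem details body."""
    body = {
        "type": "about:blank",  # RFC 7807 default when no specific type URI exists
        "title": title,
        "status": status,
        "detail": detail,
    }
    if request_id is not None:
        body["request_id"] = request_id  # extension member (assumed name)
    return body


rate_limited = problem_details(
    429, "Too Many Requests", "Rate limit of 300 requests per minute exceeded."
)
```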
Observability Standards
- Structured Logging - JSON logs with consistent schema
- Distributed Tracing - Request ID propagation
- Prometheus Metrics - Industry-standard metrics format
Scalability
Horizontal Scaling
- Stateless Design - No local state, scales horizontally
- Load Balancing - Kubernetes service load balancing
- Redis Shared State - Shared cache and pub/sub across instances
Performance Characteristics
- Throughput: 1,000+ requests/second per instance
- Latency: <10ms median (excluding downstream service time)
- Concurrent Connections: 10,000+ with async I/O
- SSE Connections: 1,000+ per instance
Troubleshooting
Common Issues
Issue: 401 Unauthorized responses
- Cause: Invalid or expired JWT token
- Solution: Refresh token or re-login
Issue: 429 Too Many Requests
- Cause: Rate limit exceeded
- Solution: Wait 60 seconds or optimize request patterns
Issue: 503 Service Unavailable
- Cause: Downstream service is down
- Solution: Check service health endpoint, restart affected service
Issue: SSE connection drops
- Cause: Network timeout or gateway restart
- Solution: Implement client-side reconnection logic
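For the SSE reconnection case, a client typically waits a little longer after each failed attempt. The generator below sketches a capped exponential backoff schedule; the base delay, growth factor, cap, and attempt count are illustrative values, not gateway requirements.

```python
from typing import Iterator


def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   cap: float = 30.0, attempts: int = 6) -> Iterator[float]:
    """Yield the wait in seconds before each reconnection attempt."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor


delays = list(backoff_delays())  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```

A client would sleep for each delay in turn before retrying the connection, and reset the schedule once a connection succeeds.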
Debug Mode
Enable detailed logging:
export LOG_LEVEL=DEBUG
export STRUCTLOG_PRETTY_PRINT=true
Competitive Advantages
- Single Entry Point - Simplifies integration compared to direct microservice access
- Built-in Security - Enterprise-grade authentication and rate limiting
- Real-Time Capabilities - SSE and WebSocket support for live updates
- Observable - Complete request tracing and metrics out-of-the-box
- Scalable - Stateless design allows horizontal scaling by adding gateway instances
- Multi-Tenant Ready - Tenant isolation at the gateway level
Future Enhancements
- GraphQL Support - Alternative query interface alongside REST
- API Versioning - Support multiple API versions simultaneously
- Request Transformation - Protocol translation (REST to gRPC)
- Advanced Rate Limiting - Per-tenant, per-endpoint limits
- API Key Management - Alternative authentication for M2M integrations
- Circuit Breaker - Automatic service failure handling
- Request Replay - Debugging tool for request replay
For VUE Madrid Business Plan: The API Gateway demonstrates enterprise-grade architecture with scalability, security, and observability built-in from day one. This infrastructure supports thousands of concurrent bakery clients with consistent performance and reliability, making Bakery-IA a production-ready SaaS platform for the Spanish bakery market.