Bakery IA Kubernetes Configuration
This directory contains Kubernetes manifests for deploying the Bakery IA platform in local development and production environments with HTTPS support using cert-manager and NGINX ingress.
Quick Start
Deploy the entire platform in four steps:
# 1. Start Colima with adequate resources
colima start --cpu 6 --memory 12 --disk 120 --runtime docker --profile k8s-local
# 2. Create Kind cluster with permanent localhost access
kind create cluster --config kind-config.yaml
# 3. Install NGINX Ingress Controller
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/kind/deploy.yaml
kubectl wait --namespace ingress-nginx --for=condition=ready pod --selector=app.kubernetes.io/component=controller --timeout=300s
# 4. Deploy with Tilt
tilt up
# 🎉 Access at: http://localhost (or see Tilt for individual service ports)
Note: The kind-config.yaml already configures port mappings (30080→80, 30443→443) for localhost access, so no additional service patching is needed. The NGINX Ingress for Kind uses NodePort by default on those exact ports.
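To make the port mappings in the note concrete, here is a minimal sketch of what such a Kind configuration could look like. It is an illustration only: the repository's actual kind-config.yaml is not shown here and may differ, the file name kind-config.example.yaml is hypothetical, and the cluster name is assumed from the cleanup command later in this README.

```shell
# Write an illustrative Kind config to a scratch file (hypothetical name,
# chosen so the repository's real kind-config.yaml is not overwritten).
cat > kind-config.example.yaml <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: bakery-ia-local          # assumed from the cleanup command in this README
nodes:
  - role: control-plane
    extraPortMappings:
      - containerPort: 30080   # NodePort inside the Kind node
        hostPort: 80           # port exposed on localhost
        protocol: TCP
      - containerPort: 30443
        hostPort: 443
        protocol: TCP
EOF
```

With mappings like these, traffic to localhost:80 and localhost:443 reaches the node ports directly, which is why no port forwarding is needed.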
Prerequisites
Install the following tools on macOS:
# Install via Homebrew
brew install colima kind kubectl skaffold tilt-dev/tap/tilt
# Verify installations
colima version && kind version && kubectl version --client && skaffold version && tilt version
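Before running the Quick Start, it can help to confirm that every CLI it relies on is on the PATH. The following is a minimal sketch; the function name missing_tools is hypothetical, and the tool list is taken from the commands used in this README.

```shell
# Print the name of every given command that is not found on PATH.
# (missing_tools is a hypothetical helper, not part of the repository.)
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
}

# Tools referenced in this README; prints nothing when all are installed.
missing_tools colima kind kubectl tilt skaffold
```

Unlike the chained version check, this lists every missing tool at once instead of stopping at the first one.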
Directory Structure
infrastructure/kubernetes/
├── base/                          # Base Kubernetes resources
│   ├── namespace.yaml             # Namespace definition
│   ├── configmap.yaml             # Shared configuration
│   ├── secrets.yaml               # Base64 encoded secrets
│   ├── ingress-https.yaml         # HTTPS ingress rules
│   ├── kustomization.yaml         # Base kustomization
│   └── components/                # Individual component manifests
│       ├── cert-manager/          # Certificate management
│       ├── auth/                  # Authentication service
│       ├── tenant/                # Tenant management
│       ├── training/              # ML training service
│       ├── forecasting/           # Demand forecasting
│       ├── sales/                 # Sales management
│       ├── external/              # External API service
│       ├── notification/          # Notification service
│       ├── inventory/             # Inventory management
│       ├── recipes/               # Recipe management
│       ├── suppliers/             # Supplier management
│       ├── pos/                   # Point of sale
│       ├── orders/                # Order management
│       ├── production/            # Production planning
│       ├── alert-processor/       # Alert processing
│       ├── frontend/              # React frontend
│       ├── databases/             # Database deployments
│       └── infrastructure/        # Gateway & monitoring
└── overlays/
    └── dev/                       # Development environment
        ├── kustomization.yaml     # Dev-specific configuration
        └── dev-patches.yaml       # Development patches
Access URLs
Primary Access (Standard Web Ports)
- Frontend: https://localhost
- API Gateway: https://localhost/api
Named Host Access (Optional)
Add to /etc/hosts for named access:
echo "127.0.0.1 bakery-ia.local" | sudo tee -a /etc/hosts
echo "127.0.0.1 api.bakery-ia.local" | sudo tee -a /etc/hosts
echo "127.0.0.1 monitoring.bakery-ia.local" | sudo tee -a /etc/hosts
Then access via:
- Frontend: https://bakery-ia.local
- API: https://api.bakery-ia.local
- Monitoring: https://monitoring.bakery-ia.local
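The echo commands above append a new line every time they run, so repeated setup leaves duplicate entries in /etc/hosts. A minimal sketch of an idempotent alternative follows; add_host_entry and the HOSTS_FILE variable are hypothetical names introduced for illustration.

```shell
# Append "127.0.0.1 <hostname>" to a hosts file only if that exact hostname
# is not already listed on a non-comment line.
# HOSTS_FILE defaults to /etc/hosts; point it at a scratch file to experiment.
# (add_host_entry is a hypothetical helper, not part of the repository.)
add_host_entry() {
  hostname="$1"
  hosts_file="${HOSTS_FILE:-/etc/hosts}"
  if awk -v h="$hostname" '
       !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
       END { exit !found }' "$hosts_file" 2>/dev/null; then
    echo "already present: $hostname"
  elif [ -w "$hosts_file" ]; then
    echo "127.0.0.1 $hostname" | tee -a "$hosts_file" >/dev/null
    echo "added: $hostname"
  else
    # /etc/hosts is normally root-owned, so fall back to sudo.
    echo "127.0.0.1 $hostname" | sudo tee -a "$hosts_file" >/dev/null
    echo "added: $hostname"
  fi
}
```

Calling add_host_entry once for each of bakery-ia.local, api.bakery-ia.local, and monitoring.bakery-ia.local has the same effect as the three echo commands, but is safe to repeat. The comparison is per field, so bakery-ia.local is not mistaken for api.bakery-ia.local.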
Direct Service Access (Development)
- Frontend: http://localhost:3000
- Gateway: http://localhost:8000
Development Workflow
Start Development Environment
# Start development mode with hot-reload using Tilt
tilt up
# Or stream all resource logs to the terminal (still runs in the foreground)
tilt up --stream
Key Features
- ✅ Hot-reload development - Automatic rebuilds on code changes
- ✅ Permanent localhost access - No port forwarding needed
- ✅ HTTPS by default - Local CA certificates for secure development
- ✅ Microservices architecture - All services deployed together
- ✅ Database management - PostgreSQL, Redis, and RabbitMQ included
Monitor and Debug
# Check all resources
kubectl get all -n bakery-ia
# View logs
kubectl logs -n bakery-ia deployment/auth-service -f
# Check ingress status
kubectl get ingress -n bakery-ia
# Debug certificate issues
kubectl describe certificate bakery-ia-tls-cert -n bakery-ia
Certificate Management
The platform uses cert-manager for automatic HTTPS certificate generation:
- Local CA: For development (default)
- Let's Encrypt Staging: For testing
- Let's Encrypt Production: For production deployments
Trust Local Certificates
# Export CA certificate
kubectl get secret local-ca-key-pair -n cert-manager -o jsonpath='{.data.tls\.crt}' | base64 -d > bakery-ia-ca.crt
# Trust in macOS
open bakery-ia-ca.crt
# In Keychain Access, set "bakery-ia-local-ca" to "Always Trust"
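If the secret name or namespace is wrong, the export pipeline above still creates bakery-ia-ca.crt, just empty. A quick sanity check before opening the file can catch that. This is a sketch only; is_pem_certificate is a hypothetical helper, and it checks the PEM header rather than validating the certificate itself.

```shell
# Succeed only if the file is non-empty and starts with a PEM certificate header.
# (is_pem_certificate is a hypothetical helper, not part of the repository.)
is_pem_certificate() {
  cert_file="$1"
  [ -s "$cert_file" ] && head -n 1 "$cert_file" | grep -q '^-----BEGIN CERTIFICATE-----'
}
```

For example, `is_pem_certificate bakery-ia-ca.crt && open bakery-ia-ca.crt` opens Keychain Access only when the export produced something that looks like a certificate.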
Configuration Management
Secrets
Base64-encoded secrets are stored in base/secrets.yaml. Base64 is an encoding, not encryption, so for production:
- Use external secret management (HashiCorp Vault, AWS Secrets Manager)
- Never commit real secrets to version control
# Encode secrets
echo -n "your-secret-value" | base64
# Decode secrets
echo "eW91ci1zZWNyZXQtdmFsdWU=" | base64 -d
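The -n flag in the encode command matters: without it, echo appends a newline that becomes part of the secret. A small sketch of round-trip helpers that sidestep the issue by using printf; encode_secret and decode_secret are hypothetical names, not part of the repository.

```shell
# Encode a value without a trailing newline (printf '%s' never appends one).
# tr strips the line wrapping that some base64 implementations add to long output.
encode_secret() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Decode a Base64 string back to the original value.
decode_secret() {
  printf '%s' "$1" | base64 --decode
}

encoded="$(encode_secret 'your-secret-value')"
echo "$encoded"                   # eW91ci1zZWNyZXQtdmFsdWU=
decode_secret "$encoded"; echo    # your-secret-value
```

The trailing `echo` after decode_secret only adds a newline for display; the decoded value itself has none.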
Environment Configuration
Development-specific settings are in overlays/dev/:
- Resource limits: Reduced for local development
- Image pull policy: Never (for local images)
- Debug settings: Enabled
- CORS: Configured for localhost
Scaling and Resource Management
Scale Services
# Scale individual service
kubectl scale -n bakery-ia deployment/auth-service --replicas=3
# Or update kustomization.yaml replicas section
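When several services need scaling at once, the single command above can be wrapped in a loop. The sketch below prints the kubectl commands by default and only executes them when APPLY=1 is set; scale_services and APPLY are hypothetical names, and the bakery-ia namespace is taken from the commands in this README.

```shell
# Print (or, with APPLY=1, run) one "kubectl scale" command per service=replicas pair.
# (scale_services is a hypothetical helper, not part of the repository.)
scale_services() {
  namespace="bakery-ia"
  for pair in "$@"; do
    service="${pair%%=*}"
    replicas="${pair#*=}"
    # Reject anything that is not a non-negative integer.
    case "$replicas" in
      ''|*[!0-9]*) echo "invalid replica count: $pair" >&2; return 1 ;;
    esac
    scale_cmd="kubectl scale -n $namespace deployment/$service --replicas=$replicas"
    if [ "${APPLY:-0}" = "1" ]; then
      $scale_cmd
    else
      echo "$scale_cmd"
    fi
  done
}
```

For example, `scale_services auth-service=3 sales-service=2` prints two commands for review, and `APPLY=1 scale_services auth-service=3` runs the scale operation. The name sales-service is illustrative; use the deployment names that `kubectl get deployments -n bakery-ia` reports.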
Resource Configuration
Development environment uses minimal resources:
- Databases: 64Mi-256Mi memory, 25m-200m CPU
- Services: 64Mi-256Mi memory, 25m-200m CPU
- Training Service: 256Mi-1Gi memory (ML workloads)
Troubleshooting
Common Issues
- Images not found
# Build images with Skaffold
skaffold build --profile=dev
- Database corruption after restart
# Delete corrupted PVC and restart
kubectl delete pod -n bakery-ia -l app.kubernetes.io/name=inventory-db
kubectl delete pvc -n bakery-ia inventory-db-pvc
- HTTPS certificate not issued
# Check cert-manager logs
kubectl logs -n cert-manager deployment/cert-manager
kubectl describe certificate bakery-ia-tls-cert -n bakery-ia
- Port conflicts
# Check what's using ports 80/443
sudo lsof -i :80 -i :443
Debug Commands
# Get cluster events
kubectl get events -n bakery-ia --sort-by='.firstTimestamp'
# Resource usage (requires metrics-server in the cluster)
kubectl top pods -n bakery-ia
kubectl top nodes
# Execute in pod
kubectl exec -n bakery-ia -it <pod-name> -- bash
Cleanup
Quick Cleanup
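# If started with Tilt: stop it with Ctrl+C, then remove its resources
tilt down
# If started with Skaffold: stop it with Ctrl+C, then remove its resources
skaffold delete --profile=dev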
Complete Cleanup
# Delete everything
kubectl delete namespace bakery-ia
kind delete cluster --name bakery-ia-local
colima stop --profile k8s-local
Restart Sequence
# Startup after a machine restart (or use the kubernetes_restart.sh script)
colima start --cpu 6 --memory 12 --disk 120 --runtime docker --profile k8s-local
kind create cluster --config kind-config.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/kind/deploy.yaml
kubectl wait --namespace ingress-nginx --for=condition=ready pod --selector=app.kubernetes.io/component=controller --timeout=300s
tilt up
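Run as a plain list, the sequence above keeps going even if an earlier step fails. The sketch below shows one way to stop at the first failure. It is not the contents of kubernetes_restart.sh, which is not shown in this README; run_step and restart_cluster are hypothetical names.

```shell
# Echo a command, run it, and report failure to the caller.
run_step() {
  echo "==> $*"
  "$@" || { echo "step failed: $*" >&2; return 1; }
}

# The restart sequence from this README, stopping at the first failing step.
restart_cluster() {
  run_step colima start --cpu 6 --memory 12 --disk 120 --runtime docker --profile k8s-local &&
  run_step kind create cluster --config kind-config.yaml &&
  run_step kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/kind/deploy.yaml &&
  run_step kubectl wait --namespace ingress-nginx --for=condition=ready pod \
    --selector=app.kubernetes.io/component=controller --timeout=300s &&
  run_step tilt up
}

# Defining the functions has no side effects; call restart_cluster to run the sequence.
```

Because each step is chained with `&&`, a failed `colima start` prevents the later Kind and kubectl commands from running against a cluster that does not exist.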
Production Deployment
Production URLs
The production environment uses the following domains:
- Main Application: https://bakewise.ai
  - Frontend application and all public pages
  - API endpoints: https://bakewise.ai/api/v1/...
- Monitoring Stack: https://monitoring.bakewise.ai
  - Grafana: https://monitoring.bakewise.ai/grafana
  - Prometheus: https://monitoring.bakewise.ai/prometheus
  - Jaeger: https://monitoring.bakewise.ai/jaeger
  - AlertManager: https://monitoring.bakewise.ai/alertmanager
Production Configuration
The production overlay (overlays/prod/) includes:
- Domain Configuration: bakewise.ai with Let's Encrypt certificates
- High Availability: Multi-replica deployments (2-3 replicas per service)
- Enhanced Security: Rate limiting, CORS restrictions, security headers
- Monitoring: Full observability stack with Prometheus, Grafana, Jaeger
Production Considerations
For production deployment:
- Security: Implement RBAC, network policies, pod security standards
- Monitoring: Deploy Prometheus, Grafana, and alerting
- Backup: Database backup strategies
- High Availability: Multi-replica deployments with anti-affinity
- External Secrets: Use managed secret services
- TLS: Production Let's Encrypt certificates
- CI/CD: Automated deployment pipelines
- DNS: Configure DNS A/CNAME records pointing to your cluster's load balancer
Next Steps
- Add comprehensive monitoring and logging
- Implement automated testing
- Set up CI/CD pipelines
- Add health checks and metrics endpoints
- Implement proper backup strategies