# Bakery-IA Pilot Launch Guide

**Complete guide for deploying to production for a 10-tenant pilot program**

**Last Updated:** 2026-01-11
**Target Environment:** clouding.io VPS with MicroK8s
**Estimated Cost:** €41-81/month
**Time to Deploy:** 3-5 hours (first time, including fixes)
**Status:** ⚠️ REQUIRES PRE-DEPLOYMENT FIXES - See [Production VPS Deployment Fixes](../PRODUCTION_VPS_DEPLOYMENT_FIXES.md)

---

## Table of Contents

1. [Executive Summary](#executive-summary)
2. [⚠️ CRITICAL: Pre-Deployment Configuration](#critical-pre-deployment-configuration)
3. [Pre-Launch Checklist](#pre-launch-checklist)
4. [VPS Provisioning](#vps-provisioning)
5. [Infrastructure Setup](#infrastructure-setup)
6. [Domain & DNS Configuration](#domain--dns-configuration)
7. [TLS/SSL Certificates](#tlsssl-certificates)
8. [Email & Communication Setup](#email--communication-setup)
9. [Kubernetes Deployment](#kubernetes-deployment)
10. [Configuration & Secrets](#configuration--secrets)
11. [Database Migrations](#database-migrations)
12. [Verification & Testing](#verification--testing)
13. [Post-Deployment](#post-deployment)

---

## Executive Summary

### What You're Deploying

A complete multi-tenant SaaS platform with:

- **18 microservices** (auth, tenant, ML forecasting, inventory, sales, orders, etc.)
- **18 PostgreSQL databases** with TLS encryption
- **Redis cache** with TLS
- **RabbitMQ** message broker
- **Monitoring stack** (Prometheus, Grafana, AlertManager)
- **Full security** (TLS, RBAC, audit logging)

### Total Cost Breakdown

| Service | Provider | Monthly Cost |
|---------|----------|-------------|
| VPS Server (20GB RAM, 8 vCPU, 200GB SSD) | clouding.io | €40-80 |
| Domain | Namecheap/Cloudflare | €1.25 (€15/year) |
| Email | Zoho Free / Gmail | €0 |
| WhatsApp API | Meta Business | €0 (1k free conversations) |
| DNS | Cloudflare | €0 |
| SSL | Let's Encrypt | €0 |
| **TOTAL** | | **€41-81/month** |

### Timeline

| Phase | Duration | Description |
|-------|----------|-------------|
| Pre-Launch Setup | 1-2 hours | Domain, VPS provisioning, accounts setup |
| Infrastructure Setup | 1 hour | MicroK8s installation, firewall config |
| Deployment | 30-60 min | Deploy all services and databases |
| Verification | 30-60 min | Test everything works |
| **Total** | **3-5 hours** | First-time deployment |

---

## ⚠️ CRITICAL: Pre-Deployment Configuration

**READ THIS FIRST:** The Kubernetes configuration requires updates for secure production deployment.

### 🔴 Configuration Status

Your manifests need the following updates before deploying to production:

### Required Configuration Changes

#### 1. Remove imagePullSecrets (BLOCKING)

**Why:** Images are public and don't require authentication
**Impact if skipped:** All pods fail with ImagePullBackOff

#### 2. Update Image Tags to Semantic Versions (BLOCKING)

**Why:** Using 'latest' causes non-deterministic deployments
**Impact if skipped:** Unpredictable behavior, impossible rollbacks

#### 3. Fix SigNoz Namespace References (BLOCKING) - ✅ **ALREADY FIXED**

**Why:** SigNoz must be in bakery-ia namespace
**Impact if skipped:** Kustomize apply fails
**Status:** ✅ Fixed in latest commit

#### 4. Generate Production Secrets (HIGH PRIORITY)

**Why:** Default secrets are placeholders and insecure
**Impact if skipped:** CRITICAL security vulnerability

#### 5. Update Cert-Manager Email (HIGH PRIORITY)

**Why:** Receive Let's Encrypt renewal notifications
**Impact if skipped:** Won't receive SSL expiry warnings

### ✅ Already Correct (No Changes Needed)

- **Storage Class** - `microk8s-hostpath` is correct for MicroK8s
- **Domain Names** - `bakewise.ai` is your production domain
- **Service Types** - ClusterIP + Ingress is correct architecture
- **Network Policies** - Not required for single-namespace deployment
- **SigNoz Namespace** - ✅ Fixed to use bakery-ia namespace

### Step-by-Step Configuration Script

Run these commands on your **local machine** before deployment:

```bash
# Navigate to repository root
cd /path/to/bakery-ia

# ========================================
# STEP 1: Remove imagePullSecrets
# ========================================
echo "Step 1: Removing imagePullSecrets..."
chmod +x infrastructure/kubernetes/remove-imagepullsecrets.sh
./infrastructure/kubernetes/remove-imagepullsecrets.sh

# Verify removal
grep -r "imagePullSecrets" infrastructure/kubernetes/base/ && \
  echo "⚠️ WARNING: Some files still have imagePullSecrets" || \
  echo "✅ imagePullSecrets removed"

# ========================================
# STEP 2: Update Image Tags
# ========================================
echo -e "\nStep 2: Updating image tags..."
export VERSION="1.0.0"  # Change this to your version
sed -i.bak "s/newTag: latest/newTag: v${VERSION}/g" infrastructure/kubernetes/overlays/prod/kustomization.yaml

# Verify no 'latest' tags remain
grep "newTag:" infrastructure/kubernetes/overlays/prod/kustomization.yaml | grep "latest" && \
  echo "⚠️ WARNING: Some images still use 'latest'" || \
  echo "✅ All images now use version v${VERSION}"

# ========================================
# STEP 3: Generate Production Secrets
# ========================================
echo -e "\nStep 3: Generating production secrets..."
echo "Copy these values to infrastructure/kubernetes/base/secrets.yaml"
echo "================================================================"

# JWT and API secrets
echo -e "\n### JWT and API Keys ###"
export JWT_SECRET=$(openssl rand -base64 32)
export JWT_REFRESH_SECRET=$(openssl rand -base64 32)
export SERVICE_API_KEY=$(openssl rand -hex 32)

echo "JWT_SECRET_KEY: $(echo -n "$JWT_SECRET" | base64)"
echo "JWT_REFRESH_SECRET_KEY: $(echo -n "$JWT_REFRESH_SECRET" | base64)"
echo "SERVICE_API_KEY: $(echo -n "$SERVICE_API_KEY" | base64)"

# Database passwords
echo -e "\n### Database Passwords ###"
for db in auth tenant inventory sales orders procurement forecasting analytics notification monitoring users products recipes stock menu demo_session orchestrator cleanup; do
  password=$(openssl rand -base64 24)
  # Uppercase with tr: ${db^^} needs bash 4+ and fails on the bash 3.2 shipped with macOS
  key=$(echo "${db}_DB_PASSWORD" | tr '[:lower:]' '[:upper:]')
  echo "${key}: $(echo -n "$password" | base64)"
done

echo -e "\n================================================================"
echo "⚠️ SAVE THESE SECRETS SECURELY!"
echo "Update infrastructure/kubernetes/base/secrets.yaml with the values above"
echo "Press Enter when you've updated secrets.yaml..."
read

# ========================================
# STEP 4: Update Cert-Manager Email
# ========================================
echo -e "\nStep 4: Updating cert-manager email..."
sed -i.bak 's/admin@bakery-ia.local/admin@bakewise.ai/g' \
  infrastructure/kubernetes/base/components/cert-manager/cluster-issuer-production.yaml

grep "admin@bakewise.ai" infrastructure/kubernetes/base/components/cert-manager/cluster-issuer-production.yaml && \
  echo "✅ Cert-manager email updated" || \
  echo "⚠️ WARNING: Email not updated"

# ========================================
# FINAL VALIDATION
# ========================================
echo -e "\n========================================"
echo "Pre-Deployment Configuration Complete!"
echo "========================================"
echo ""
echo "Validation Checklist:"
echo " ✅ imagePullSecrets removed"
echo " ✅ Image tags updated to v${VERSION}"
echo " ✅ SigNoz namespace fixed (bakery-ia)"
echo " ⚠️ Production secrets updated in secrets.yaml (manual verification required)"
echo " ✅ Cert-manager email updated"
echo ""
echo "Next: Copy manifests to VPS and begin deployment"
```

### Manual Verification

After running the script above:

1. **Verify secrets.yaml updated:**

   ```bash
   # Check that JWT_SECRET_KEY is not the placeholder
   grep "JWT_SECRET_KEY" infrastructure/kubernetes/base/secrets.yaml
   # Should NOT show the old placeholder value
   ```

2. **Check image tags:**

   ```bash
   grep "newTag:" infrastructure/kubernetes/overlays/prod/kustomization.yaml
   # All should show v1.0.0 (or your version), NOT 'latest'
   ```

3. **Verify SigNoz namespace:**

   ```bash
   grep -A 3 "name: signoz" infrastructure/kubernetes/overlays/prod/kustomization.yaml
   # All should show: namespace: bakery-ia
   ```

**⏱️ Estimated Time:** 30-45 minutes

---

## Pre-Launch Checklist

### Required Accounts & Services

- [ ] **Domain Name**
  - Register at Namecheap or Cloudflare (€10-15/year)
  - Suggested: `bakeryforecast.es` or `bakery-ia.com`
- [ ] **VPS Account**
  - Sign up at [clouding.io](https://www.clouding.io)
  - Payment method configured
- [ ] **Email Service** (Choose ONE)
  - Option A: Zoho Mail FREE (recommended for full send/receive)
  - Option B: Gmail SMTP + domain forwarding
  - Option C: Google Workspace (14-day free trial, then €5.75/month)
- [ ] **WhatsApp Business API**
  - Create Meta Business Account (free)
  - Verify business identity
  - Phone number ready (non-VoIP)
- [ ] **DNS Access**
  - Cloudflare account (free, recommended)
  - Or domain registrar DNS panel access
- [ ] **Container Registry** (Choose ONE)
  - Option A: Docker Hub account (recommended)
  - Option B: GitHub Container Registry
  - Option C: MicroK8s built-in registry

### Required Tools on Local Machine

```bash
# Verify you have these installed:
kubectl version --client
docker --version
git --version
ssh -V
openssl version

# Install if missing (macOS):
brew install kubectl docker git openssh openssl
```

### Repository Setup

```bash
# Clone the repository
git clone https://github.com/yourusername/bakery-ia.git
cd bakery-ia

# Verify structure
ls infrastructure/kubernetes/overlays/prod/
```

---

## VPS Provisioning

### Recommended Configuration

**For 10-tenant pilot program:**

- **RAM:** 20 GB
- **CPU:** 8 vCPU cores
- **Storage:** 200 GB NVMe SSD (triple replica)
- **Network:** 1 Gbps connection
- **OS:** Ubuntu 22.04 LTS
- **Monthly Cost:** €40-80 (check current pricing)

### Why These Specs?
**Memory Breakdown:**

- Application services: 14.1 GB
- Databases (18 instances): 4.6 GB
- Infrastructure (Redis, RabbitMQ): 0.8 GB
- Gateway/Frontend: 1.8 GB
- Monitoring: 1.5 GB
- System overhead: ~3 GB
- **Total:** ~26 GB capacity needed, 20 GB is sufficient with HPA

**Storage Breakdown:**

- Databases: 36 GB (18 × 2GB)
- ML Models: 10 GB
- Redis: 1 GB
- RabbitMQ: 2 GB
- Prometheus metrics: 20 GB
- Container images: ~30 GB
- Growth buffer: 100 GB
- **Total:** 199 GB

### Provisioning Steps

1. **Create VPS at clouding.io:**

   ```
   1. Log in to clouding.io dashboard
   2. Click "Create New Server"
   3. Select:
      - OS: Ubuntu 22.04 LTS
      - RAM: 20 GB
      - CPU: 8 vCPU
      - Storage: 200 GB NVMe SSD
      - Location: Barcelona (best for Spain)
   4. Set hostname: bakery-ia-prod-01
   5. Add SSH key (or use password)
   6. Create server
   ```

2. **Note your server details:**

   ```bash
   # Save these for later:
   VPS_IP="YOUR_VPS_IP_ADDRESS"
   VPS_ROOT_PASSWORD="YOUR_ROOT_PASSWORD"  # If not using SSH key
   ```

3. **Initial SSH connection:**

   ```bash
   # Test connection
   ssh root@$VPS_IP

   # Update system
   apt update && apt upgrade -y
   ```

---

## Infrastructure Setup

### Step 1: Install MicroK8s

**Using MicroK8s for production VPS deployment on clouding.io**

```bash
# SSH into your VPS
ssh root@$VPS_IP

# Update system
apt update && apt upgrade -y

# Install MicroK8s
snap install microk8s --classic --channel=1.28/stable

# Add your user to microk8s group
usermod -a -G microk8s $USER
chown -f -R $USER ~/.kube
newgrp microk8s

# Verify installation
microk8s status --wait-ready
```

### Step 2: Enable Required MicroK8s Addons

**All required components are available as MicroK8s addons:**

```bash
# Enable core addons
microk8s enable dns               # DNS resolution within cluster
microk8s enable hostpath-storage  # Provides microk8s-hostpath storage class
microk8s enable ingress           # Nginx ingress controller
microk8s enable cert-manager      # Let's Encrypt SSL certificates
microk8s enable metrics-server    # For HPA autoscaling
microk8s enable rbac              # Role-based access control

# Setup kubectl alias
echo "alias kubectl='microk8s kubectl'" >> ~/.bashrc
source ~/.bashrc

# Verify all components are running
kubectl get nodes
# Should show: Ready

kubectl get storageclass
# Should show: microk8s-hostpath (default)

kubectl get pods -A
# Should show pods in: kube-system, ingress-nginx, cert-manager namespaces

# Verify metrics-server is working
kubectl top nodes
# Should return CPU/Memory metrics
```

**Optional but Recommended:**

```bash
# Enable Prometheus for additional monitoring (optional)
microk8s enable prometheus

# Enable registry if you want local image storage (optional)
microk8s enable registry
```

### Step 3: Configure Firewall

```bash
# Allow necessary ports
ufw allow 22/tcp     # SSH
ufw allow 80/tcp     # HTTP
ufw allow 443/tcp    # HTTPS
ufw allow 16443/tcp  # Kubernetes API (optional)

# Enable firewall
ufw enable

# Check status
ufw status verbose
```

### Step 4: Create Namespace

```bash
# Create bakery-ia namespace
kubectl create namespace bakery-ia

# Verify
kubectl get namespaces
```

---

## Domain & DNS Configuration

### Step 1: Register Domain

1. Go to Namecheap or Cloudflare Registrar
2. Search for your desired domain
3. Complete purchase (~€10-15/year)
4. Save domain credentials

### Step 2: Configure Cloudflare DNS (Recommended)

1. **Add site to Cloudflare:**

   ```
   1. Log in to Cloudflare
   2. Click "Add a Site"
   3. Enter your domain name
   4. Choose Free plan
   5. Cloudflare will scan existing DNS records
   ```

2. **Update nameservers at registrar:**

   ```
   Point your domain's nameservers to Cloudflare:
   - NS1: assigned.cloudflare.com
   - NS2: assigned.cloudflare.com
   (Cloudflare will provide the exact values)
   ```

3. **Add DNS records:**

   ```
   Type    Name        Content          TTL    Proxy
   A       @           YOUR_VPS_IP      Auto   Yes
   A       www         YOUR_VPS_IP      Auto   Yes
   A       api         YOUR_VPS_IP      Auto   Yes
   A       monitoring  YOUR_VPS_IP      Auto   Yes
   CNAME   *           yourdomain.com   Auto   No
   ```

4. **Configure SSL/TLS mode:**

   ```
   SSL/TLS tab → Overview → Set to "Full (strict)"
   ```

5. **Test DNS propagation:**

   ```bash
   # Wait 5-10 minutes, then test
   nslookup yourdomain.com
   nslookup api.yourdomain.com
   ```

---

## TLS/SSL Certificates

### Understanding Certificate Setup

The platform uses **two layers** of SSL/TLS:

1. **External (Ingress) SSL:** Let's Encrypt for public HTTPS
2. **Internal (Database) SSL:** Self-signed certificates for database connections

### Step 1: Generate Internal Certificates

```bash
# On your local machine
cd infrastructure/tls

# Generate certificates
./generate-certificates.sh

# This creates:
# - ca/ (Certificate Authority)
# - postgres/ (PostgreSQL server certs)
# - redis/ (Redis server certs)
```

**Certificate Details:**

- Root CA: 10-year validity (expires 2035)
- Server certs: 3-year validity (expires October 2028)
- Algorithm: RSA 4096-bit
- Signature: SHA-256

### Step 2: Create Kubernetes Secrets

```bash
# Create PostgreSQL TLS secret
kubectl create secret generic postgres-tls \
  --from-file=server-cert.pem=infrastructure/tls/postgres/server-cert.pem \
  --from-file=server-key.pem=infrastructure/tls/postgres/server-key.pem \
  --from-file=ca-cert.pem=infrastructure/tls/postgres/ca-cert.pem \
  -n bakery-ia

# Create Redis TLS secret
kubectl create secret generic redis-tls \
  --from-file=redis-cert.pem=infrastructure/tls/redis/redis-cert.pem \
  --from-file=redis-key.pem=infrastructure/tls/redis/redis-key.pem \
  --from-file=ca-cert.pem=infrastructure/tls/redis/ca-cert.pem \
  -n bakery-ia

# Verify secrets created
kubectl get secrets -n bakery-ia | grep tls
```

### Step 3: Configure Let's Encrypt (External SSL)

cert-manager is already enabled.
Configure the ClusterIssuer:

```bash
# On VPS, create ClusterIssuer
cat <
```

```bash
cat > /tmp/update-tags.sh <<'EOF'
#!/bin/bash
VERSION="${1:-1.0.0}"
sed -i "s/newTag: latest/newTag: v${VERSION}/g" infrastructure/kubernetes/overlays/prod/kustomization.yaml
EOF

chmod +x /tmp/update-tags.sh
/tmp/update-tags.sh ${VERSION}

# Verify no 'latest' tags remain
grep "newTag:" infrastructure/kubernetes/overlays/prod/kustomization.yaml | grep -c "latest"
# Should return: 0
```

**Step 2.3: Fix SigNoz Namespace References**

```bash
# Update SigNoz patches to use bakery-ia namespace instead of signoz
sed -i 's/namespace: signoz/namespace: bakery-ia/g' infrastructure/kubernetes/overlays/prod/kustomization.yaml

# Verify changes (should show bakery-ia in all 3 patches)
grep -A 3 "name: signoz" infrastructure/kubernetes/overlays/prod/kustomization.yaml
```

**Step 2.4: Update Cert-Manager Email**

```bash
# Update Let's Encrypt notification email to your production email
sed -i "s/admin@bakery-ia.local/admin@bakewise.ai/g" \
  infrastructure/kubernetes/base/components/cert-manager/cluster-issuer-production.yaml
```

**Step 2.5: Generate and Update Production Secrets**

```bash
# Generate JWT secrets
export JWT_SECRET=$(openssl rand -base64 32)
export JWT_REFRESH_SECRET=$(openssl rand -base64 32)
export SERVICE_API_KEY=$(openssl rand -hex 32)

# Display base64-encoded values for secrets.yaml
echo "=== JWT Secrets (copy these to secrets.yaml) ==="
echo "JWT_SECRET_KEY: $(echo -n "$JWT_SECRET" | base64)"
echo "JWT_REFRESH_SECRET_KEY: $(echo -n "$JWT_REFRESH_SECRET" | base64)"
echo "SERVICE_API_KEY: $(echo -n "$SERVICE_API_KEY" | base64)"
echo ""

# Generate strong database passwords for all 18 databases
echo "=== Database Passwords (copy these to secrets.yaml) ==="
for db in auth tenant inventory sales orders procurement forecasting analytics notification monitoring users products recipes stock menu demo_session orchestrator cleanup; do
  password=$(openssl rand -base64 24)
  # Uppercase the key so it matches the names in secrets.yaml (e.g. AUTH_DB_PASSWORD)
  key=$(echo "${db}_DB_PASSWORD" | tr '[:lower:]' '[:upper:]')
  echo "${key}: $(echo -n "$password" | base64)"
done

# Now manually update infrastructure/kubernetes/base/secrets.yaml with the generated values
nano infrastructure/kubernetes/base/secrets.yaml
```

**Production URLs:**

- **Main Application:** https://bakewise.ai
- **API Endpoints:** https://bakewise.ai/api/v1/...
- **SigNoz (Monitoring):** https://monitoring.bakewise.ai/signoz
- **AlertManager:** https://monitoring.bakewise.ai/alertmanager

---

## Configuration & Secrets

### Step 1: Generate Strong Passwords

```bash
# Generate passwords for all services
openssl rand -base64 32  # For each database
openssl rand -hex 32     # For JWT secrets and API keys

# Save all passwords securely!
# Recommended: Use a password manager (1Password, LastPass, Bitwarden)
```

### Step 2: Update Application Secrets

```bash
# Edit the secrets file
nano infrastructure/kubernetes/base/secrets.yaml

# Update ALL of these values:

# Database passwords (18 databases):
AUTH_DB_PASSWORD:
TENANT_DB_PASSWORD:
# ... (all 18 databases)

# Redis password:
REDIS_PASSWORD:

# JWT secrets:
JWT_SECRET_KEY:
JWT_REFRESH_SECRET_KEY:

# SMTP settings (from email setup):
SMTP_HOST:            # smtp.zoho.com or smtp.gmail.com
SMTP_PORT:            # 587
SMTP_USERNAME:        # your email
SMTP_PASSWORD:        # app password
DEFAULT_FROM_EMAIL:   # noreply@yourdomain.com

# WhatsApp credentials (from WhatsApp setup):
WHATSAPP_ACCESS_TOKEN:
WHATSAPP_PHONE_NUMBER_ID:
WHATSAPP_BUSINESS_ACCOUNT_ID:
WHATSAPP_WEBHOOK_VERIFY_TOKEN:

# Database connection strings (update with actual passwords):
AUTH_DATABASE_URL: postgresql+asyncpg://auth_user:PASSWORD@auth-db:5432/auth_db?ssl=require
# ... (all 18 databases)
```

**To base64 encode:**

```bash
echo -n "your-password-here" | base64
```

**CRITICAL:** Never commit real secrets to git! Use `.gitignore` for secrets files.
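The encoding step above is easy to get wrong by hand (a stray trailing newline or a lowercase key name). Below is a minimal sketch of two helpers that print entries in the `KEY: <base64>` form used in `secrets.yaml`. The function names `secret_line` and `print_db_secrets` are hypothetical and not part of the repository, and the sketch assumes `base64`, `tr`, and `openssl` are on the `PATH`.

```bash
#!/bin/bash
# Hypothetical helper (not in the repo): print one "KEY: <base64>" line.
# The key is uppercased and the value is encoded without a trailing newline.
secret_line() {
  local key value
  key="$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')"
  value="$2"
  # tr -d '\n' removes the line wrapping that base64 adds to long values
  printf '%s: %s\n' "$key" "$(printf '%s' "$value" | base64 | tr -d '\n')"
}

# Hypothetical helper: emit a freshly generated password entry
# for every database name passed as an argument.
print_db_secrets() {
  local db
  for db in "$@"; do
    secret_line "${db}_DB_PASSWORD" "$(openssl rand -base64 24)"
  done
}
```

For example, `print_db_secrets auth tenant` prints one `AUTH_DB_PASSWORD:` line and one `TENANT_DB_PASSWORD:` line, each with a different random value. Using `printf '%s'` rather than `echo` keeps the result the same across shells whose `echo` handles `-n` differently.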
### Step 3: Apply Application Secrets

```bash
# Copy manifests to VPS (from local machine)
# Create ~/infrastructure first so the manifests land at ~/infrastructure/kubernetes,
# the path used by the commands below
ssh root@YOUR_VPS_IP 'mkdir -p ~/infrastructure'
scp -r infrastructure/kubernetes root@YOUR_VPS_IP:~/infrastructure/

# SSH to VPS
ssh root@YOUR_VPS_IP

# Apply application secrets
kubectl apply -f ~/infrastructure/kubernetes/base/secrets.yaml -n bakery-ia

# Verify secrets created
kubectl get secrets -n bakery-ia
# Should show multiple secrets including postgres-tls, redis-tls, app-secrets, etc.
```

---

## Database Migrations

### Step 0: Deploy SigNoz Monitoring (BEFORE Application)

**⚠️ CRITICAL:** SigNoz must be deployed BEFORE the application into the **bakery-ia namespace** because the production kustomization patches SigNoz resources.

```bash
# On VPS

# 1. Ensure bakery-ia namespace exists
kubectl get namespace bakery-ia || kubectl create namespace bakery-ia

# 2. Add Helm repo
helm repo add signoz https://charts.signoz.io
helm repo update

# 3. Install SigNoz into bakery-ia namespace (NOT separate signoz namespace)
helm install signoz signoz/signoz \
  -n bakery-ia \
  --set frontend.service.type=ClusterIP \
  --set clickhouse.persistence.size=20Gi \
  --set clickhouse.persistence.storageClass=microk8s-hostpath

# 4. Wait for SigNoz to be ready (this may take 10-15 minutes)
kubectl wait --for=condition=ready pod \
  -l app.kubernetes.io/instance=signoz \
  -n bakery-ia \
  --timeout=900s

# 5. Verify SigNoz components running in bakery-ia namespace
kubectl get pods -n bakery-ia -l app.kubernetes.io/instance=signoz
# Should show: signoz-0, signoz-otel-collector, signoz-clickhouse, signoz-zookeeper, signoz-alertmanager

# 6. Verify StatefulSets exist (kustomization will patch these)
kubectl get statefulset -n bakery-ia | grep signoz
# Should show: signoz, signoz-clickhouse
```

**⚠️ Important:** Do NOT create a separate `signoz` namespace. SigNoz must be in the `bakery-ia` namespace for the overlays to work correctly.
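The readiness checks above can be scripted instead of read by eye. The sketch below filters `kubectl get pods --no-headers` output down to the pods that are not fully ready. It assumes the default column layout (`NAME READY STATUS ...`), and `unready_pods` is a hypothetical helper name, not something shipped with the repository or with kubectl.

```bash
#!/bin/bash
# Hypothetical helper: read `kubectl get pods --no-headers` output on stdin
# and print the names of pods that are neither fully ready nor Completed.
# Assumes the default columns: NAME READY STATUS RESTARTS AGE
unready_pods() {
  awk '{
    # READY looks like "1/1": containers ready / containers total
    split($2, ready, "/")
    if ($3 == "Completed") next
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Example (on the VPS):
#   kubectl get pods -n bakery-ia -l app.kubernetes.io/instance=signoz --no-headers | unready_pods
```

An empty result means every listed pod is `Running` with all containers ready. Any name that is printed is a candidate for `kubectl describe pod` or `kubectl logs`.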
### Step 1: Deploy Application and Databases

```bash
# On VPS
kubectl apply -k ~/infrastructure/kubernetes/overlays/prod

# Wait for databases to be ready (5-10 minutes)
kubectl wait --for=condition=ready pod \
  -l app.kubernetes.io/component=database \
  -n bakery-ia \
  --timeout=600s

# Check status
kubectl get pods -n bakery-ia -l app.kubernetes.io/component=database
```

### Step 2: Run Migrations

Migrations are automatically handled by init containers in each service. Verify they completed:

```bash
# Check migration job status
kubectl get jobs -n bakery-ia | grep migration
# All should show "COMPLETIONS = 1/1"

# Check logs if any failed
kubectl logs -n bakery-ia job/auth-migration
```

### Step 3: Verify Database Schemas

```bash
# Connect to a database to verify
kubectl exec -n bakery-ia deployment/auth-db -it -- psql -U auth_user -d auth_db

# Inside psql:
\dt        # List tables
\d users   # Describe users table
\q         # Quit
```

---

## Verification & Testing

### Step 1: Check All Pods Running

```bash
# View all pods
kubectl get pods -n bakery-ia
# Expected: All pods in "Running" state, none in CrashLoopBackOff

# Check for issues
kubectl get pods -n bakery-ia | grep -vE "Running|Completed"

# View logs for any problematic pods
kubectl logs -n bakery-ia POD_NAME
```

### Step 2: Check Services and Ingress

```bash
# View services
kubectl get svc -n bakery-ia

# View ingress
kubectl get ingress -n bakery-ia

# View certificates (should auto-issue from Let's Encrypt)
kubectl get certificate -n bakery-ia

# Describe certificate to check status
kubectl describe certificate bakery-ia-prod-tls-cert -n bakery-ia
```

### Step 3: Test Database Connections

```bash
# Test PostgreSQL TLS
kubectl exec -n bakery-ia deployment/auth-db -- sh -c \
  'psql -U auth_user -d auth_db -c "SHOW ssl;"'
# Expected output: on

# Test Redis TLS
kubectl exec -n bakery-ia deployment/redis -- redis-cli \
  --tls \
  --cert /tls/redis-cert.pem \
  --key /tls/redis-key.pem \
  --cacert /tls/ca-cert.pem \
  -a "$REDIS_PASSWORD" \
  ping
# Expected output: PONG
```

### Step 4: Test Frontend Access

```bash
# Test frontend (replace with your domain)
curl -I https://bakery.yourdomain.com
# Expected: HTTP/2 200

# Test API health
curl https://api.yourdomain.com/health
# Expected: {"status": "healthy"}
```

### Step 5: Test Authentication

```bash
# Create a test user (using your frontend or API)
curl -X POST https://api.yourdomain.com/api/v1/auth/register \
  -H "Content-Type: application/json" \
  -d '{
    "email": "test@yourdomain.com",
    "password": "TestPassword123!",
    "name": "Test User"
  }'

# Login
curl -X POST https://api.yourdomain.com/api/v1/auth/login \
  -H "Content-Type: application/json" \
  -d '{
    "email": "test@yourdomain.com",
    "password": "TestPassword123!"
  }'
# Expected: JWT token in response
```

### Step 6: Test Email Delivery

```bash
# Trigger a password reset to test email
curl -X POST https://api.yourdomain.com/api/v1/auth/forgot-password \
  -H "Content-Type: application/json" \
  -d '{"email": "test@yourdomain.com"}'

# Check your email inbox for the reset link

# Check service logs if email not received:
kubectl logs -n bakery-ia deployment/auth-service | grep -i "email\|smtp"
```

### Step 7: Test WhatsApp (Optional)

```bash
# Send a test WhatsApp message
# This requires creating a tenant and configuring WhatsApp in the UI
# Or test via API once authenticated
```

---

## Post-Deployment

### Step 1: Access SigNoz Monitoring Stack

Your production deployment includes **SigNoz**, a unified observability platform that provides complete visibility into your application:

#### What is SigNoz?
SigNoz is an **open-source, all-in-one observability platform** that provides:

- **📊 Distributed Tracing** - See end-to-end request flows across all 18 microservices
- **📈 Metrics Monitoring** - Application performance and infrastructure metrics
- **📝 Log Management** - Centralized logs from all services with trace correlation
- **🔍 Service Performance Monitoring (SPM)** - Automatic RED metrics (Rate, Error, Duration)
- **🗄️ Database Monitoring** - All 18 PostgreSQL databases + Redis + RabbitMQ
- **☸️ Kubernetes Monitoring** - Cluster, node, pod, and container metrics

**Why SigNoz instead of Prometheus/Grafana?**

- Single unified UI for traces, metrics, and logs (no context switching)
- Automatic service dependency mapping
- Built-in APM (Application Performance Monitoring)
- Log-trace correlation with one click
- Better query performance with ClickHouse backend
- Modern UI designed for microservices

#### Production Monitoring URLs

Access via domain:

```
https://monitoring.bakewise.ai/signoz        # SigNoz - Main observability UI
https://monitoring.bakewise.ai/alertmanager  # AlertManager - Alert management
```

Or via port forwarding (if needed):

```bash
# SigNoz Frontend (Main UI)
kubectl port-forward -n bakery-ia svc/signoz 8080:8080 &
# Open: http://localhost:8080

# SigNoz AlertManager
kubectl port-forward -n bakery-ia svc/signoz-alertmanager 9093:9093 &
# Open: http://localhost:9093

# OTel Collector (for debugging)
kubectl port-forward -n bakery-ia svc/signoz-otel-collector 4317:4317 &  # gRPC
kubectl port-forward -n bakery-ia svc/signoz-otel-collector 4318:4318 &  # HTTP
```

#### Key SigNoz Features to Explore

Once you open SigNoz (https://monitoring.bakewise.ai/signoz), explore these tabs:

**1. Services Tab - Application Performance**

- View all 18 microservices with live metrics
- See request rate, error rate, and latency (P50/P90/P99)
- Click on any service to drill down into operations
- Identify slow endpoints and error-prone operations

**2. Traces Tab - Request Flow Visualization**

- See complete request journeys across services
- Identify bottlenecks (slow database queries, API calls)
- Debug errors with full stack traces
- Correlate with logs for complete context

**3. Dashboards Tab - Infrastructure & Database Metrics**

- **PostgreSQL** - Monitor all 18 databases (connections, queries, cache hit ratio)
- **Redis** - Cache performance (memory, hit rate, commands/sec)
- **RabbitMQ** - Message queue health (depth, rates, consumers)
- **Kubernetes** - Cluster metrics (nodes, pods, containers)

**4. Logs Tab - Centralized Log Management**

- Search and filter logs from all services
- Click on trace ID in logs to see related request trace
- Auto-enriched with Kubernetes metadata (pod, namespace, container)
- Identify patterns and anomalies

**5. Alerts Tab - Proactive Monitoring**

- Configure alerts on metrics, traces, or logs
- Email/Slack/Webhook notifications
- View firing alerts and alert history

#### Quick Health Check

```bash
# Verify SigNoz components are running
kubectl get pods -n bakery-ia -l app.kubernetes.io/instance=signoz

# Expected output:
# signoz-0                    READY 1/1
# signoz-otel-collector-xxx   READY 1/1
# signoz-alertmanager-xxx     READY 1/1
# signoz-clickhouse-xxx       READY 1/1
# signoz-zookeeper-xxx        READY 1/1

# Check OTel Collector health
kubectl exec -n bakery-ia deployment/signoz-otel-collector -- wget -qO- http://localhost:13133

# View recent telemetry in OTel Collector logs
kubectl logs -n bakery-ia deployment/signoz-otel-collector --tail=50 | grep -i "traces\|metrics\|logs"
```

#### Verify Telemetry is Working

1. **Check Services are Reporting:**

   ```bash
   # Open SigNoz and navigate to Services tab
   # You should see all 18 microservices listed

   # If services are missing, check if they're sending telemetry:
   kubectl logs -n bakery-ia deployment/auth-service | grep -i "telemetry\|otel"
   ```

2. **Check Database Metrics:**

   ```bash
   # Navigate to Dashboards → PostgreSQL in SigNoz
   # You should see metrics from all 18 databases

   # Verify OTel Collector is scraping databases:
   kubectl logs -n bakery-ia deployment/signoz-otel-collector | grep postgresql
   ```

3. **Check Traces are Being Collected:**

   ```bash
   # Make a test API request
   curl https://bakewise.ai/api/v1/health

   # Navigate to Traces tab in SigNoz
   # Search for "gateway" service
   # You should see the trace for your request
   ```

4. **Check Logs are Being Collected:**

   ```bash
   # Navigate to Logs tab in SigNoz
   # Filter by namespace: bakery-ia
   # You should see logs from all pods

   # Verify filelog receiver is working:
   kubectl logs -n bakery-ia deployment/signoz-otel-collector | grep filelog
   ```

### Step 2: Configure Alerting

SigNoz includes integrated alerting with AlertManager. Configure it for your team:

#### Update Email Notification Settings

The alerting configuration is in the SigNoz Helm values. To update:

```bash
# For production, edit the values file:
nano infrastructure/helm/signoz-values-prod.yaml

# Update the alertmanager.config section:
# 1. Update SMTP settings:
#    - smtp_from: 'your-alerts@bakewise.ai'
#    - smtp_auth_username: 'your-alerts@bakewise.ai'
#    - smtp_auth_password: (use Kubernetes secret)
#
# 2. Update receivers:
#    - critical-alerts email: critical-alerts@bakewise.ai
#    - warning-alerts email: oncall@bakewise.ai
#
# 3. (Optional) Add Slack webhook for critical alerts

# Apply the updated configuration:
helm upgrade signoz signoz/signoz \
  -n bakery-ia \
  -f infrastructure/helm/signoz-values-prod.yaml
```

#### Create Alerts in SigNoz UI

1. **Open SigNoz Alerts Tab:**

   ```
   https://monitoring.bakewise.ai/signoz → Alerts
   ```

2. **Create Common Alerts:**

   **Alert 1: High Error Rate**

   - Name: `HighErrorRate`
   - Query: `error_rate > 5` for `5 minutes`
   - Severity: `critical`
   - Description: "Service {{service_name}} has error rate >5%"

   **Alert 2: High Latency**

   - Name: `HighLatency`
   - Query: `P99_latency > 3000ms` for `5 minutes`
   - Severity: `warning`
   - Description: "Service {{service_name}} P99 latency >3s"

   **Alert 3: Service Down**

   - Name: `ServiceDown`
   - Query: `request_rate == 0` for `2 minutes`
   - Severity: `critical`
   - Description: "Service {{service_name}} not receiving requests"

   **Alert 4: Database Connection Issues**

   - Name: `DatabaseConnectionsHigh`
   - Query: `pg_active_connections > 80` for `5 minutes`
   - Severity: `warning`
   - Description: "Database {{database}} has >80 active connections"

   **Alert 5: High Memory Usage**

   - Name: `HighMemoryUsage`
   - Query: `container_memory_percent > 85` for `5 minutes`
   - Severity: `warning`
   - Description: "Pod {{pod_name}} using >85% memory"

#### Test Alert Delivery

```bash
# Method 1: Create a test alert in SigNoz UI
# Go to Alerts → New Alert → Set a test condition that will fire

# Method 2: Fire a test alert via stress test
kubectl run memory-test --image=polinux/stress --restart=Never \
  --namespace=bakery-ia -- stress --vm 1 --vm-bytes 600M --timeout 300s

# Check alert appears in SigNoz Alerts tab
# https://monitoring.bakewise.ai/signoz → Alerts

# Also check AlertManager
# https://monitoring.bakewise.ai/alertmanager

# Verify email notification received

# Clean up test
kubectl delete pod memory-test -n bakery-ia
```

#### Configure Notification Channels

In SigNoz Alerts tab, configure channels:

1. **Email Channel:**
   - Already configured via AlertManager
   - Emails sent to addresses in signoz-values-prod.yaml

2. **Slack Channel (Optional):**

   ```bash
   # Add Slack webhook URL to signoz-values-prod.yaml
   # Under alertmanager.config.receivers.critical-alerts.slack_configs:
   #   - api_url: 'https://hooks.slack.com/services/YOUR/WEBHOOK/URL'
   #     channel: '#alerts-critical'
   ```

3. **Webhook Channel (Optional):**
   - Configure custom webhook for integration with PagerDuty, OpsGenie, etc.
   - Add to alertmanager.config.receivers

### Step 3: Configure Backups

```bash
# Create backup script on VPS
cat > ~/backup-databases.sh <<'EOF'
#!/bin/bash
# Shell aliases are not loaded in scripts or cron, so call MicroK8s kubectl by full path
KUBECTL="/snap/bin/microk8s kubectl"
BACKUP_DIR="/backups/$(date +%Y-%m-%d)"
mkdir -p "$BACKUP_DIR"

# Get all database pods
DBS=$($KUBECTL get pods -n bakery-ia -l app.kubernetes.io/component=database -o name)

for db in $DBS; do
  DB_NAME=$(echo "$db" | cut -d'/' -f2)
  echo "Backing up $DB_NAME..."
  # Adjust the user and database if a pod does not use the postgres defaults (e.g. auth_user / auth_db)
  $KUBECTL exec -n bakery-ia "$db" -- pg_dump -U postgres > "$BACKUP_DIR/${DB_NAME}.sql"
done

# Compress backups
tar -czf "$BACKUP_DIR.tar.gz" "$BACKUP_DIR"
rm -rf "$BACKUP_DIR"

# Keep only last 7 days
find /backups -name "*.tar.gz" -mtime +7 -delete

echo "Backup completed: $BACKUP_DIR.tar.gz"
EOF

chmod +x ~/backup-databases.sh

# Test backup
~/backup-databases.sh

# Setup daily cron job (2 AM)
(crontab -l 2>/dev/null; echo "0 2 * * * $HOME/backup-databases.sh") | crontab -
```

### Step 4: Setup Alerting

```bash
# Update AlertManager configuration with your email
kubectl edit configmap -n monitoring alertmanager-config

# Update recipient emails in the routes section
```

### Step 5: Verify SigNoz Monitoring is Working

Before proceeding, ensure all monitoring components are operational:

```bash
# 1. Verify SigNoz pods are running
kubectl get pods -n bakery-ia -l app.kubernetes.io/instance=signoz

# Expected pods (all should be Running/Ready):
# - signoz-0 (or signoz-1, signoz-2 for HA)
# - signoz-otel-collector-xxx
# - signoz-alertmanager-xxx
# - signoz-clickhouse-xxx
# - signoz-zookeeper-xxx

# 2. Check SigNoz UI is accessible
curl -I https://monitoring.bakewise.ai/signoz
# Should return: HTTP/2 200

# 3. Verify OTel Collector is receiving data
kubectl logs -n bakery-ia deployment/signoz-otel-collector --tail=100 | grep -i "received"
# Should show: "Traces received: X" "Metrics received: Y" "Logs received: Z"

# 4. Check ClickHouse database is healthy (ClickHouse runs as a StatefulSet)
kubectl exec -n bakery-ia statefulset/signoz-clickhouse -- clickhouse-client --query="SELECT count() FROM system.tables WHERE database LIKE 'signoz_%'"
# Should return a number > 0 (tables exist)
```

**Complete Verification Checklist:**

- [ ] **SigNoz UI loads** at https://monitoring.bakewise.ai/signoz
- [ ] **Services tab shows all 18 microservices** with metrics
- [ ] **Traces tab has sample traces** from gateway and other services
- [ ] **Dashboards tab shows PostgreSQL metrics** from all 18 databases
- [ ] **Dashboards tab shows Redis metrics** (memory, commands, etc.)
- [ ] **Dashboards tab shows RabbitMQ metrics** (queues, messages)
- [ ] **Dashboards tab shows Kubernetes metrics** (nodes, pods)
- [ ] **Logs tab displays logs** from all services in bakery-ia namespace
- [ ] **Alerts tab is accessible** and can create new alerts
- [ ] **AlertManager** is reachable at https://monitoring.bakewise.ai/alertmanager

**If any checks fail, troubleshoot:**

```bash
# Check OTel Collector configuration
kubectl describe configmap -n bakery-ia signoz-otel-collector

# Check for errors in OTel Collector
kubectl logs -n bakery-ia deployment/signoz-otel-collector | grep -i error

# Check ClickHouse is accepting writes
kubectl logs -n bakery-ia statefulset/signoz-clickhouse | grep -i error

# Restart OTel Collector if needed
kubectl rollout restart deployment/signoz-otel-collector -n bakery-ia
```

### Step 6: Document Everything

Create a secure runbook with all credentials and procedures:

**Essential Information to Document:**

- [ ] VPS login credentials (stored securely in password manager)
- [ ] Database passwords (in password manager)
- [ ] Grafana admin password
- [ ] Domain registrar access (for bakewise.ai)
- [ ] Cloudflare access
- [ ] Email service credentials (SMTP) - [ ] WhatsApp API credentials - [ ] Docker Hub / Registry credentials - [ ] Emergency contact information - [ ] Rollback procedures - [ ] Monitoring URLs and access procedures ### Step 6: Train Your Team Conduct a training session covering SigNoz and operational procedures: #### Part 1: SigNoz Navigation (30 minutes) - [ ] **Login and Overview** - Show how to access https://monitoring.bakewise.ai/signoz - Navigate through main tabs: Services, Traces, Dashboards, Logs, Alerts - Explain the unified nature of SigNoz (all-in-one platform) - [ ] **Services Tab - Application Performance Monitoring** - Show all 18 microservices - Explain RED metrics (Request rate, Error rate, Duration/latency) - Demo: Click on a service → Operations → See endpoint breakdown - Demo: Identify slow endpoints and high error rates - [ ] **Traces Tab - Request Flow Debugging** - Show how to search for traces by service, operation, or time - Demo: Click on a trace → See full waterfall (service → database → cache) - Demo: Find slow database queries in trace spans - Demo: Click "View Logs" to correlate trace with logs - [ ] **Dashboards Tab - Infrastructure Monitoring** - Navigate to PostgreSQL dashboard → Show all 18 databases - Navigate to Redis dashboard → Show cache metrics - Navigate to Kubernetes dashboard → Show node/pod metrics - Explain what metrics indicate issues (connection %, memory %, etc.) 
- [ ] **Logs Tab - Log Search and Analysis**
  - Show how to filter by service, severity, time range
  - Demo: Search for "error" in last hour
  - Demo: Click on trace_id in log → Jump to related trace
  - Show Kubernetes metadata (pod, namespace, container)

- [ ] **Alerts Tab - Proactive Monitoring**
  - Show how to create alerts on metrics
  - Review pre-configured alerts
  - Show alert history and firing alerts
  - Explain how to acknowledge/silence alerts

#### Part 2: Operational Tasks (30 minutes)

- [ ] **Check application logs** (multiple ways)
  ```bash
  # Method 1: Via kubectl (for immediate debugging)
  kubectl logs -n bakery-ia deployment/orders-service --tail=100 -f

  # Method 2: Via SigNoz Logs tab (for analysis and correlation)
  # 1. Open https://monitoring.bakewise.ai/signoz → Logs
  # 2. Filter by k8s_deployment_name: orders-service
  # 3. Click on trace_id to see related request flow
  ```

- [ ] **Restart services when needed**
  ```bash
  # Restart a service (rolling update, no downtime)
  kubectl rollout restart deployment/orders-service -n bakery-ia

  # Verify restart in SigNoz:
  # 1. Check Services tab → orders-service → Should show brief dip then recovery
  # 2. Check Logs tab → Filter by orders-service → See restart logs
  ```

- [ ] **Investigate performance issues**
  ```bash
  # Scenario: "Orders API is slow"
  # 1. SigNoz → Services → orders-service → Check P99 latency
  # 2. SigNoz → Traces → Filter service:orders-service, duration:>1s
  # 3. Click on slow trace → Identify bottleneck (DB query? External API?)
  # 4. SigNoz → Dashboards → PostgreSQL → Check orders_db connections/queries
  # 5. Fix identified issue (add index, optimize query, scale service)
  ```

- [ ] **Respond to alerts**
  - Show how to access alerts in SigNoz → Alerts tab
  - Show AlertManager UI at https://monitoring.bakewise.ai/alertmanager
  - Review common alerts and their resolution steps
  - Reference the [Production Operations Guide](./PRODUCTION_OPERATIONS_GUIDE.md)

#### Part 3: Documentation and Resources (10 minutes)

- [ ] **Share documentation**
  - [PILOT_LAUNCH_GUIDE.md](./PILOT_LAUNCH_GUIDE.md) - This guide (deployment)
  - [PRODUCTION_OPERATIONS_GUIDE.md](./PRODUCTION_OPERATIONS_GUIDE.md) - Daily operations with SigNoz
  - [security-checklist.md](./security-checklist.md) - Security procedures

- [ ] **Bookmark key URLs**
  - SigNoz: https://monitoring.bakewise.ai/signoz
  - AlertManager: https://monitoring.bakewise.ai/alertmanager
  - Production app: https://bakewise.ai

- [ ] **Setup on-call rotation** (if applicable)
  - Configure rotation schedule in AlertManager
  - Document escalation procedures
  - Test alert delivery to on-call phone/email

#### Part 4: Hands-On Exercise (15 minutes)

**Exercise: Investigate a Simulated Issue**

1. Create a load test to generate traffic
2. Use SigNoz to find the slowest endpoint
3. Identify the root cause using traces
4. Correlate with logs to confirm
5. Check infrastructure metrics (DB, memory, CPU)
6. Propose a fix based on findings

This trains the team to use SigNoz effectively for real incidents.

---

## Troubleshooting

### Issue: Pods Not Starting

```bash
# Check pod status
kubectl describe pod POD_NAME -n bakery-ia

# Common causes:
# 1. Image pull errors
kubectl get events -n bakery-ia | grep -i "pull"

# 2. Resource limits
kubectl describe node

# 3. Volume mount issues
kubectl get pvc -n bakery-ia
```

### Issue: Certificate Not Issuing

```bash
# Check certificate status
kubectl describe certificate bakery-ia-prod-tls-cert -n bakery-ia

# Check cert-manager logs
kubectl logs -n cert-manager deployment/cert-manager

# Check challenges
kubectl get challenges -n bakery-ia

# Verify DNS is correct (replace with your own domain)
nslookup bakery.yourdomain.com
```

### Issue: Database Connection Errors

```bash
# Check database pod
kubectl get pods -n bakery-ia -l app.kubernetes.io/component=database

# Check database logs
kubectl logs -n bakery-ia deployment/auth-db

# Test connection from service pod
kubectl exec -n bakery-ia deployment/auth-service -- nc -zv auth-db 5432
```

### Issue: Services Can't Connect to Databases

```bash
# Check if SSL is enabled
kubectl exec -n bakery-ia deployment/auth-db -- sh -c \
  'psql -U auth_user -d auth_db -c "SHOW ssl;"'

# Check service logs for SSL errors
kubectl logs -n bakery-ia deployment/auth-service | grep -i "ssl\|tls"

# Restart service to pick up new SSL config
kubectl rollout restart deployment/auth-service -n bakery-ia
```

### Issue: Out of Resources

```bash
# Check node resources
kubectl top nodes

# Check pod resource usage
kubectl top pods -n bakery-ia

# Identify resource hogs
kubectl top pods -n bakery-ia --sort-by=memory

# Scale down non-critical services temporarily
kubectl scale deployment monitoring -n bakery-ia --replicas=0
```

---

## Next Steps After Successful Launch

1. **Monitor for 48 Hours**
   - Check dashboards daily
   - Review error logs
   - Monitor resource usage
   - Test all functionality

2. **Optimize Based on Metrics**
   - Adjust resource limits if needed
   - Fine-tune autoscaling thresholds
   - Optimize database queries if slow

3. **Onboard First Tenant**
   - Create test tenant
   - Upload sample data
   - Test all features
   - Gather feedback

4. **Scale Gradually**
   - Add 1-2 tenants at a time
   - Monitor resource usage
   - Upgrade VPS if needed (see scaling guide)

5. **Plan for Growth**
   - Review [PRODUCTION_OPERATIONS_GUIDE.md](./PRODUCTION_OPERATIONS_GUIDE.md)
   - Implement additional monitoring
   - Plan capacity upgrades
   - Consider managed services for scale

---

## Cost Scaling Path

| Tenants | RAM | CPU | Storage | Monthly Cost |
|---------|-----|-----|---------|--------------|
| 10 | 20 GB | 8 cores | 200 GB | €40-80 |
| 25 | 32 GB | 12 cores | 300 GB | €80-120 |
| 50 | 48 GB | 16 cores | 500 GB | €150-200 |

For 100+ tenants, consider a multi-node cluster or managed K8s (€300+/month).

---

## Support Resources

**Documentation:**
- **Operations Guide:** [PRODUCTION_OPERATIONS_GUIDE.md](./PRODUCTION_OPERATIONS_GUIDE.md) - Daily operations, monitoring, incident response
- **Security Guide:** [security-checklist.md](./security-checklist.md) - Security procedures and compliance
- **Database Security:** [database-security.md](./database-security.md) - Database operations and TLS configuration
- **TLS Configuration:** [tls-configuration.md](./tls-configuration.md) - Certificate management
- **RBAC Implementation:** [rbac-implementation.md](./rbac-implementation.md) - Access control

**Monitoring Access:**
- **SigNoz (Primary):** https://monitoring.bakewise.ai/signoz - All-in-one observability
  - Services: Application performance monitoring (APM)
  - Traces: Distributed tracing across all services
  - Dashboards: PostgreSQL, Redis, RabbitMQ, Kubernetes metrics
  - Logs: Centralized log management with trace correlation
  - Alerts: Alert configuration and management
- **AlertManager:** https://monitoring.bakewise.ai/alertmanager - Alert routing and notifications

**External Resources:**
- **MicroK8s Docs:** https://microk8s.io/docs
- **Kubernetes Docs:** https://kubernetes.io/docs
- **Let's Encrypt:** https://letsencrypt.org/docs
- **Cloudflare DNS:** https://developers.cloudflare.com/dns
- **SigNoz Documentation:** https://signoz.io/docs/
- **OpenTelemetry Documentation:** https://opentelemetry.io/docs/

**Monitoring Architecture:**
- **OpenTelemetry:** Industry-standard instrumentation framework
  - Auto-instruments FastAPI, HTTPX, SQLAlchemy, Redis
  - Collects traces, metrics, and logs from all services
  - Exports to SigNoz via OTLP protocol (gRPC port 4317, HTTP port 4318)
- **SigNoz Components:**
  - **Frontend:** Web UI for visualization and analysis
  - **OTel Collector:** Receives and processes telemetry data
  - **ClickHouse:** Columnar database that stores telemetry for fast queries
  - **AlertManager:** Alert routing and notification delivery
  - **Zookeeper:** Coordination service for ClickHouse cluster

---

## Summary Checklist

### Pre-Deployment Configuration (LOCAL MACHINE)
- [ ] **imagePullSecrets removed** - Deleted from all 67 manifests
- [ ] **Image tags updated** - Changed all 'latest' to v1.0.0 (semantic version)
- [ ] **SigNoz namespace fixed** - ✅ Already done (bakery-ia namespace)
- [ ] **Production secrets generated** - JWT, database passwords, API keys
- [ ] **secrets.yaml updated** - Replaced all placeholder values
- [ ] **Cert-manager email updated** - admin@bakewise.ai
- [ ] **Manifests validated** - No 'latest' tags, no imagePullSecrets remaining

### Infrastructure Setup
- [ ] VPS provisioned and accessible
- [ ] k3s (or Kubernetes) installed and configured
- [ ] nginx-ingress-controller installed
- [ ] metrics-server installed and working
- [ ] cert-manager installed
- [ ] local-path-provisioner installed
- [ ] Domain registered and DNS configured
- [ ] Cloudflare protection enabled (optional but recommended)

### Secrets and Configuration
- [ ] TLS certificates generated (postgres, redis)
- [ ] Email service configured and tested
- [ ] WhatsApp API setup (optional for launch)
- [ ] Container images built and pushed with version tags
- [ ] Production configs verified (domains, CORS, storage class)
- [ ] Strong passwords generated for all services
- [ ] Docker registry secret created (dockerhub-creds)
- [ ] Application secrets applied

### Monitoring
- [ ] SigNoz deployed via Helm
- [ ] SigNoz pods running and healthy
- [ ] signoz namespace created

### Application Deployment
- [ ] All pods running successfully
- [ ] Databases accepting TLS connections
- [ ] Let's Encrypt certificates issued
- [ ] Frontend accessible via HTTPS
- [ ] API health check passing
- [ ] Test user can login
- [ ] Email delivery working
- [ ] SigNoz monitoring accessible
- [ ] Metrics flowing to SigNoz

### Post-Deployment
- [ ] Backups configured and tested
- [ ] Team trained on operations
- [ ] Documentation complete
- [ ] Emergency procedures documented
- [ ] Monitoring alerts configured

---

**🎉 Congratulations! Your Bakery-IA platform is now live in production!**

*Estimated total time: 2-4 hours for first deployment*
*Subsequent updates: 15-30 minutes*

---

**Document Version:** 2.0
**Last Updated:** 2026-01-11
**Maintained By:** DevOps Team

**Changes in v2.0:**
- Added critical pre-deployment fixes section
- Updated infrastructure setup for k3s instead of MicroK8s
- Added required component installation (nginx-ingress, metrics-server, etc.)
- Updated configuration steps with domain replacement
- Added Docker registry secret creation
- Added SigNoz Helm deployment before application
- Updated storage class configuration
- Added image tag version requirements
- Expanded verification checklist
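
## Appendix: Scripted Manifest Check

The "Manifests validated" item in the Summary Checklist (no 'latest' tags, no imagePullSecrets remaining) can be checked mechanically. The following is a minimal sketch, not part of the project's tooling: the function name `check_manifests` is hypothetical, and it assumes the manifests are plain `.yaml`/`.yml` files under a single directory. It only catches explicit `:latest` tags, so an image with no tag at all (which also resolves to `latest`) would slip through.

```shell
#!/usr/bin/env bash
# Sketch: scan a manifest directory for settings this guide says must be gone.
# Assumption: manifests are plain .yaml/.yml files under the directory in $1.

check_manifests() {
  local dir="$1" status=0

  # Flag image references explicitly pinned to ':latest'
  if grep -rnE --include='*.yaml' --include='*.yml' \
      'image:[[:space:]]*[^[:space:]]+:latest([^[:alnum:]._-]|$)' "$dir"; then
    echo "FAIL: ':latest' image tags found" >&2
    status=1
  fi

  # Flag any remaining imagePullSecrets entries
  if grep -rn --include='*.yaml' --include='*.yml' 'imagePullSecrets' "$dir"; then
    echo "FAIL: imagePullSecrets still present" >&2
    status=1
  fi

  return "$status"
}
```

Calling `check_manifests path/to/manifests && echo "Manifests look clean"` (with the path replaced by wherever your manifests live) prints each offending file and line, and returns non-zero if anything was found, so it can also gate a deploy script.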