# Bakery-IA Production Deployment Guide

**Complete guide for deploying Bakery-IA to production on a MicroK8s cluster**

| **Version** | 4.0 |
|-------------|-----|
| **Last Updated** | 2026-01-21 |
| **Target Environment** | VPS with MicroK8s (Ubuntu 22.04 LTS) |
| **Estimated Deployment Time** | 3-5 hours (first-time deployment) |
| **Monthly Cost** | ~€41-81 (10-tenant pilot) |

---

## Table of Contents

1. [Quick Start Overview](#quick-start-overview)
2. [Prerequisites](#prerequisites)
3. [Phase 0: Transfer Infrastructure Code to Server](#phase-0-transfer-infrastructure-code-to-server)
4. [Phase 1: VPS Setup & MicroK8s Installation](#phase-1-vps-setup--microk8s-installation)
5. [Phase 2: Domain & DNS Configuration](#phase-2-domain--dns-configuration)
6. [Phase 3: Deploy Foundation Layer](#phase-3-deploy-foundation-layer)
7. [Phase 4: Deploy CI/CD Infrastructure](#phase-4-deploy-cicd-infrastructure)
8. [Phase 5: Pre-Pull and Push Base Images to Gitea Registry](#phase-5-pre-pull-and-push-base-images-to-gitea-registry)
   - [Step 5.1: Pre-Pull Base Images](#step-51-pre-pull-base-images-and-push-to-registry)
   - [Step 5.2: Verify Images in Registry](#step-52-verify-images-in-gitea-registry)
   - [Step 5.3: Troubleshooting](#step-53-troubleshooting-image-issues)
   - [Step 5.4: Verify CI/CD Access](#step-54-verify-cicd-pipeline-can-access-images)
   - [Step 5.5: Build Service Images](#step-55-build-and-push-service-images-first-time-deployment-only)
   - [Step 5.6: Verify Service Images](#step-56-verify-all-service-images-are-available)
9. [Phase 6: Deploy Application Services](#phase-6-deploy-application-services)
10. [Phase 7: Deploy Optional Services](#phase-7-deploy-optional-services)
11. [Phase 8: Verification & Validation](#phase-8-verification--validation)
12. [Post-Deployment Operations](#post-deployment-operations)
13. [Troubleshooting Guide](#troubleshooting-guide)
14. [Reference & Resources](#reference--resources)

---

## Quick Start Overview

### What You're Deploying

A complete multi-tenant SaaS platform consisting of:

| Component | Details |
|-----------|---------|
| **Microservices** | 18 Python/FastAPI services |
| **Databases** | 18 PostgreSQL instances with TLS |
| **Cache** | Redis with TLS |
| **Message Broker** | RabbitMQ |
| **Object Storage** | MinIO (S3-compatible) |
| **Email** | Mailu (self-hosted) with Mailgun relay |
| **Monitoring** | SigNoz (unified observability) |
| **CI/CD** | Gitea + Tekton + Flux CD |
| **Security** | TLS everywhere, RBAC, Network Policies |

### Infrastructure Architecture

```
┌─────────────────────────────────────────────────────────────────────────────┐
│  LAYER 6: APPLICATION                                                       │
│  Frontend │ Gateway │ 18 Microservices │ CronJobs & Workers                 │
├─────────────────────────────────────────────────────────────────────────────┤
│  LAYER 5: MONITORING                                                        │
│  SigNoz (Unified Observability) │ AlertManager │ OTel Collector             │
├─────────────────────────────────────────────────────────────────────────────┤
│  LAYER 4: PLATFORM SERVICES (Optional)                                      │
│  Mailu (Email) │ Nominatim (Geocoding) │ CI/CD (Tekton, Flux, Gitea)        │
├─────────────────────────────────────────────────────────────────────────────┤
│  LAYER 3: DATA & STORAGE                                                    │
│  PostgreSQL (18 DBs) │ Redis │ RabbitMQ │ MinIO                             │
├─────────────────────────────────────────────────────────────────────────────┤
│  LAYER 2: NETWORK & SECURITY                                                │
│  Unbound DNS │ CoreDNS │ Ingress Controller │ Cert-Manager │ TLS            │
├─────────────────────────────────────────────────────────────────────────────┤
│  LAYER 1: FOUNDATION                                                        │
│  Namespaces │ Storage Classes │ RBAC │ ConfigMaps │ Secrets                 │
├─────────────────────────────────────────────────────────────────────────────┤
│  LAYER 0: KUBERNETES CLUSTER                                                │
│  MicroK8s (Production) │ Kind (Local Dev) │ EKS (AWS Alternative)           │
└─────────────────────────────────────────────────────────────────────────────┘
```

### Deployment Order (Critical)

Components **must** be deployed in this order due to dependencies:

```
Phase 0: Transfer code to server (bootstrap)
    ↓
Phase 1: MicroK8s + Addons
    ↓
Phase 2: DNS + Domain configuration
    ↓
Phase 3: Foundation (Namespaces, Cert-Manager, TLS)
    ↓
Phase 4: CI/CD (Gitea → Tekton → Flux)
    ↓
Phase 5: Base & Service Images (Pre-pull base images, build service images)
    ↓
Phase 6: Application Services (21 microservices + Gateway + Data Layer)
    ↓
Phase 7: Optional (Mailu, SigNoz, Nominatim)
    ↓
Phase 8: Verification & Validation
```

> **Note on First Deployment:** For the first deployment, you must manually build and push service images (Phase 5, Step 5.5) before applying the production kustomization. After the first deployment, the CI/CD pipeline will automatically build and push images on subsequent commits.

### Cost Breakdown

| Service | Provider | Monthly Cost |
|---------|----------|-------------|
| VPS (20GB RAM, 8 vCPU, 200GB SSD) | clouding.io | €40-80 |
| Domain | Namecheap/Cloudflare | ~€1.25 (€15/year) |
| Email Relay | Mailgun (free tier) | €0 |
| SSL Certificates | Let's Encrypt | €0 |
| DNS | Cloudflare | €0 |
| **Total** | | **€41-81/month** |

---

## Prerequisites

### System Requirements

| Requirement | Specification |
|-------------|---------------|
| **OS** | Ubuntu 22.04 LTS |
| **RAM** | Minimum 16GB (20GB recommended) |
| **CPU** | 8 vCPU cores |
| **Storage** | 200GB NVMe SSD |
| **Network** | Static public IP, 1 Gbps |

### Required Accounts

- [ ] **VPS Provider** (clouding.io, Hetzner, DigitalOcean, etc.)
- [ ] **Domain Registrar** (Namecheap, Cloudflare, etc.)
- [ ] **Cloudflare Account** (recommended for DNS)
- [ ] **Mailgun Account** (for email relay, optional)
- [ ] **Stripe Account** (for payments)

### Local Machine Requirements

```bash
# Verify these tools are installed:
kubectl version --client   # Kubernetes CLI
docker --version           # Container runtime
git --version              # Version control
ssh -V                     # SSH client
helm version               # Helm package manager
openssl version            # TLS utilities

# Install if missing (macOS):
brew install kubectl docker git helm openssl

# Install if missing (Ubuntu):
sudo apt install -y docker.io git openssl
sudo snap install kubectl --classic
sudo snap install helm --classic
```

### SSH Configuration (Recommended)

Set up SSH config for easier access:

```bash
# Create/edit ~/.ssh/config
cat >> ~/.ssh/config << 'EOF'
Host bakery-vps
    HostName 200.234.233.87
    User root
    IdentityFile ~/.ssh/bakewise.pem
    IdentitiesOnly yes
EOF

# Set proper permissions on key
chmod 600 ~/.ssh/bakewise.pem

# Test connection
ssh bakery-vps
```

---

## Phase 0: Transfer Infrastructure Code to Server

**Problem:** You need the infrastructure code on the server to deploy Gitea, but Gitea is your target repository.

### Option 1: Direct Transfer with rsync (Recommended)

This is the **bootstrap approach** - transfer code directly, then push to Gitea once it's running.

```bash
# From your LOCAL machine - transfer entire repository
rsync -avz --progress \
  --exclude='.git' \
  --exclude='node_modules' \
  --exclude='__pycache__' \
  --exclude='.venv' \
  --exclude='*.pyc' \
  /Users/urtzialfaro/Documents/bakery-ia/ \
  bakery-vps:/root/bakery-ia/

# Verify transfer
ssh bakery-vps "ls -la /root/bakery-ia/infrastructure/"
```

### Option 2: SCP Tarball Transfer

```bash
# Create a tarball locally (excludes unnecessary files)
cd /Users/urtzialfaro/Documents/bakery-ia
tar -czvf /tmp/bakery-ia-infra.tar.gz \
  --exclude='.git' \
  --exclude='node_modules' \
  --exclude='__pycache__' \
  --exclude='.venv' \
  infrastructure/ \
  PRODUCTION_DEPLOYMENT_GUIDE.md \
  docs/

# Transfer to server
scp /tmp/bakery-ia-infra.tar.gz bakery-vps:/root/

# On server - extract
ssh bakery-vps "cd /root && tar -xzvf bakery-ia-infra.tar.gz"
```

### Option 3: Temporary GitHub/GitLab (If Needed)

Use if rsync/scp are not available:

1. Push to a **temporary private** GitHub/GitLab repo
2. Clone on the server
3. After Gitea is running, migrate the repo to Gitea
4. Delete the temporary remote repo

### After Transfer - Push to Gitea (Post Phase 4)

Once Gitea is deployed (Phase 4, Step 4.1), push the full repo:

```bash
# On the SERVER after Gitea is running
cd /root/bakery-ia
git init
# Rename branch to main (git init may create 'master' by default)
git branch -m main
git add .
git commit -m "Initial commit - production deployment"
git remote add origin https://gitea.bakewise.ai/bakery-admin/bakery-ia.git
git push -u origin main
```

---

## Phase 1: VPS Setup & MicroK8s Installation

### Step 1.1: Initial Server Setup

```bash
# SSH into your VPS
ssh bakery-vps

# Update system
apt update && apt upgrade -y

# Set hostname
hostnamectl set-hostname bakery-ia-prod

# Install essential tools
apt install -y curl wget git jq openssl
```

### Step 1.2: Install MicroK8s

```bash
# Install MicroK8s (stable channel)
snap install microk8s --classic --channel=1.28/stable

# Add user to microk8s group
usermod -a -G microk8s $USER
chown -f -R $USER ~/.kube
newgrp microk8s

# Wait for MicroK8s to be ready
microk8s status --wait-ready
```

### Step 1.3: Enable Required Addons

```bash
# Enable core addons (in order)
microk8s enable dns               # DNS resolution
microk8s enable hostpath-storage  # Storage provisioner
microk8s enable ingress           # NGINX ingress (class: "public")
microk8s enable cert-manager      # Let's Encrypt certificates
microk8s enable metrics-server    # HPA autoscaling
microk8s enable rbac              # Role-based access control

# Optional but recommended
microk8s enable prometheus        # Metrics collection

# Setup kubectl alias
echo "alias kubectl='microk8s kubectl'" >> ~/.bashrc
source ~/.bashrc

# Verify installation
kubectl get nodes          # Should show: Ready
kubectl get storageclass   # Should show: microk8s-hostpath (default)
kubectl get pods -A        # All pods should be Running
```

### Step 1.4: Configure kubectl Access

```bash
# Create kubectl config
mkdir -p ~/.kube
microk8s config > ~/.kube/config
chmod 600 ~/.kube/config

# Test cluster connectivity
kubectl cluster-info
kubectl top nodes
```

### Step 1.5: Install Helm

```bash
# Install Helm 3
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

# Verify installation
helm version
```

### Step 1.6: Configure Firewall (Optional)

> **Skip this step if:** Your VPS provider already has firewall rules configured in
their dashboard with ports 22, 80, 443 open. Most providers (clouding.io, Hetzner, etc.) manage this at the infrastructure level.

**Required ports for Bakery-IA:**

| Port | Protocol | Purpose |
|------|----------|---------|
| 22 | TCP | SSH access |
| 80 | TCP | HTTP (Let's Encrypt ACME challenges) |
| 443 | TCP | HTTPS (application access) |
| 25, 465, 587 | TCP | SMTP/SMTPS (if using Mailu) |
| 143, 993 | TCP | IMAP/IMAPS (if using Mailu) |

**Only if using UFW on the server:**

```bash
# Allow necessary ports
ufw allow 22/tcp    # SSH
ufw allow 80/tcp    # HTTP (required for Let's Encrypt)
ufw allow 443/tcp   # HTTPS

# Enable firewall (if not already enabled)
ufw enable

# Verify
ufw status verbose
```

---

## Phase 2: Domain & DNS Configuration

### Step 2.1: DNS Records Configuration

Add these DNS records pointing to your VPS IP (`200.234.233.87`):

| Type | Name | Value | TTL |
|------|------|-------|-----|
| A | @ | 200.234.233.87 | Auto |
| A | www | 200.234.233.87 | Auto |
| A | mail | 200.234.233.87 | Auto |
| A | monitoring | 200.234.233.87 | Auto |
| A | gitea | 200.234.233.87 | Auto |
| A | registry | 200.234.233.87 | Auto |
| A | api | 200.234.233.87 | Auto |
| MX | @ | mail.bakewise.ai | 10 |
| TXT | @ | v=spf1 mx a -all | Auto |
| TXT | _dmarc | v=DMARC1; p=reject; rua=mailto:admin@bakewise.ai | Auto |

### Step 2.2: Verify DNS Propagation

```bash
# Test DNS resolution (wait 5-10 minutes after changes)
dig bakewise.ai +short
dig www.bakewise.ai +short
dig mail.bakewise.ai +short
dig gitea.bakewise.ai +short

# Check MX records
dig bakewise.ai MX +short

# Use online tools for comprehensive check:
# https://dnschecker.org/
# https://mxtoolbox.com/
```

### Step 2.3: Cloudflare Configuration (If Using)

If using Cloudflare for DNS:

1. **SSL/TLS Mode:** Set to "Full (strict)"
2. **Proxy Status:** Set to "DNS only" (orange cloud OFF) for direct IP access
3. **Edge Certificates:** Let cert-manager handle certificates (not Cloudflare)

---

## Phase 3: Deploy Foundation Layer

### Step 3.1: Create Namespaces

```bash
# Apply namespace definitions using kustomize (-k flag)
kubectl apply -k infrastructure/namespaces/

# Verify
kubectl get namespaces
# Expected: bakery-ia, flux-system, tekton-pipelines

# Alternative: Apply individual namespace files directly
# kubectl apply -f infrastructure/namespaces/bakery-ia.yaml
# kubectl apply -f infrastructure/namespaces/flux-system.yaml
# kubectl apply -f infrastructure/namespaces/tekton-pipelines.yaml
```

### Step 3.2: Install Cert-Manager and Deploy ClusterIssuers

> **Note:** The MicroK8s `cert-manager` addon may only create the namespace without installing the actual components. Install cert-manager manually to ensure it works correctly.

```bash
# Check if cert-manager pods exist
kubectl get pods -n cert-manager

# If no pods are running, install cert-manager manually:
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.14.4/cert-manager.yaml

# Wait for all cert-manager pods to be ready (this may take 1-2 minutes)
kubectl wait --for=condition=ready pod --all -n cert-manager --timeout=300s

# Verify all 3 components are running
kubectl get pods -n cert-manager
# Expected output:
# NAME                                       READY   STATUS    RESTARTS   AGE
# cert-manager-xxxxxxxxxx-xxxxx              1/1     Running   0          1m
# cert-manager-cainjector-xxxxxxxxxx-xxxxx   1/1     Running   0          1m
# cert-manager-webhook-xxxxxxxxxx-xxxxx      1/1     Running   0          1m
```

**Deploy ClusterIssuers:**

```bash
# Wait for webhook to be fully initialized
sleep 10

# Apply ClusterIssuers for Let's Encrypt
kubectl apply -f infrastructure/platform/cert-manager/cluster-issuer-staging.yaml
kubectl apply -f infrastructure/platform/cert-manager/cluster-issuer-production.yaml

# Verify ClusterIssuers are ready
kubectl get clusterissuer
# Expected output:
# NAME                     READY   AGE
# letsencrypt-production   True    1m
# letsencrypt-staging      True    1m
```

**If you get webhook
errors:**

```bash
# The webhook may need more time to initialize
# Wait and retry:
sleep 30
kubectl apply -f infrastructure/platform/cert-manager/cluster-issuer-staging.yaml
kubectl apply -f infrastructure/platform/cert-manager/cluster-issuer-production.yaml
```

> **Note:** Common configs (secrets, configmaps) and TLS secrets are automatically included when you apply the prod kustomization in Phase 6. No manual application needed.

---

## Phase 4: Deploy CI/CD Infrastructure

### Step 4.1: Deploy Gitea (Git Server + Container Registry)

```bash
# Add Gitea Helm repository
helm repo add gitea https://dl.gitea.io/charts
helm repo update

# Generate and export admin password (REQUIRED for --production flag)
export GITEA_ADMIN_PASSWORD=$(openssl rand -base64 32)
echo "Gitea Admin Password: $GITEA_ADMIN_PASSWORD"
echo "⚠️ SAVE THIS PASSWORD SECURELY!"

# Run setup script - creates secrets and init job automatically
# The script will:
#   1. Create gitea namespace (if not exists)
#   2. Create gitea-admin-secret in gitea namespace
#   3. Create gitea-registry-secret in bakery-ia namespace
#   4. Apply gitea-init-job.yaml (creates bakery-ia repo)
cd /root/bakery-ia/infrastructure/cicd/gitea
chmod +x setup-admin-secret.sh
./setup-admin-secret.sh --production
cd /root/bakery-ia

# Install Gitea with production values
helm upgrade --install gitea gitea/gitea -n gitea \
  -f infrastructure/cicd/gitea/values.yaml \
  -f infrastructure/cicd/gitea/values-prod.yaml \
  --timeout 10m \
  --wait

# Wait for Gitea to be ready
kubectl wait --for=condition=ready pod -n gitea -l app.kubernetes.io/name=gitea --timeout=300s

# Verify
kubectl get pods -n gitea

# Check init job status (creates bakery-ia repository)
kubectl logs -n gitea -l app.kubernetes.io/component=init --tail=50
```

### Step 4.2: Push Repository to Gitea

```bash
cd /root/bakery-ia

# Fix Git ownership warning (common when using rsync as different user)
git config --global --add safe.directory /root/bakery-ia

# Configure git user (required for commits)
git config --global user.email "admin@bakewise.ai"
git config --global user.name "Bakery Admin"

# Initialize repository
git init

# Rename branch to main (git init may create 'master' by default)
git branch -m main

# Add all files and commit
git add .
git commit -m "Initial commit - production deployment"

# Add remote and push (you'll need the admin password from Step 4.1)
git remote add origin https://gitea.bakewise.ai/bakery-admin/bakery-ia.git

# Force push to overwrite init job's auto-generated content
# This is safe for initial deployment - your local code is the source of truth
git push -u origin main --force
```

### Step 4.3: Verify Registry Secret (Already Created)

> **Note:** The registry secret `gitea-registry-secret` was already created by `setup-admin-secret.sh` in Step 4.1.

```bash
# Verify the registry secret exists
kubectl get secret gitea-registry-secret -n bakery-ia

# Expected output:
# NAME                    TYPE                             DATA   AGE
# gitea-registry-secret   kubernetes.io/dockerconfigjson   1      Xm
```

### Step 4.4: Deploy Tekton (CI Pipelines)

```bash
# Step 1: Install Tekton Pipelines (the controller)
kubectl apply --filename https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml

# Wait for Tekton Pipelines to be ready
kubectl wait --for=condition=ready pod -l app.kubernetes.io/part-of=tekton-pipelines -n tekton-pipelines --timeout=300s

# Step 2: Install Tekton Triggers (for webhooks)
kubectl apply --filename https://storage.googleapis.com/tekton-releases/triggers/latest/release.yaml
kubectl apply --filename https://storage.googleapis.com/tekton-releases/triggers/latest/interceptors.yaml

# Wait for Tekton Triggers to be ready
kubectl wait --for=condition=ready pod -l app.kubernetes.io/part-of=tekton-triggers -n tekton-pipelines --timeout=300s

# Verify Tekton is installed
kubectl get pods -n tekton-pipelines

# Step 3: Create flux-system namespace (required by Tekton helm chart)
# The Tekton chart creates a secret for Flux in this namespace
kubectl create namespace flux-system --dry-run=client -o yaml | kubectl apply -f -

# Step 4: Get Gitea password and generate webhook token
export GITEA_ADMIN_PASSWORD=$(kubectl get secret gitea-admin-secret -n gitea -o jsonpath='{.data.password}' | base64 -d)
export TEKTON_WEBHOOK_TOKEN=$(openssl rand -hex 32)
echo "Tekton Webhook Token: $TEKTON_WEBHOOK_TOKEN"
echo "⚠️ SAVE THIS TOKEN - needed to configure Gitea webhook!"

# Step 5: Deploy Bakery-IA CI/CD pipelines and tasks
helm upgrade --install tekton-cicd infrastructure/cicd/tekton-helm \
  -n tekton-pipelines \
  -f infrastructure/cicd/tekton-helm/values.yaml \
  -f infrastructure/cicd/tekton-helm/values-prod.yaml \
  --set secrets.webhook.token=$TEKTON_WEBHOOK_TOKEN \
  --set secrets.registry.password=$GITEA_ADMIN_PASSWORD \
  --set secrets.git.password=$GITEA_ADMIN_PASSWORD \
  --timeout 5m

# Verify all components
kubectl get pods -n tekton-pipelines
kubectl get tasks -n tekton-pipelines
kubectl get pipelines -n tekton-pipelines
kubectl get eventlisteners -n tekton-pipelines
```

### Step 4.5: Deploy Flux CD (GitOps)

```bash
# Step 1: Install Flux CLI (required for bootstrap)
curl -s https://fluxcd.io/install.sh | sudo bash

# Verify Flux CLI installation
flux --version

# Step 2: Install Flux components (controllers and CRDs)
flux install --namespace=flux-system

# Wait for Flux controllers to be ready
kubectl wait --for=condition=ready pod -l app.kubernetes.io/part-of=flux -n flux-system --timeout=300s

# Verify Flux controllers are running
kubectl get pods -n flux-system

# Step 3: Create Git credentials secret for Flux to access Gitea
export GITEA_ADMIN_PASSWORD=$(kubectl get secret gitea-admin-secret -n gitea -o jsonpath='{.data.password}' | base64 -d)
kubectl create secret generic gitea-credentials \
  --namespace=flux-system \
  --from-literal=username=bakery-admin \
  --from-literal=password=$GITEA_ADMIN_PASSWORD

# Step 4: Deploy Bakery-IA Flux configuration (GitRepository + Kustomization)
helm upgrade --install flux-cd infrastructure/cicd/flux \
  -n flux-system \
  --timeout 5m

# Verify Flux resources
kubectl get gitrepository -n flux-system
kubectl get kustomization -n flux-system

# Check Flux sync status
flux get sources git -n flux-system
flux get kustomizations -n flux-system
```

## Phase 5: Pre-Pull and Push Base Images to Gitea Registry

> **Critical Step:** This phase must be completed after Gitea is configured (Phase 4) and before
deploying application services (Phase 6). It ensures all required base images are available in the Gitea registry.

### Overview

This phase involves two main steps:

1. **Step 5.1-5.4:** Pre-pull base images from Docker Hub and push them to Gitea registry
2. **Step 5.5:** Build and push all service images (first-time deployment only)

### Prerequisites: Install Docker and Create kubectl Symlink

> **Important:** MicroK8s uses containerd, not Docker. You need to install Docker separately for building and pushing images. Also, scripts need `kubectl` to be available in PATH.

```bash
# Step 1: Install Docker
apt-get update
apt-get install -y docker.io

# Start and enable Docker service
systemctl enable docker
systemctl start docker

# Verify Docker installation
docker --version
# Expected: Docker version 28.x.x or similar

# Step 2: Create kubectl symlink (required for scripts)
# MicroK8s bundles its own kubectl, but scripts need it in PATH
sudo ln -sf /snap/microk8s/current/microk8s-kubectl.wrapper /usr/local/bin/kubectl

# Verify kubectl works
kubectl version --client
```

### Base Images Required

The following base images must be available in the Gitea registry:

| Category | Image | Used By |
|----------|-------|---------|
| **Python Runtime** | `python:3.11-slim` | All microservices, gateway |
| **Frontend Build** | `node:18-alpine` | Frontend build stage |
| **Frontend Runtime** | `nginx:1.25-alpine` | Frontend production server |
| **Database** | `postgres:17-alpine` | All PostgreSQL instances |
| **Cache** | `redis:7.4-alpine` | Redis cache |
| **Message Broker** | `rabbitmq:4.1-management-alpine` | RabbitMQ |
| **Storage** | `minio/minio:RELEASE.2024-11-07T00-52-20Z` | MinIO object storage |
| **CI/CD** | `gcr.io/kaniko-project/executor:v1.23.0` | Tekton image builds |

---

### Step 5.1: Pre-Pull Base Images and Push to Registry

```bash
# Navigate to the scripts directory
cd /root/bakery-ia/scripts

# Make the script executable
chmod +x prepull-base-images-for-prod.sh

# Run the prepull script in production mode WITH push enabled
# IMPORTANT: Use -r flag to specify the external registry URL
./prepull-base-images-for-prod.sh -e prod --push-images -r registry.bakewise.ai

# The script will:
# 1. Authenticate with Docker Hub (uses embedded credentials or env vars)
# 2. Pull all required base images from Docker Hub/GHCR
# 3. Tag them for Gitea registry (bakery-admin namespace)
# 4. Push them to the Gitea container registry
# 5. Report success/failure for each image
```

**Alternative: Specify Custom Registry URL**

```bash
# If auto-detection fails or you need a specific registry URL:
./prepull-base-images-for-prod.sh -e prod --push-images -r registry.bakewise.ai
```

**Handle Docker Hub Rate Limits**

```bash
# If you hit Docker Hub rate limits, use your own credentials:
export DOCKER_HUB_USERNAME=your_username
export DOCKER_HUB_PASSWORD=your_password_or_token
./prepull-base-images-for-prod.sh -e prod --push-images
```

---

### Step 5.2: Verify Images in Gitea Registry

```bash
# Get Gitea admin password
export GITEA_ADMIN_PASSWORD=$(kubectl get secret gitea-admin-secret -n gitea -o jsonpath='{.data.password}' | base64 -d)

# Login to Gitea registry
# Note: Use registry.bakewise.ai for external access
docker login registry.bakewise.ai -u bakery-admin -p $GITEA_ADMIN_PASSWORD

# List all images in the registry
curl -s -u bakery-admin:$GITEA_ADMIN_PASSWORD https://registry.bakewise.ai/v2/_catalog | jq

# Verify specific critical images exist
echo "Checking Python base image..."
curl -s -u bakery-admin:$GITEA_ADMIN_PASSWORD https://registry.bakewise.ai/v2/bakery-admin/python/tags/list | jq

echo "Checking Node.js base image..."
curl -s -u bakery-admin:$GITEA_ADMIN_PASSWORD https://registry.bakewise.ai/v2/bakery-admin/node/tags/list | jq

echo "Checking Nginx base image..."
curl -s -u bakery-admin:$GITEA_ADMIN_PASSWORD https://registry.bakewise.ai/v2/bakery-admin/nginx/tags/list | jq
```

**Alternative: Verify via Gitea Web Interface**

1. Visit `https://gitea.bakewise.ai`
2. Login with username: `bakery-admin`, password: (from secret)
3. Navigate to **Packages** > **Container**
4. Verify images are listed under the `bakery-admin` namespace
5. Confirm tags match expected versions (`3.11-slim`, `18-alpine`, `1.25-alpine`, etc.)

---

### Step 5.3: Troubleshooting Image Issues

**Registry Not Accessible**

```bash
# Check Gitea pods are running
kubectl get pods -n gitea

# Check Gitea service
kubectl get svc -n gitea

# Check ingress for registry
kubectl get ingress -n gitea

# View Gitea logs for registry errors
kubectl logs -n gitea -l app.kubernetes.io/name=gitea --tail=100
```

**Images Failed to Push**

```bash
# Verify Docker can reach the registry
docker info | grep -i registry

# Test registry connectivity
curl -v https://registry.bakewise.ai/v2/

# Check for TLS certificate issues
openssl s_client -connect registry.bakewise.ai:443 -servername registry.bakewise.ai
```

**Re-run Failed Images Only**

```bash
# Manually pull and push a specific image
docker pull python:3.11-slim
docker tag python:3.11-slim registry.bakewise.ai/bakery-admin/python:3.11-slim
docker push registry.bakewise.ai/bakery-admin/python:3.11-slim
```

---

### Step 5.4: Verify CI/CD Pipeline Can Access Images

```bash
# Verify gitea-registry-secret exists in bakery-ia namespace
kubectl get secret gitea-registry-secret -n bakery-ia

# Check the secret contains correct registry URL
kubectl get secret gitea-registry-secret -n bakery-ia \
  -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d | jq '.auths | keys[]'

# Test that Kubernetes can pull images using the secret
# Create a test pod that uses the base image
cat <
```

> **If this test fails:** The CI/CD pipeline and application deployments will not be able to pull images. Check:
> 1. Registry URL in the secret matches your setup
> 2. Credentials are correct
> 3. Images were successfully pushed in Step 5.1

---

### Step 5.5: Build and Push Service Images (First-Time Deployment Only)

> **Critical:** For the first deployment, service images don't exist in the registry yet. You must build and push them before applying the production kustomization in Phase 6.

#### Option A: Use the Automated Build Script (Recommended)

```bash
# Navigate to the repository root
cd /root/bakery-ia

# Make the script executable
chmod +x scripts/build-all-services.sh

# Run the build script
# This will build and push all 21 services to the Gitea registry
./scripts/build-all-services.sh
```

The script builds the following services:

| Service | Image Name | Dockerfile |
|---------|------------|------------|
| Gateway | `gateway` | `gateway/Dockerfile` |
| Frontend | `dashboard` | `frontend/Dockerfile.kubernetes` |
| Auth | `auth-service` | `services/auth/Dockerfile` |
| Tenant | `tenant-service` | `services/tenant/Dockerfile` |
| Training | `training-service` | `services/training/Dockerfile` |
| Forecasting | `forecasting-service` | `services/forecasting/Dockerfile` |
| Sales | `sales-service` | `services/sales/Dockerfile` |
| Inventory | `inventory-service` | `services/inventory/Dockerfile` |
| Recipes | `recipes-service` | `services/recipes/Dockerfile` |
| Suppliers | `suppliers-service` | `services/suppliers/Dockerfile` |
| POS | `pos-service` | `services/pos/Dockerfile` |
| Orders | `orders-service` | `services/orders/Dockerfile` |
| Production | `production-service` | `services/production/Dockerfile` |
| Procurement | `procurement-service` | `services/procurement/Dockerfile` |
| Distribution | `distribution-service` | `services/distribution/Dockerfile` |
| External | `external-service` | `services/external/Dockerfile` |
| Notification | `notification-service` | `services/notification/Dockerfile` |
| Orchestrator | `orchestrator-service` | `services/orchestrator/Dockerfile` |
| Alert Processor | `alert-processor` | `services/alert_processor/Dockerfile` |
| AI Insights | `ai-insights-service` | `services/ai_insights/Dockerfile` |
| Demo Session | `demo-session-service` | `services/demo_session/Dockerfile` |

#### Option B: Trigger CI/CD Pipeline

If Tekton is properly configured, you can trigger the CI/CD pipeline instead:

```bash
cd /root/bakery-ia

# Create an empty commit to trigger the pipeline
git commit --allow-empty -m "Trigger initial CI/CD build"
git push origin main

# Monitor pipeline execution
kubectl get pipelineruns -n tekton-pipelines --watch

# Wait for all builds to complete (may take 20-30 minutes)
kubectl wait --for=condition=Succeeded pipelinerun --all -n tekton-pipelines --timeout=1800s
```

#### Option C: Build Individual Services Manually

```bash
# Get credentials
export GITEA_ADMIN_PASSWORD=$(kubectl get secret gitea-admin-secret -n gitea -o jsonpath='{.data.password}' | base64 -d)
export REGISTRY="registry.bakewise.ai/bakery-admin"

# Login to registry
docker login registry.bakewise.ai -u bakery-admin -p $GITEA_ADMIN_PASSWORD

# Build and push a single service (example: auth-service)
docker build -t $REGISTRY/auth-service:latest \
  --build-arg BASE_REGISTRY=$REGISTRY \
  --build-arg PYTHON_IMAGE=python:3.11-slim \
  -f services/auth/Dockerfile .
docker push $REGISTRY/auth-service:latest

# Build and push frontend
docker build -t $REGISTRY/dashboard:latest \
  -f frontend/Dockerfile.kubernetes frontend/
docker push $REGISTRY/dashboard:latest
```

---

### Step 5.6: Verify All Service Images Are Available

```bash
# Get Gitea admin password
export GITEA_ADMIN_PASSWORD=$(kubectl get secret gitea-admin-secret -n gitea -o jsonpath='{.data.password}' | base64 -d)

# List all images in the registry
echo "=== Images in Gitea Registry ==="
curl -s -u bakery-admin:$GITEA_ADMIN_PASSWORD https://registry.bakewise.ai/v2/_catalog | jq -r '.repositories[]' | sort

# Verify critical service images exist
for service in gateway dashboard auth-service tenant-service forecasting-service; do
  echo -n "Checking $service... "
  if curl -s -u bakery-admin:$GITEA_ADMIN_PASSWORD \
    "https://registry.bakewise.ai/v2/bakery-admin/$service/tags/list" | jq -e '.tags' > /dev/null 2>&1; then
    echo "✅ OK"
  else
    echo "❌ MISSING"
  fi
done
```

> **Ready for Phase 6:** Once all service images are verified in the registry, you can proceed to Phase 6: Deploy Application Services.

## Phase 6: Deploy Application Services

> **Prerequisite:** This phase assumes that all service images have been built and pushed to the Gitea registry (completed in Phase 5, Step 5.5). The production kustomization references these pre-built images.

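The Step 5.6 loop spot-checks five images, while the production kustomization needs every image from the Step 5.5 table. The following is a minimal sketch of a full check, not part of the repository's scripts: `EXPECTED_IMAGES` and `missing_images` are hypothetical names, and the sketch assumes the registry catalog lists repositories as `bakery-admin/<image>` (which the tag URLs in Step 5.2 suggest but the guide does not state). The catalog endpoint may also paginate on larger registries.

```bash
# Sketch: report which images from the Step 5.5 table are absent from a registry catalog.
# EXPECTED_IMAGES mirrors the "Image Name" column of that table (21 entries).
EXPECTED_IMAGES="gateway dashboard auth-service tenant-service training-service \
forecasting-service sales-service inventory-service recipes-service \
suppliers-service pos-service orders-service production-service \
procurement-service distribution-service external-service \
notification-service orchestrator-service alert-processor \
ai-insights-service demo-session-service"

# missing_images CATALOG PREFIX
#   CATALOG: newline-separated repository names (e.g. from /v2/_catalog via jq)
#   PREFIX:  registry namespace (assumed here to be "bakery-admin")
# Prints each expected image that has no exact "PREFIX/name" line in CATALOG.
missing_images() {
  local catalog="$1" prefix="$2" name
  for name in $EXPECTED_IMAGES; do
    if ! printf '%s\n' "$catalog" | grep -qxF "${prefix}/${name}"; then
      printf '%s\n' "$name"
    fi
  done
}

# Usage on the server (same registry URL and credentials as Step 5.6):
#   catalog=$(curl -s -u "bakery-admin:$GITEA_ADMIN_PASSWORD" \
#     https://registry.bakewise.ai/v2/_catalog | jq -r '.repositories[]')
#   missing=$(missing_images "$catalog" "bakery-admin")
#   if [ -z "$missing" ]; then echo "All service images present"; else echo "Missing: $missing"; fi
```

Keeping the comparison in a function that takes the catalog as text means it can be tried against a saved catalog listing before pointing it at the live registry.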
### Step 6.1: Apply Production Certificate

```bash
# Apply the production TLS certificate
kubectl apply -f infrastructure/environments/prod/k8s-manifests/prod-certificate.yaml

# Verify certificate is issued
kubectl get certificate -n bakery-ia
kubectl describe certificate bakery-ia-prod-tls-cert -n bakery-ia
```

### Step 6.2: Deploy Application with Kustomize

```bash
# Apply the complete production configuration
kubectl apply -k infrastructure/environments/prod/k8s-manifests

# Wait for all deployments to be ready (10-15 minutes)
kubectl wait --for=condition=available --timeout=900s deployment --all -n bakery-ia

# Monitor deployment progress (Ctrl+C to stop watching)
kubectl get pods -n bakery-ia --watch

# If the deployment fails, re-sync the code and redeploy.
# From your local machine (Mac):
rsync -avz --progress --delete \
  --exclude='.git' \
  --exclude='node_modules' \
  --exclude='__pycache__' \
  --exclude='.venv' \
  /Users/urtzialfaro/Documents/bakery-ia/ \
  bakery-vps:/root/bakery-ia/

# On the VPS: remove the failed workloads, then re-apply
kubectl delete deployments --all -n bakery-ia
kubectl delete jobs --all -n bakery-ia
kubectl delete statefulsets --all -n bakery-ia
sleep 30
kubectl apply -k infrastructure/environments/prod/k8s-manifests
kubectl get pods -n bakery-ia -w

# Check pod status and node resource allocation
kubectl get pods -n bakery-ia
kubectl describe node | grep -A 10 "Allocated resources"
```

### Step 6.3: Verify Application Health

```bash
# Check all pods are running
kubectl get pods -n bakery-ia

# Check services
kubectl get svc -n bakery-ia

# Check ingress
kubectl get ingress -n bakery-ia

# Test gateway health
kubectl exec -n bakery-ia deployment/gateway -- curl -s http://localhost:8000/health
```

---

## Phase 7: Deploy Optional Services

### Step 7.1: Deploy Unbound DNS (Required for Mailu)

```bash
# Deploy Unbound DNS resolver
helm upgrade --install unbound infrastructure/platform/networking/dns/unbound-helm \
  -n bakery-ia \
  -f infrastructure/platform/networking/dns/unbound-helm/values.yaml \
  -f infrastructure/platform/networking/dns/unbound-helm/prod/values.yaml \
  --timeout 5m \
  --wait

# Get Unbound service IP
UNBOUND_IP=$(kubectl get svc unbound-dns -n bakery-ia -o jsonpath='{.spec.clusterIP}')
echo "Unbound DNS IP: $UNBOUND_IP"
```

### Step 7.2: Configure CoreDNS for DNSSEC

```bash
# Patch CoreDNS to forward to Unbound
kubectl patch configmap coredns -n kube-system --type merge -p "{
  \"data\": {
    \"Corefile\": \".:53 {\\n errors\\n health {\\n lameduck 5s\\n }\\n ready\\n kubernetes cluster.local in-addr.arpa ip6.arpa {\\n pods insecure\\n fallthrough in-addr.arpa ip6.arpa\\n ttl 30\\n }\\n prometheus :9153\\n forward . $UNBOUND_IP {\\n max_concurrent 1000\\n }\\n cache 30\\n loop\\n reload\\n loadbalance\\n}\\n\"
  }
}"

# Restart CoreDNS
kubectl rollout restart deployment coredns -n kube-system
kubectl rollout status deployment coredns -n kube-system --timeout=60s
```

### Step 7.3: Deploy Mailu Email Server

```bash
# Add Mailu Helm repository
helm repo add mailu https://mailu.github.io/helm-charts
helm repo update

# Apply Mailu configuration secrets
# These are pre-configured with secure defaults
kubectl apply -f infrastructure/platform/mail/mailu-helm/configs/mailu-admin-credentials-secret.yaml -n bakery-ia
kubectl apply -f infrastructure/platform/mail/mailu-helm/configs/mailu-certificates-secret.yaml -n bakery-ia

# Install Mailu with production configuration
# The Helm chart uses the pre-configured secrets for admin credentials and TLS certificates
helm upgrade --install mailu mailu/mailu \
  -n bakery-ia \
  -f infrastructure/platform/mail/mailu-helm/values.yaml \
  -f infrastructure/platform/mail/mailu-helm/prod/values.yaml \
  --timeout 10m

# Wait for Mailu to be ready
kubectl wait --for=condition=available --timeout=600s deployment/mailu-front -n bakery-ia

# Verify Mailu pods are running
kubectl get pods -n bakery-ia | grep mailu

# Get the admin password from the pre-configured secret
MAILU_ADMIN_PASSWORD=$(kubectl get secret mailu-admin-credentials -n bakery-ia -o jsonpath='{.data.password}' | base64 -d)
echo "Mailu Admin Password: $MAILU_ADMIN_PASSWORD"
echo "⚠️ SAVE THIS PASSWORD SECURELY!"

# Check Mailu initialization status
kubectl logs -n bakery-ia deployment/mailu-front --tail=10
```

> **Important Notes about Mailu Deployment:**
>
> 1. **Pre-Configured Secrets:** Mailu uses pre-configured secrets for admin credentials and TLS certificates. These are defined in the configuration files.
>
> 2. **Password Management:** The admin password is stored in `mailu-admin-credentials-secret.yaml`. For production, update it with a secure password before deployment.
>
> 3. **TLS Certificates:** The self-signed certificates in `mailu-certificates-secret.yaml` only secure internal communication between Mailu services. External clients are served the Let's Encrypt certificate through the ingress, so the self-signed certificates do not need to be replaced (see Step 7.3.1).
>
> 4. **Initialization Time:** Mailu may take 5-10 minutes to fully initialize. During this time, some pods may restart as the system configures itself.
>
> 5. **Accessing Mailu:**
>    - Webmail: `https://mail.bakewise.ai/webmail`
>    - Admin Interface: `https://mail.bakewise.ai/admin`
>    - Username: `admin@bakewise.ai`
>    - Password: (from `mailu-admin-credentials-secret.yaml`)
>
> 6. **Mailgun Relay:** The production configuration includes Mailgun SMTP relay. Configure your Mailgun credentials in `mailu-mailgun-credentials-secret.yaml` before deployment.

### Step 7.3.1: Mailu Configuration Notes

> **Important Information about Mailu Certificates:**
>
> 1. **Dual Certificate Architecture:**
>    - **Internal Communication:** Uses self-signed certificates (`mailu-certificates-secret.yaml`)
>    - **External Communication:** Uses Let's Encrypt certificates via NGINX Ingress (`bakery-ia-prod-tls-cert`)
>
> 2. **No Certificate Replacement Needed:** The self-signed certificates are only used for internal communication between Mailu services. External clients connect through the NGINX Ingress Controller, which uses the publicly trusted Let's Encrypt certificates.
>
> 3. **Certificate Flow:**
>    ```
>    External Client → NGINX Ingress (Let's Encrypt) → Internal Network → Mailu Services (Self-signed)
>    ```
>
> 4. **Security:** This architecture is secure because:
>    - External connections use publicly trusted certificates
>    - Internal connections are still encrypted (even if self-signed)
>    - Ingress terminates TLS, reducing load on Mailu services
>
> 5. **Mailgun Relay Configuration:** For outbound email delivery, configure your Mailgun credentials:
>    ```bash
>    # Edit the Mailgun credentials secret
>    nano infrastructure/platform/mail/mailu-helm/configs/mailu-mailgun-credentials-secret.yaml
>
>    # Apply the secret
>    kubectl apply -f infrastructure/platform/mail/mailu-helm/configs/mailu-mailgun-credentials-secret.yaml -n bakery-ia
>
>    # Restart Mailu to pick up the new relay configuration
>    kubectl rollout restart deployment -n bakery-ia -l app.kubernetes.io/instance=mailu
>    ```

### Step 7.4: Deploy SigNoz Monitoring

```bash
# Add SigNoz Helm repository
helm repo add signoz https://charts.signoz.io
helm repo update

# Install SigNoz
helm install signoz signoz/signoz \
  -n bakery-ia \
  -f infrastructure/monitoring/signoz/signoz-values-prod.yaml \
  --set global.storageClass="microk8s-hostpath" \
  --set clickhouse.persistence.enabled=true \
  --set clickhouse.persistence.size=50Gi \
  --timeout 15m

# Wait for SigNoz to be ready
kubectl wait --for=condition=available --timeout=600s deployment/signoz-frontend -n bakery-ia

# Verify
kubectl get pods -n bakery-ia -l app.kubernetes.io/instance=signoz
```

---

## Phase 8: Verification & Validation

### Step 8.1: Complete Verification Checklist

```bash
# 1. Check all pods are running
# (--no-headers keeps the column header row out of the filtered output)
kubectl get pods -n bakery-ia --no-headers | grep -vE "Running|Completed"
# Should return NO results

# 2. Check services
kubectl get svc -n bakery-ia

# 3. Check ingress
kubectl get ingress -n bakery-ia

# 4. Check certificates
kubectl get certificate -n bakery-ia
kubectl describe certificate bakery-ia-prod-tls-cert -n bakery-ia

# 5. Check PVCs
kubectl get pvc -n bakery-ia
```

### Step 8.2: Test Application Endpoints

```bash
# Test frontend (from external machine)
curl -I https://bakewise.ai
# Expected: HTTP/2 200

# Test API health
curl https://bakewise.ai/api/v1/health
# Expected: {"status": "healthy"}

# Test monitoring
curl -I https://monitoring.bakewise.ai/signoz
# Expected: HTTP/2 200
```

### Step 8.3: Test Database Connections

```bash
# Test PostgreSQL SSL
kubectl exec -n bakery-ia deployment/auth-db -- sh -c \
  'psql -U auth_user -d auth_db -c "SHOW ssl;"'
# Expected: on

# Test Redis
kubectl exec -n bakery-ia deployment/redis -- redis-cli ping
# Expected: PONG
```

### Step 8.4: Production Validation Checklist

- [ ] Application accessible at `https://bakewise.ai`
- [ ] Monitoring accessible at `https://monitoring.bakewise.ai`
- [ ] SSL certificates valid (check with browser)
- [ ] All services running and healthy
- [ ] Database connections working with TLS
- [ ] CI/CD pipeline operational
- [ ] Email service working (if deployed)
- [ ] Pilot coupon verified (check tenant-service logs)

---

## Post-Deployment Operations

### Configure Stripe Keys (Required Before Going Live)

Before accepting payments, configure your Stripe credentials:

```bash
# Edit ConfigMap for publishable key
nano infrastructure/environments/common/configs/configmap.yaml
# Add: VITE_STRIPE_PUBLISHABLE_KEY: "pk_live_XXXXXXXXXXXX"

# Encode your secret keys
echo -n "sk_live_XXXXXXXXXX" | base64   # Your secret key
echo -n "whsec_XXXXXXXXXX" | base64     # Your webhook secret

# Edit Secrets
nano infrastructure/environments/common/configs/secrets.yaml
# Add to payment-secrets section:
# STRIPE_SECRET_KEY:
# STRIPE_WEBHOOK_SECRET:

# Apply the updated configuration
kubectl apply -k infrastructure/environments/prod/k8s-manifests

# Restart services that use Stripe
kubectl rollout restart deployment/payment-service -n bakery-ia
```

### Backup Strategy

```bash
# Create backup script
cat > ~/backup-databases.sh << 'EOF'
#!/bin/bash
BACKUP_DIR="/backups/$(date +%Y-%m-%d)"
mkdir -p "$BACKUP_DIR"

# Backup all databases
for db in auth tenant training forecasting ai-insights sales inventory production procurement distribution recipes suppliers pos orders external notification alert-processor orchestrator demo-session; do
  echo "Backing up ${db}-db..."
  kubectl exec -n bakery-ia deployment/${db}-db -- \
    pg_dump -U ${db}_user -d ${db}_db > "$BACKUP_DIR/${db}.sql"
done

# Compress
tar -czf "$BACKUP_DIR.tar.gz" "$BACKUP_DIR"
rm -rf "$BACKUP_DIR"

# Keep only last 7 days
find /backups -name "*.tar.gz" -mtime +7 -delete

echo "Backup completed: $BACKUP_DIR.tar.gz"
EOF

chmod +x ~/backup-databases.sh

# Setup daily cron job (2 AM)
(crontab -l 2>/dev/null; echo "0 2 * * * ~/backup-databases.sh") | crontab -
```

### Scaling Guidelines

| Tenants | RAM | CPU | Storage | Monthly Cost |
|---------|-----|-----|---------|--------------|
| 10 | 20 GB | 8 cores | 200 GB | €40-80 |
| 25 | 32 GB | 12 cores | 300 GB | €80-120 |
| 50 | 48 GB | 16 cores | 500 GB | €150-200 |
| 100+ | Consider multi-node cluster | | | €300+ |

### Regular Maintenance Tasks

| Frequency | Task |
|-----------|------|
| Daily | Check logs and alerts |
| Weekly | Review resource utilization |
| Monthly | Update dependencies, security patches |
| Quarterly | Review backup procedures, disaster recovery |

---

## Troubleshooting Guide

### Common Issues

#### Pods Stuck in Pending State

```bash
# Check node resources
kubectl describe nodes

# Check PVC status
kubectl get pvc -n bakery-ia

# Check events
kubectl get events -n bakery-ia --sort-by='.lastTimestamp'
```

#### Certificate Not Issuing

```bash
# Check cluster issuer
kubectl get clusterissuer

# Check certificate status
kubectl describe certificate -n bakery-ia

# Check cert-manager logs
kubectl logs -n cert-manager deployment/cert-manager

# Verify ports 80/443 are open
curl -I http://bakewise.ai
```

#### Services Not Accessible

```bash
# Check ingress
kubectl describe ingress -n bakery-ia

# Check ingress controller logs
kubectl logs -n ingress deployment/nginx-ingress-microk8s-controller

# Check endpoints
kubectl get endpoints -n bakery-ia
```

#### Database Connection Errors

```bash
# Check database pod
kubectl get pods -n bakery-ia -l app.kubernetes.io/component=database

# Check database logs
kubectl logs -n bakery-ia deployment/auth-db

# Test connection from service
kubectl exec -n bakery-ia deployment/auth-service -- nc -zv auth-db 5432
```

#### Out of Resources

```bash
# Check node resources
kubectl top nodes

# Check pod resource usage
kubectl top pods -n bakery-ia --sort-by=memory

# Scale down non-critical services temporarily
kubectl scale deployment monitoring -n bakery-ia --replicas=0
```

---

## Reference & Resources

### Key File Locations

| Configuration | File Path |
|---------------|-----------|
| ConfigMap | `infrastructure/environments/common/configs/configmap.yaml` |
| Secrets | `infrastructure/environments/common/configs/secrets.yaml` |
| Prod Kustomization | `infrastructure/environments/prod/k8s-manifests/kustomization.yaml` |
| Cert-Manager Issuer | `infrastructure/platform/cert-manager/cluster-issuer-production.yaml` |
| Ingress | `infrastructure/platform/networking/ingress/base/ingress.yaml` |
| Gitea Values | `infrastructure/cicd/gitea/values.yaml` |
| Mailu Values | `infrastructure/platform/mail/mailu-helm/values.yaml` |

### Production URLs

| Service | URL |
|---------|-----|
| Main Application | https://bakewise.ai |
| API | https://bakewise.ai/api/v1/... |
| Monitoring | https://monitoring.bakewise.ai |
| Gitea | https://gitea.bakewise.ai |
| Registry | https://registry.bakewise.ai |
| Webmail | https://mail.bakewise.ai/webmail |
| Mail Admin | https://mail.bakewise.ai/admin |

### External Documentation

- [MicroK8s Documentation](https://microk8s.io/docs)
- [Kubernetes Documentation](https://kubernetes.io/docs)
- [Let's Encrypt Documentation](https://letsencrypt.org/docs)
- [SigNoz Documentation](https://signoz.io/docs/)
- [OpenTelemetry Documentation](https://opentelemetry.io/docs/)

### Support Resources

- **Operations Guide:** [PRODUCTION_OPERATIONS_GUIDE.md](./docs/PRODUCTION_OPERATIONS_GUIDE.md)
- **Pilot Launch Guide:** [PILOT_LAUNCH_GUIDE.md](./docs/PILOT_LAUNCH_GUIDE.md)
- **Infrastructure README:** [infrastructure/README.md](./infrastructure/README.md)

---

## Conclusion

This guide provides a complete, step-by-step process for deploying Bakery-IA to production. Key highlights:

1. **Bootstrap Approach:** Transfer code to server first, then push to Gitea
2. **Layered Deployment:** Components deployed in dependency order
3. **Production Ready:** TLS everywhere, monitoring, CI/CD, backups
4. **Scalable:** Designed for 10-100+ tenants with clear scaling path

For questions or issues, refer to the troubleshooting guide or consult the support resources listed above.