# Bakery-IA Pilot Launch Guide

**Complete guide for deploying to production for a 10-tenant pilot program**

**Last Updated:** 2026-01-20
**Target Environment:** clouding.io VPS with MicroK8s
**Estimated Cost:** €41-81/month
**Time to Deploy:** 3-5 hours (first time, including fixes)
**Status:** ⚠️ REQUIRES PRE-DEPLOYMENT FIXES - See [Production VPS Deployment Fixes](../PRODUCTION_VPS_DEPLOYMENT_FIXES.md)
**Version:** 3.0

---

## Table of Contents

1. [Executive Summary](#executive-summary)
2. [Infrastructure Architecture Overview](#infrastructure-architecture-overview)
3. [⚠️ CRITICAL: Pre-Deployment Configuration](#critical-pre-deployment-configuration)
4. [Pre-Launch Checklist](#pre-launch-checklist)
5. [VPS Provisioning](#vps-provisioning)
6. [Infrastructure Setup](#infrastructure-setup)
7. [Domain & DNS Configuration](#domain--dns-configuration)
8. [TLS/SSL Certificates](#tlsssl-certificates)
9. [Email & Communication Setup](#email--communication-setup)
10. [Kubernetes Deployment](#kubernetes-deployment)
11. [Configuration & Secrets](#configuration--secrets)
12. [Database Migrations](#database-migrations)
13. [CI/CD Infrastructure Deployment](#cicd-infrastructure-deployment)
14. [Mailu Email Server Deployment](#mailu-email-server-deployment)
15. [Nominatim Geocoding Service](#nominatim-geocoding-service)
16. [SigNoz Monitoring Deployment](#signoz-monitoring-deployment)
17. [Verification & Testing](#verification--testing)
18. [Post-Deployment](#post-deployment)

---

## Executive Summary

### What You're Deploying

A complete multi-tenant SaaS platform with:

- **18 microservices** (auth, tenant, ML forecasting, inventory, sales, orders, etc.)
- **14 PostgreSQL databases** with TLS encryption
- **Redis cache** with TLS
- **RabbitMQ** message broker
- **Monitoring stack** (Prometheus, Grafana, AlertManager)
- **Full security** (TLS, RBAC, audit logging)

### Total Cost Breakdown

| Service | Provider | Monthly Cost |
|---------|----------|-------------|
| VPS Server (20GB RAM, 8 vCPU, 200GB SSD) | clouding.io | €40-80 |
| Domain | Namecheap/Cloudflare | €1.25 (€15/year) |
| Email | Zoho Free / Gmail | €0 |
| WhatsApp API | Meta Business | €0 (1k free conversations) |
| DNS | Cloudflare | €0 |
| SSL | Let's Encrypt | €0 |
| **TOTAL** | | **€41-81/month** |

### Timeline

| Phase | Duration | Description |
|-------|----------|-------------|
| Pre-Launch Setup | 1-2 hours | Domain, VPS provisioning, accounts setup |
| Infrastructure Setup | 1 hour | MicroK8s installation, firewall config |
| Deployment | 30-60 min | Deploy all services and databases |
| Verification | 30-60 min | Test that everything works |
| **Total** | **3-5 hours** | First-time deployment |

---

## Infrastructure Architecture Overview

### Component Layers

The Bakery-IA platform is organized into distinct infrastructure layers, each with specific deployment dependencies.

```
┌────────────────────────────────────────────────────────────────────────┐
│  LAYER 6: APPLICATION                                                  │
│  Frontend │ Gateway │ 18 Microservices │ CronJobs & Workers            │
├────────────────────────────────────────────────────────────────────────┤
│  LAYER 5: MONITORING                                                   │
│  SigNoz (Unified Observability) │ AlertManager │ OTel Collector        │
├────────────────────────────────────────────────────────────────────────┤
│  LAYER 4: PLATFORM SERVICES (Optional)                                 │
│  Mailu (Email) │ Nominatim (Geocoding) │ CI/CD (Tekton, Flux, Gitea)   │
├────────────────────────────────────────────────────────────────────────┤
│  LAYER 3: DATA & STORAGE                                               │
│  PostgreSQL (18 DBs) │ Redis │ RabbitMQ │ MinIO                        │
├────────────────────────────────────────────────────────────────────────┤
│  LAYER 2: NETWORK & SECURITY                                           │
│  Unbound DNS │ CoreDNS │ Ingress Controller │ Cert-Manager │ TLS       │
├────────────────────────────────────────────────────────────────────────┤
│  LAYER 1: FOUNDATION                                                   │
│  Namespaces │ Storage Classes │ RBAC │ ConfigMaps │ Secrets            │
├────────────────────────────────────────────────────────────────────────┤
│  LAYER 0: KUBERNETES CLUSTER                                           │
│  MicroK8s (Production) │ Kind (Local Dev) │ EKS (AWS Alternative)      │
└────────────────────────────────────────────────────────────────────────┘
```

### Deployment Order & Dependencies

Components must be deployed in a specific order due to dependencies:

```
1. Namespaces (bakery-ia, tekton-pipelines, flux-system)
   ↓
2. Cert-Manager & ClusterIssuers
   ↓
3. TLS Certificates (internal + ingress)
   ↓
4. Unbound DNS Resolver (required for Mailu DNSSEC)
   ↓
5. CoreDNS Configuration (forward to Unbound)
   ↓
6. Ingress Controller & Resources
   ↓
7. Data Layer: PostgreSQL, Redis, RabbitMQ, MinIO
   ↓
8. Database Migrations
   ↓
9. Application Services (18 microservices)
   ↓
10. Gateway & Frontend
    ↓
11. (Optional) CI/CD: Gitea → Tekton → Flux
    ↓
12. (Optional) Mailu Email Server
    ↓
13. (Optional) Nominatim Geocoding
    ↓
14. (Optional) SigNoz Monitoring
```

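
One way to respect this ordering in a deploy script is to block on each stage before starting the next. The following is a minimal sketch, not something shipped in the repository: `wait_for_stage` and the `KUBECTL` variable are hypothetical names, and it assumes the workloads in a namespace are Deployments (StatefulSets or Jobs would need a different condition).

```shell
#!/usr/bin/env bash
# Sketch: gate each deployment stage on the previous one being ready.
# Assumption: the stage's workloads are Deployments in a single namespace.

KUBECTL="${KUBECTL:-kubectl}"   # e.g. KUBECTL="microk8s kubectl" on the VPS

# wait_for_stage <label> <namespace> [timeout]
# Blocks until every Deployment in the namespace reports Available,
# or returns non-zero when the timeout expires.
wait_for_stage() {
  local label="$1" namespace="$2" timeout="${3:-300s}"
  echo "Waiting for stage: ${label} (namespace ${namespace})"
  if $KUBECTL wait --for=condition=Available deployment --all \
       -n "$namespace" --timeout="$timeout"; then
    echo "OK: ${label}"
  else
    echo "FAILED: ${label}" >&2
    return 1
  fi
}

# Usage (run after applying the manifests for each stage):
#   wait_for_stage "Cert-Manager" cert-manager
#   wait_for_stage "Application services" bakery-ia 600s
```

Note that `kubectl wait` returns an error if the namespace contains no Deployments yet, so call it only after the stage's manifests have been applied.
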
### Infrastructure Components Summary

| Component | Purpose | Required | Namespace |
|-----------|---------|----------|-----------|
| **MicroK8s** | Kubernetes cluster | Yes | - |
| **Cert-Manager** | TLS certificate management | Yes | cert-manager |
| **Ingress-Nginx** | External traffic routing | Yes | ingress |
| **PostgreSQL** | 18 service databases | Yes | bakery-ia |
| **Redis** | Caching & sessions | Yes | bakery-ia |
| **RabbitMQ** | Message broker | Yes | bakery-ia |
| **MinIO** | Object storage (ML models) | Yes | bakery-ia |
| **Unbound DNS** | DNSSEC resolver | For Mailu | bakery-ia |
| **Mailu** | Self-hosted email server | Optional | bakery-ia |
| **Nominatim** | Geocoding service | Optional | bakery-ia |
| **Gitea** | Git server + container registry | Optional | gitea |
| **Tekton** | CI/CD pipelines | Optional | tekton-pipelines |
| **Flux CD** | GitOps deployment | Optional | flux-system |
| **SigNoz** | Unified observability | Recommended | bakery-ia |

### Quick Reference: What to Deploy

**Minimal Production Setup:**
- Kubernetes cluster + addons
- Core infrastructure (databases, cache, broker)
- Application services
- External email (Zoho/Gmail)

**Full Production Setup (Recommended):**
- Everything above, plus:
  - Mailu (self-hosted email)
  - SigNoz (monitoring)
  - CI/CD (Gitea + Tekton + Flux)
  - Nominatim (if geocoding needed)

---

## ⚠️ CRITICAL: Pre-Deployment Configuration

**READ THIS FIRST:** Review and complete these configuration steps before deploying to production.

### Infrastructure Architecture (Updated)

The infrastructure has been reorganized with the following structure:

```
infrastructure/
├── environments/                  # Environment-specific configs
│   ├── common/configs/            # Shared ConfigMaps and Secrets
│   │   ├── configmap.yaml         # Application configuration
│   │   ├── secrets.yaml           # All secrets (database, JWT, Redis, etc.)
│   │   └── kustomization.yaml
│   ├── dev/k8s-manifests/         # Development Kustomization
│   └── prod/k8s-manifests/        # Production Kustomization & patches
├── platform/                      # Platform-level infrastructure
│   ├── cert-manager/              # TLS certificate issuers (Let's Encrypt)
│   ├── networking/ingress/        # NGINX ingress (base + overlays)
│   ├── storage/                   # PostgreSQL, Redis, MinIO
│   ├── gateway/                   # API Gateway service
│   └── mail/mailu-helm/           # Email server (Helm chart)
├── services/                      # Application services
│   ├── databases/                 # 19 PostgreSQL database instances
│   └── microservices/             # 19 microservices
├── cicd/                          # CI/CD (deployed via Helm, NOT kustomize)
│   ├── gitea/                     # Git server + container registry
│   ├── tekton-helm/               # CI pipelines
│   └── flux/                      # GitOps deployment
└── monitoring/signoz/             # SigNoz observability (via Helm)
```

### 🔴 Configuration Status

| Item | Status | File Location |
|------|--------|---------------|
| Production Secrets | ✅ Configured | `infrastructure/environments/common/configs/secrets.yaml` |
| Cert-Manager Email | ✅ Configured | `infrastructure/platform/cert-manager/cluster-issuer-production.yaml` |
| SigNoz Namespace | ✅ Uses bakery-ia | `infrastructure/environments/prod/k8s-manifests/kustomization.yaml` |
| imagePullSecrets | ✅ Auto-patched | Production kustomization adds `gitea-registry-secret` automatically |
| Image Tags | ⚠️ Update for releases | `infrastructure/environments/prod/k8s-manifests/kustomization.yaml` |
| Stripe Keys | ⚠️ Configure before launch | ConfigMap + Secrets |
| Pilot Coupon | ✅ Auto-seeded | `app/jobs/startup_seeder.py` |

### Required Configuration Changes

#### 1. imagePullSecrets - ✅ **AUTOMATICALLY HANDLED**
**Status:** ✅ The production kustomization automatically patches all workloads
**File:** `infrastructure/environments/prod/k8s-manifests/kustomization.yaml`

The production overlay adds `gitea-registry-secret` to all Deployments, StatefulSets, Jobs, and CronJobs via Kustomize patches:
```yaml
patches:
  - target:
      kind: Deployment
    patch: |-
      - op: add
        path: /spec/template/spec/imagePullSecrets
        value:
          - name: gitea-registry-secret
```

**Note:** The `gitea-registry-secret` is created by `infrastructure/cicd/gitea/sync-registry-secret.sh` after Gitea deployment.

#### 2. Update Image Tags to Semantic Versions (FOR RELEASES)
**Why:** Using `latest` causes non-deterministic deployments
**Impact if skipped:** Unpredictable behavior, impossible rollbacks
**File:** `infrastructure/environments/prod/k8s-manifests/kustomization.yaml`

For production releases, update the `images:` section from `latest` to semantic versions.

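
A release pipeline can guard against a forgotten `latest` tag with a small check before applying the overlay. This is a sketch under the assumption that tags appear as `newTag:` entries in the kustomization file, as in the `sed` command used later in this guide; `check_no_latest_tags` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# check_no_latest_tags <kustomization.yaml>
# Prints any line that still pins an image to "latest" and returns non-zero
# in that case; returns zero when every newTag has an explicit version.
check_no_latest_tags() {
  local file="$1"
  if grep -nE '^[[:space:]]*newTag:[[:space:]]*"?latest"?[[:space:]]*$' "$file"; then
    echo "ERROR: 'latest' tags found in ${file}" >&2
    return 1
  fi
  echo "OK: no 'latest' tags in ${file}"
}

# Usage:
#   check_no_latest_tags infrastructure/environments/prod/k8s-manifests/kustomization.yaml
```
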
#### 3. Production Secrets - ✅ **ALREADY CONFIGURED**
**Status:** ✅ Strong production secrets have been pre-generated
**File:** `infrastructure/environments/common/configs/secrets.yaml`

Pre-configured secrets include:
- **19 database passwords** (24-character URL-safe random strings)
- **JWT secrets** (256-bit cryptographically secure)
- **Redis password** (24-character random string)
- **RabbitMQ credentials**
- **PostgreSQL monitoring user** for SigNoz metrics collection

#### 4. Cert-Manager Email - ✅ **ALREADY CONFIGURED**
**Status:** ✅ Email set to `admin@bakewise.ai`
**File:** `infrastructure/platform/cert-manager/cluster-issuer-production.yaml`

#### 5. Update Stripe Keys (HIGH PRIORITY)
**Why:** Payment processing requires production Stripe keys
**Impact if skipped:** Payments will use test mode (no real charges)

**ConfigMap** (`infrastructure/environments/common/configs/configmap.yaml`):
```yaml
VITE_STRIPE_PUBLISHABLE_KEY: "pk_live_XXXXXXXXXXXXXXXXXXXX"
```

**Secrets** (`infrastructure/environments/common/configs/secrets.yaml`):
```yaml
# Add to payment-secrets section (base64 encoded)
STRIPE_SECRET_KEY: <base64-encoded-secret-key>
STRIPE_WEBHOOK_SECRET: <base64-encoded-webhook-secret>
```

Get your keys from: https://dashboard.stripe.com/apikeys

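
The base64 values for the Secret can be produced on the command line. The sketch below uses a hypothetical helper, `encode_secret`, and a placeholder key; the important detail is `printf '%s'`, because `echo` would append a newline that ends up inside the decoded secret.

```shell
#!/usr/bin/env bash
# encode_secret <value>
# Prints the base64 encoding of the value on a single line.
# `tr -d '\n'` removes the line wrapping that some base64 implementations add.
encode_secret() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Example with a placeholder value (not a real key):
encode_secret 'sk_live_XXXXXXXXXXXXXXXXXXXX'
echo
```

Paste the printed value after `STRIPE_SECRET_KEY:` and repeat for the webhook secret. Avoid leaving real keys in shell history.
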
#### 6. Pilot Coupon Configuration - ✅ **AUTO-SEEDED**
**Status:** ✅ Automatically created when tenant-service starts
**How it works:** `app/jobs/startup_seeder.py` creates the PILOT2025 coupon

Default pilot settings (in the ConfigMap, can be customized):
- `VITE_PILOT_MODE_ENABLED: "true"` - Enables pilot UI features
- `VITE_PILOT_COUPON_CODE: "PILOT2025"` - Coupon code for 3 months free
- `VITE_PILOT_TRIAL_MONTHS: "3"` - Trial extension duration

### ✅ Already Correct (No Changes Needed)

- **Storage Class** - Uses MicroK8s default storage provisioner
- **Domain Names** - `bakewise.ai` configured in production overlay
- **Service Types** - ClusterIP + Ingress is correct architecture
- **Network Policies** - Defined in `infrastructure/platform/security/network-policies/`
- **SigNoz Namespace** - ✅ Uses `bakery-ia` namespace (unified with application)
- **OTEL Configuration** - ✅ Pre-configured for SigNoz in production patches
- **Replica Counts** - ✅ Production replicas defined in kustomization (2-3 per service)

### Step-by-Step Configuration Script

Run these commands on your **local machine** before deployment:

```bash
# Navigate to repository root
cd /path/to/bakery-ia

# ========================================
# STEP 1: Verify Infrastructure Structure
# ========================================
echo "Step 1: Verifying new infrastructure structure..."
echo "Checking directories..."
ls -d infrastructure/environments/common/configs/ && echo "✅ Common configs"
ls -d infrastructure/environments/prod/k8s-manifests/ && echo "✅ Prod kustomization"
ls -d infrastructure/platform/cert-manager/ && echo "✅ Cert-manager"
ls -d infrastructure/cicd/gitea/ && echo "✅ Gitea CI/CD"

# ========================================
# STEP 2: Update Image Tags (for releases)
# ========================================
echo -e "\nStep 2: Updating image tags for release..."
export VERSION="1.0.0"  # Change this to your version

# Update application image tags in production kustomization
sed -i.bak "s/newTag: latest/newTag: v${VERSION}/g" \
  infrastructure/environments/prod/k8s-manifests/kustomization.yaml

# Verify (show first 10 image entries)
echo "Current image tags:"
grep -A 1 "name: bakery/" infrastructure/environments/prod/k8s-manifests/kustomization.yaml | head -20

# ========================================
# STEP 3: Verify Production Secrets
# ========================================
echo -e "\nStep 3: Verifying production secrets..."
echo "✅ Production secrets are pre-configured with strong passwords:"
echo "   - 19 database passwords (24-char URL-safe random)"
echo "   - JWT secrets (256-bit cryptographically secure)"
echo "   - Redis password (24-char random)"
echo "   - RabbitMQ credentials"
echo "   - PostgreSQL monitoring user for SigNoz"
echo ""
echo "Location: infrastructure/environments/common/configs/secrets.yaml"

# Quick verification: count the database password entries
echo "$(grep -c "_DB_PASSWORD:" infrastructure/environments/common/configs/secrets.yaml) database password entries found"

# ========================================
# STEP 4: Verify Cert-Manager Email
# ========================================
echo -e "\nStep 4: Verifying cert-manager email..."
grep "email:" infrastructure/platform/cert-manager/cluster-issuer-production.yaml
# Should show: email: admin@bakewise.ai

# ========================================
# STEP 5: Verify imagePullSecrets Patch
# ========================================
echo -e "\nStep 5: Verifying imagePullSecrets configuration..."
grep -A 5 "gitea-registry-secret" infrastructure/environments/prod/k8s-manifests/kustomization.yaml && \
  echo "✅ imagePullSecrets patch configured" || \
  echo "⚠️  WARNING: imagePullSecrets patch missing"

# ========================================
# STEP 6: Configure Stripe Keys (MANUAL)
# ========================================
echo -e "\nStep 6: Stripe Configuration..."
echo "================================================================"
echo "⚠️  MANUAL STEP REQUIRED"
echo ""
echo "1. Edit ConfigMap:"
echo "   File: infrastructure/environments/common/configs/configmap.yaml"
echo "   Add: VITE_STRIPE_PUBLISHABLE_KEY: \"pk_live_XXXX\""
echo ""
echo "2. Edit Secrets:"
echo "   File: infrastructure/environments/common/configs/secrets.yaml"
echo "   Add to payment-secrets (base64 encoded):"
echo "     STRIPE_SECRET_KEY: <base64-encoded>"
echo "     STRIPE_WEBHOOK_SECRET: <base64-encoded>"
echo ""
echo "Get keys from: https://dashboard.stripe.com/apikeys"
echo "================================================================"
read -p "Press Enter when Stripe keys are configured..."

# ========================================
# STEP 7: Validate Kustomization Build
# ========================================
echo -e "\nStep 7: Validating Kustomization..."
cd infrastructure/environments/prod/k8s-manifests
kustomize build . > /dev/null 2>&1 && \
  echo "✅ Kustomization builds successfully" || \
  echo "⚠️  WARNING: Kustomization build failed"
cd - > /dev/null

# ========================================
# FINAL VALIDATION
# ========================================
echo -e "\n========================================"
echo "Pre-Deployment Configuration Complete!"
echo "========================================"
echo ""
echo "Validation Checklist:"
echo "  ✅ Infrastructure structure verified"
echo "  ✅ Image tags updated to v${VERSION}"
echo "  ✅ Production secrets pre-configured"
echo "  ✅ Cert-manager email: admin@bakewise.ai"
echo "  ✅ imagePullSecrets auto-patched via Kustomize"
echo "  ⚠️  Stripe keys configured (manual verification)"
echo "  ✅ Pilot coupon auto-seeded on startup"
echo ""
echo "Next Steps:"
echo "  1. Deploy CI/CD: Gitea, Tekton, Flux (via Helm)"
echo "  2. Push images to Gitea registry"
echo "  3. Apply Kustomization to cluster"
```

### Manual Verification

After running the script above:

1. **Verify production secrets are configured:**
   ```bash
   # Check secrets file has strong passwords
   head -80 infrastructure/environments/common/configs/secrets.yaml
   # Should show base64-encoded passwords for all 19 databases
   ```

2. **Check image tags in production overlay:**
   ```bash
   grep "newTag:" infrastructure/environments/prod/k8s-manifests/kustomization.yaml | head -10
   # For releases: should show v1.0.0 (your version)
   # For development: 'latest' is acceptable
   ```

3. **Verify imagePullSecrets patch:**
   ```bash
   grep -B 2 -A 6 "imagePullSecrets" infrastructure/environments/prod/k8s-manifests/kustomization.yaml
   # Should show patches for Deployment, StatefulSet, Job, CronJob
   ```

4. **Verify OTEL/SigNoz configuration:**
   ```bash
   grep "OTEL_EXPORTER" infrastructure/environments/prod/k8s-manifests/kustomization.yaml
   # Should show: http://signoz-otel-collector.bakery-ia.svc.cluster.local:4317
   ```

5. **Test Kustomize build:**
   ```bash
   cd infrastructure/environments/prod/k8s-manifests
   kustomize build . | kubectl apply --dry-run=client -f -
   # Should complete without errors
   ```

### Key File Locations Reference

| Configuration | File Path |
|---------------|-----------|
| ConfigMap | `infrastructure/environments/common/configs/configmap.yaml` |
| Secrets | `infrastructure/environments/common/configs/secrets.yaml` |
| Prod Kustomization | `infrastructure/environments/prod/k8s-manifests/kustomization.yaml` |
| Cert-Manager Issuer | `infrastructure/platform/cert-manager/cluster-issuer-production.yaml` |
| Ingress | `infrastructure/platform/networking/ingress/base/ingress.yaml` |
| Gitea Values | `infrastructure/cicd/gitea/values.yaml` |
| Mailu Values | `infrastructure/platform/mail/mailu-helm/values.yaml` |

---

## Pre-Launch Checklist

### Required Accounts & Services

- [ ] **Domain Name**
  - Register at Namecheap or Cloudflare (€10-15/year)
  - Suggested: `bakeryforecast.es` or `bakery-ia.com`

- [ ] **VPS Account**
  - Sign up at [clouding.io](https://www.clouding.io)
  - Payment method configured

- [ ] **Email Service** - Self-hosted Mailu with Mailgun relay
  - Mailu deployed via Helm chart (see [Mailu Email Server Deployment](#mailu-email-server-deployment))
  - Mailgun account for outbound relay (improves deliverability)
  - DNS records configured (MX, SPF, DKIM, DMARC)

- [ ] **WhatsApp Business API**
  - Create Meta Business Account (free)
  - Verify business identity
  - Phone number ready (non-VoIP)

- [ ] **DNS Access**
  - Cloudflare account (free, recommended)
  - Or domain registrar DNS panel access

- [ ] **Container Registry** (Choose ONE)
  - Option A: Docker Hub account (recommended)
  - Option B: GitHub Container Registry
  - Option C: MicroK8s built-in registry

### Required Tools on Local Machine

```bash
# Verify you have these installed:
kubectl version --client
docker --version
git --version
ssh -V
openssl version

# Install if missing (macOS):
brew install kubectl docker git openssh openssl
```

### Repository Setup

```bash
# Clone the repository
git clone https://github.com/yourusername/bakery-ia.git
cd bakery-ia

# Verify structure (production overlay)
ls infrastructure/environments/prod/k8s-manifests/
```

---

## VPS Provisioning

### Recommended Configuration

**For 10-tenant pilot program:**
- **RAM:** 20 GB
- **CPU:** 8 vCPU cores
- **Storage:** 200 GB NVMe SSD (triple replica)
- **Network:** 1 Gbps connection
- **OS:** Ubuntu 22.04 LTS
- **Monthly Cost:** €40-80 (check current pricing)

### Why These Specs?

**Memory Breakdown:**
- Application services: 14.1 GB
- Databases (18 instances): 4.6 GB
- Infrastructure (Redis, RabbitMQ): 0.8 GB
- Gateway/Frontend: 1.8 GB
- Monitoring: 1.5 GB
- System overhead: ~3 GB
- **Total:** ~26 GB at peak allocation; 20 GB is sufficient in practice with HPA, since services do not all run at peak at the same time

**Storage Breakdown:**
- Databases: 36 GB (18 × 2GB)
- ML Models: 10 GB
- Redis: 1 GB
- RabbitMQ: 2 GB
- Prometheus metrics: 20 GB
- Container images: ~30 GB
- Growth buffer: 100 GB
- **Total:** 199 GB

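
The totals can be re-checked whenever a line item changes. The sketch below simply adds up the figures listed above; `sum_values` is a hypothetical helper, not a script from the repository.

```shell
#!/usr/bin/env bash
# sum_values <printf-format> <value>...
# Adds the given numbers and prints the total using the given format.
sum_values() {
  local fmt="$1"
  shift
  printf '%s\n' "$@" | LC_ALL=C awk -v fmt="$fmt" '{ total += $1 } END { printf fmt "\n", total }'
}

# Memory line items in GB: services, databases, infrastructure, gateway, monitoring, system
echo "Memory (GB):  $(sum_values '%.1f' 14.1 4.6 0.8 1.8 1.5 3)"    # prints 25.8

# Storage line items in GB: databases, ML models, Redis, RabbitMQ, metrics, images, buffer
echo "Storage (GB): $(sum_values '%d' 36 10 1 2 20 30 100)"         # prints 199
```

The memory items sum to 25.8 GB, which the breakdown rounds to ~26 GB, and the storage items sum to 199 GB, just under the 200 GB disk.
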
### Provisioning Steps

1. **Create VPS at clouding.io:**
   ```
   1. Log in to clouding.io dashboard
   2. Click "Create New Server"
   3. Select:
      - OS: Ubuntu 22.04 LTS
      - RAM: 20 GB
      - CPU: 8 vCPU
      - Storage: 200 GB NVMe SSD
      - Location: Barcelona (best for Spain)
   4. Set hostname: bakery-ia-prod-01
   5. Add SSH key (or use password)
   6. Create server
   ```

2. **Note your server details:**
   ```bash
   # Save these for later:
   VPS_IP="YOUR_VPS_IP_ADDRESS"
   VPS_ROOT_PASSWORD="YOUR_ROOT_PASSWORD"  # If not using SSH key
   ```

3. **Initial SSH connection:**
   ```bash
   # Test connection
   ssh root@$VPS_IP

   # Update system
   apt update && apt upgrade -y
   ```

---

## Infrastructure Setup

### Step 1: Install MicroK8s

**Using MicroK8s for production VPS deployment on clouding.io**

```bash
# SSH into your VPS
ssh root@$VPS_IP

# Update system
apt update && apt upgrade -y

# Install MicroK8s
snap install microk8s --classic --channel=1.28/stable

# Add your user to the microk8s group
usermod -a -G microk8s $USER
mkdir -p ~/.kube
chown -f -R $USER ~/.kube

# Reload group membership (this starts a new shell; run the remaining commands inside it)
newgrp microk8s

# Verify installation
microk8s status --wait-ready
```

### Step 2: Enable Required MicroK8s Addons

**All required components are available as MicroK8s addons:**

```bash
# Enable core addons
microk8s enable dns                # DNS resolution within cluster
microk8s enable hostpath-storage   # Provides microk8s-hostpath storage class
microk8s enable ingress            # Nginx ingress controller (uses class "public")
microk8s enable cert-manager       # Let's Encrypt SSL certificates
microk8s enable metrics-server     # For HPA autoscaling
microk8s enable rbac               # Role-based access control

# Setup kubectl alias
echo "alias kubectl='microk8s kubectl'" >> ~/.bashrc
source ~/.bashrc

# Verify all components are running
kubectl get nodes
# Should show: Ready

kubectl get storageclass
# Should show: microk8s-hostpath (default)

kubectl get pods -A
# Should show pods in: kube-system, ingress, cert-manager namespaces

# Verify ingress controller is running
kubectl get pods -n ingress
# Should show: nginx-ingress-microk8s-controller-xxx Running

# Verify cert-manager is running
kubectl get pods -n cert-manager
# Should show: cert-manager-xxx, cert-manager-webhook-xxx, cert-manager-cainjector-xxx

# Verify metrics-server is working
kubectl top nodes
# Should return CPU/Memory metrics
```

**Important - MicroK8s Ingress Class:**
- MicroK8s ingress addon uses class name `public` (NOT `nginx`)
- The ClusterIssuers in this repo are already configured with `class: public`
- If you see cert-manager challenges failing, verify the ingress class matches

**Optional but Recommended:**
```bash
# Enable Prometheus for additional monitoring (optional)
microk8s enable prometheus

# Enable registry if you want local image storage (optional)
microk8s enable registry
```

### Step 3: Enhanced Infrastructure Components

**The platform includes additional infrastructure components that enhance security, monitoring, and operations:**

```bash
# The platform includes Mailu for email services
# Deploy Mailu via Helm (optional but recommended for production):
kubectl create namespace bakery-ia --dry-run=client -o yaml | kubectl apply -f -
helm repo add mailu https://mailu.github.io/helm-charts
helm repo update
helm install mailu mailu/mailu \
  -n bakery-ia \
  -f infrastructure/platform/mail/mailu-helm/values.yaml \
  --timeout 10m \
  --wait

# Verify Mailu deployment
kubectl get pods -n bakery-ia | grep mailu
```

**For development environments, ensure the prepull-base-images script is run:**
```bash
# On your local machine, run the prepull script to cache base images
cd bakery-ia
chmod +x scripts/prepull-base-images.sh
./scripts/prepull-base-images.sh
```

**For production environments, ensure CI/CD infrastructure is properly configured:**
```bash
# Tekton Pipelines for CI/CD (optional - can be deployed separately)
kubectl create namespace tekton-pipelines
kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml
kubectl apply -f https://storage.googleapis.com/tekton-releases/triggers/latest/release.yaml

# Flux CD for GitOps (already enabled in MicroK8s if needed)
# flux install --namespace=flux-system --network-policy=false
```

### Step 4: Configure Firewall

**CRITICAL:** Ports 80 and 443 must be open for Let's Encrypt HTTP-01 challenges to work.

```bash
# Allow necessary ports
ufw allow 22/tcp      # SSH
ufw allow 80/tcp      # HTTP - REQUIRED for Let's Encrypt HTTP-01 challenge
ufw allow 443/tcp     # HTTPS - For your application traffic
ufw allow 16443/tcp   # Kubernetes API (optional, for remote kubectl access)

# Enable firewall
ufw enable

# Check status
ufw status verbose

# Expected output should include:
# 80/tcp                     ALLOW       Anywhere
# 443/tcp                    ALLOW       Anywhere
```

**Also check clouding.io firewall:**
- Log in to clouding.io dashboard
- Go to your VPS → Firewall settings
- Ensure ports 80 and 443 are allowed from anywhere (0.0.0.0/0)

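
Rather than reading the `ufw status` output by eye, the required rules can be checked with a small filter. This is a sketch that only inspects the host firewall (the clouding.io firewall still has to be checked in the dashboard); `check_ufw_ports` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# check_ufw_ports <port>...
# Reads `ufw status` output on stdin and reports, for each TCP port,
# whether an ALLOW rule is present. Returns non-zero if any port is missing.
check_ufw_ports() {
  local status port missing=0
  status="$(cat)"
  for port in "$@"; do
    if printf '%s\n' "$status" | grep -Eq "^${port}/tcp[[:space:]]+ALLOW"; then
      echo "OK: ${port}/tcp allowed"
    else
      echo "MISSING: ${port}/tcp" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage on the VPS:
#   ufw status | check_ufw_ports 22 80 443
```
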
### Step 5: Create Namespace

```bash
# Create bakery-ia namespace (safe to re-run if it was already created in Step 3)
kubectl create namespace bakery-ia --dry-run=client -o yaml | kubectl apply -f -

# Verify
kubectl get namespaces
```

---

## Domain & DNS Configuration

### Step 1: Register Domain at Namecheap

1. Go to [Namecheap](https://www.namecheap.com)
2. Search for your desired domain (e.g., `bakewise.ai`)
3. Complete purchase (~€10-15/year)
4. Save domain credentials

### Step 2: Configure DNS at Namecheap

1. **Access DNS settings:**
   ```
   1. Log in to Namecheap
   2. Go to Domain List → Manage → Advanced DNS
   ```

2. **Add DNS records pointing to your VPS:**
   ```
   Type    Host    Value           TTL
   A       @       YOUR_VPS_IP     Automatic
   A       *       YOUR_VPS_IP     Automatic
   ```

   This points both `bakewise.ai` and all subdomains (`*.bakewise.ai`) to your VPS.

3. **Test DNS propagation:**
   ```bash
   # Wait 5-10 minutes, then test
   nslookup bakewise.ai
   nslookup api.bakewise.ai
   nslookup mail.bakewise.ai
   ```

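
To confirm that each name resolves to the VPS address, not just that it resolves at all, the lookups can be compared against the expected IP. The sketch below assumes `dig` is installed locally (the guide itself uses `nslookup`); `resolve_a`, `check_dns`, and the `DOMAIN` variable are hypothetical names.

```shell
#!/usr/bin/env bash
# resolve_a <hostname>
# Prints the A records for a hostname, one per line.
# Assumption: `dig` (dnsutils / bind-tools) is available.
resolve_a() {
  dig +short A "$1"
}

# check_dns <expected-ip> <hostname>...
# Returns non-zero if any hostname does not resolve to the expected IP.
check_dns() {
  local expected="$1" host failed=0
  shift
  for host in "$@"; do
    if resolve_a "$host" | grep -Fxq "$expected"; then
      echo "OK: ${host} -> ${expected}"
    else
      echo "NOT READY: ${host} does not resolve to ${expected}" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Usage (DOMAIN is a placeholder for your registered domain):
#   check_dns "$VPS_IP" "$DOMAIN" "api.$DOMAIN" "mail.$DOMAIN"
```

If a record is proxied through a CDN, the lookup returns the proxy's address instead of the VPS IP, so this check applies to unproxied records only.
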
### Step 3 (Optional): Configure Cloudflare DNS

1. **Add site to Cloudflare:**
   ```
   1. Log in to Cloudflare
   2. Click "Add a Site"
   3. Enter your domain name
   4. Choose Free plan
   5. Cloudflare will scan existing DNS records
   ```

2. **Update nameservers at registrar:**
   ```
   Point your domain's nameservers to Cloudflare:
   - NS1: assigned.cloudflare.com
   - NS2: assigned.cloudflare.com
   (Cloudflare will provide the exact values)
   ```

3. **Add DNS records:**
   ```
   Type    Name        Content           TTL     Proxy
   A       @           YOUR_VPS_IP       Auto    Yes
   A       www         YOUR_VPS_IP       Auto    Yes
   A       api         YOUR_VPS_IP       Auto    Yes
   A       monitoring  YOUR_VPS_IP       Auto    Yes
   CNAME   *           yourdomain.com    Auto    No
   ```

4. **Configure SSL/TLS mode:**
   ```
   SSL/TLS tab → Overview → Set to "Full (strict)"
   ```

5. **Test DNS propagation:**
   ```bash
   # Wait 5-10 minutes, then test
   nslookup yourdomain.com
   nslookup api.yourdomain.com
   ```

---

## TLS/SSL Certificates

### Understanding Certificate Setup

The platform uses **two layers** of SSL/TLS:

1. **External (Ingress) SSL:** Let's Encrypt for public HTTPS
2. **Internal (Database) SSL:** Self-signed certificates for database connections

### Step 1: Generate Internal Certificates

```bash
# On your local machine
cd infrastructure/tls

# Generate certificates
./generate-certificates.sh

# This creates:
# - ca/         (Certificate Authority)
# - postgres/   (PostgreSQL server certs)
# - redis/      (Redis server certs)
```

**Certificate Details:**
- Root CA: 10-year validity (expires 2035)
- Server certs: 3-year validity (expires October 2028)
- Algorithm: RSA 4096-bit
- Signature: SHA-256

### Step 2: Create Kubernetes Secrets

```bash
# Create PostgreSQL TLS secret
kubectl create secret generic postgres-tls \
  --from-file=server-cert.pem=infrastructure/tls/postgres/server-cert.pem \
  --from-file=server-key.pem=infrastructure/tls/postgres/server-key.pem \
  --from-file=ca-cert.pem=infrastructure/tls/postgres/ca-cert.pem \
  -n bakery-ia

# Create Redis TLS secret
kubectl create secret generic redis-tls \
  --from-file=redis-cert.pem=infrastructure/tls/redis/redis-cert.pem \
  --from-file=redis-key.pem=infrastructure/tls/redis/redis-key.pem \
  --from-file=ca-cert.pem=infrastructure/tls/redis/ca-cert.pem \
  -n bakery-ia

# Verify secrets created
kubectl get secrets -n bakery-ia | grep tls
```

### Step 3: Configure Let's Encrypt (External SSL)

cert-manager is already enabled via `microk8s enable cert-manager`. The ClusterIssuer is pre-configured in the repository.

**Important:** MicroK8s ingress addon uses ingress class `public` (not `nginx`). This is already configured in:
- `infrastructure/platform/cert-manager/cluster-issuer-production.yaml`
- `infrastructure/platform/cert-manager/cluster-issuer-staging.yaml`

```bash
# On VPS, apply the pre-configured ClusterIssuers
kubectl apply -k infrastructure/platform/cert-manager/

# Verify ClusterIssuers are ready
kubectl get clusterissuer
kubectl describe clusterissuer letsencrypt-production

# Expected output:
# NAME                     READY   AGE
# letsencrypt-production   True    1m
# letsencrypt-staging      True    1m
```

**Configuration details (already set):**
- **Email:** `admin@bakewise.ai` (receives Let's Encrypt expiry notifications)
- **Ingress class:** `public` (MicroK8s default)
- **Challenge type:** HTTP-01 (requires port 80 open)

**If you need to customize the email**, edit before applying:
```bash
# Edit the production issuer
nano infrastructure/platform/cert-manager/cluster-issuer-production.yaml
# Change: email: admin@bakewise.ai → email: your-email@yourdomain.com
```

---

## Email & Communication Setup

### Self-Hosted Mailu with Mailgun Relay

**Architecture:**
- **Mailu** - Self-hosted email server (Postfix, Dovecot, Rspamd, Roundcube webmail)
- **Mailgun** - External SMTP relay for improved outbound deliverability
- **Helm deployment** - `infrastructure/platform/mail/mailu-helm/`

**Features:**
- ✅ Full control over email infrastructure
- ✅ Mailgun relay improves deliverability (avoids VPS IP reputation issues)
- ✅ Built-in antispam (rspamd) with DNSSEC validation
- ✅ Webmail interface (Roundcube) at `/webmail`
- ✅ Admin panel at `/admin`
- ✅ IMAP/SMTP with TLS
- ✅ Professional addresses: admin@bakewise.ai, noreply@bakewise.ai

**Configuration Files:**

| File | Purpose |
|------|---------|
| `infrastructure/platform/mail/mailu-helm/values.yaml` | Base Mailu configuration |
| `infrastructure/platform/mail/mailu-helm/prod/values.yaml` | Production overrides |
| `infrastructure/platform/mail/mailu-helm/configs/mailgun-credentials-secret.yaml` | Mailgun SMTP credentials |

**Internal SMTP for Application Services:**
```yaml
# Services use Mailu's internal postfix for sending
SMTP_HOST: mailu-postfix.bakery-ia.svc.cluster.local
SMTP_PORT: 587
```

#### Prerequisites

Before deploying Mailu, ensure:
1. **Unbound DNS is deployed** (for DNSSEC validation)
2. **CoreDNS is configured** to forward to Unbound
3. **DNS records are configured** for your domain

#### Step 1: Configure DNS Records

Add these DNS records for your domain (e.g., bakewise.ai):

```
Type   Name     Value                                  TTL
A      mail     YOUR_VPS_IP                            Auto
MX     @        mail.bakewise.ai (priority 10)         Auto
TXT    @        v=spf1 mx a include:mailgun.org -all   Auto
TXT    _dmarc   v=DMARC1; p=reject; rua=...            Auto
```

The `include:mailgun.org` term authorizes the Mailgun relay used for outbound mail; without it, relayed messages can fail SPF under the strict `-all` policy.

**DKIM record** will be generated after Mailu is running - you'll add it later.

#### Step 2: Deploy Unbound DNS Resolver

Unbound provides DNSSEC validation required by Mailu for email authentication.

```bash
# On VPS - Deploy Unbound via Helm
helm upgrade --install unbound infrastructure/platform/networking/dns/unbound-helm \
  -n bakery-ia \
  --create-namespace \
  -f infrastructure/platform/networking/dns/unbound-helm/values.yaml \
  -f infrastructure/platform/networking/dns/unbound-helm/prod/values.yaml \
  --timeout 5m \
  --wait

# Verify Unbound is running
kubectl get pods -n bakery-ia | grep unbound
# Should show: unbound-xxx 1/1 Running

# Get Unbound service IP (needed for CoreDNS configuration)
UNBOUND_IP=$(kubectl get svc unbound-dns -n bakery-ia -o jsonpath='{.spec.clusterIP}')
echo "Unbound DNS IP: $UNBOUND_IP"
```

#### Step 3: Configure CoreDNS for DNSSEC

Mailu requires DNSSEC validation. Configure CoreDNS to forward external queries to Unbound:

```bash
# Get the Unbound service IP
UNBOUND_IP=$(kubectl get svc unbound-dns -n bakery-ia -o jsonpath='{.spec.clusterIP}')

# Patch CoreDNS to forward to Unbound
kubectl patch configmap coredns -n kube-system --type merge -p "{
  \"data\": {
    \"Corefile\": \".:53 {\\n errors\\n health {\\n lameduck 5s\\n }\\n ready\\n kubernetes cluster.local in-addr.arpa ip6.arpa {\\n pods insecure\\n fallthrough in-addr.arpa ip6.arpa\\n ttl 30\\n }\\n prometheus :9153\\n forward . $UNBOUND_IP {\\n max_concurrent 1000\\n }\\n cache 30 {\\n disable success cluster.local\\n disable denial cluster.local\\n }\\n loop\\n reload\\n loadbalance\\n}\\n\"
  }
}"

# Restart CoreDNS to apply changes
kubectl rollout restart deployment coredns -n kube-system
kubectl rollout status deployment coredns -n kube-system --timeout=60s

# Verify DNSSEC is working
# Query a DNSSEC-signed domain; unsigned zones never return the ad flag
kubectl run -it --rm debug --image=alpine --restart=Never -- \
  sh -c "apk add drill && drill -D cloudflare.com"
# Should show: ;; flags: ... ad ... (ad = authenticated data = DNSSEC valid)
```

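Reading the `ad` flag by eye is easy to get wrong, so the check can be reduced to a filter over the resolver output. The sketch below is illustrative only: `has_ad_flag` is a hypothetical helper that reads `drill` (or `dig`) output on stdin and succeeds when the header's flags list contains `ad`.

```shell
# Succeed if the DNS response header on stdin carries the "ad" flag.
# Handles both "flags: qr rd ra ad ;" (drill) and "flags: qr rd ra ad;" (dig).
has_ad_flag() {
  grep -Eq '^;; flags:[^;]* ad[ ;]'
}

# Example (not run automatically), inside a pod that has drill installed:
#   drill -D cloudflare.com | has_ad_flag && echo "DNSSEC validated"
```

The domain in the example is only an assumption about a zone that is DNSSEC-signed; any signed zone works.
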
#### Step 4: Create TLS Certificate Secret

Mailu Front pod requires a TLS certificate:

```bash
# Generate self-signed certificate for internal use
# (Let's Encrypt handles external TLS via Ingress)
TEMP_DIR=$(mktemp -d)
cd "$TEMP_DIR"

openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout tls.key -out tls.crt \
  -subj "/CN=mail.bakewise.ai/O=bakewise"

kubectl create secret tls mailu-certificates \
  --cert=tls.crt \
  --key=tls.key \
  -n bakery-ia

rm -rf "$TEMP_DIR"

# Verify secret created
kubectl get secret mailu-certificates -n bakery-ia
```

#### Step 5: Create Admin Credentials Secret

```bash
# Generate a secure password (or use your own)
ADMIN_PASSWORD=$(openssl rand -base64 16 | tr -d '/+=' | head -c 16)
echo "Admin password: $ADMIN_PASSWORD"
echo "SAVE THIS PASSWORD SECURELY!"

# Create the admin credentials secret
kubectl create secret generic mailu-admin-credentials \
  --from-literal=password="$ADMIN_PASSWORD" \
  -n bakery-ia
```

#### Step 6: Deploy Mailu via Helm

```bash
# Add Mailu Helm repository
helm repo add mailu https://mailu.github.io/helm-charts
helm repo update mailu

# Deploy Mailu with production values
# Admin user is created automatically via initialAccount feature
helm upgrade --install mailu mailu/mailu \
  -n bakery-ia \
  --create-namespace \
  -f infrastructure/platform/mail/mailu-helm/values.yaml \
  -f infrastructure/platform/mail/mailu-helm/prod/values.yaml \
  --timeout 10m

# Wait for pods to be ready (may take 5-10 minutes for ClamAV)
kubectl get pods -n bakery-ia -l app.kubernetes.io/instance=mailu -w

# Admin user (admin@bakewise.ai) is created automatically!
# Password is the one you set in Step 5
```

#### Step 7: Configure DKIM

After Mailu is running, get the DKIM key and add it to DNS:

```bash
# Get DKIM public key
kubectl exec -n bakery-ia deployment/mailu-admin -- \
  cat /dkim/bakewise.ai.dkim.pub

# Add this as a TXT record in your DNS:
# Name: dkim._domainkey
# Value: (the key from above)
```

#### Step 8: Verify Email Setup

```bash
# Check all Mailu pods are running
kubectl get pods -n bakery-ia | grep mailu
# Expected: All 10 pods in Running state

# Test SMTP connectivity
kubectl run -it --rm smtp-test --image=alpine --restart=Never -- \
  sh -c "apk add swaks && swaks --to test@example.com --from admin@bakewise.ai --server mailu-front.bakery-ia.svc.cluster.local:25"

# Access webmail (via port-forward for testing)
kubectl port-forward -n bakery-ia svc/mailu-front 8080:80
# Open: http://localhost:8080/webmail
```

#### Production Email Endpoints

| Service | URL/Address |
|---------|-------------|
| Admin Panel | https://mail.bakewise.ai/admin |
| Webmail | https://mail.bakewise.ai/webmail |
| SMTP (STARTTLS) | mail.bakewise.ai:587 |
| SMTP (SSL) | mail.bakewise.ai:465 |
| IMAP (SSL) | mail.bakewise.ai:993 |

#### Troubleshooting Mailu

**Issue: Admin pod CrashLoopBackOff with "DNSSEC validation" error**
```bash
# Verify CoreDNS is forwarding to Unbound
kubectl get configmap coredns -n kube-system -o yaml | grep forward
# Should show: forward . <unbound-ip>

# If not, re-run Step 3 above
```

**Issue: Front pod stuck in ContainerCreating**
```bash
# Check for missing certificate secret
kubectl describe pod -n bakery-ia -l app.kubernetes.io/component=front | grep -A5 Events

# If missing mailu-certificates, re-run Step 4 above
```

**Issue: Admin pod can't connect to Redis**
```bash
# Verify externalRedis is disabled in values
helm get values mailu -n bakery-ia | grep -A5 externalRedis
# Should show: enabled: false

# If enabled: true, upgrade with correct values
helm upgrade mailu mailu/mailu -n bakery-ia \
  -f infrastructure/platform/mail/mailu-helm/values.yaml \
  -f infrastructure/platform/mail/mailu-helm/prod/values.yaml
```

---

### WhatsApp Business API Setup

**Features:**
- ✅ First 1,000 conversations/month FREE
- ✅ Perfect for 10 tenants (~500 messages/month)

**Setup Steps:**

1. **Create Meta Business Account:**
```
1. Go to business.facebook.com
2. Create Business Account
3. Complete business verification
```

2. **Add WhatsApp Product:**
```
1. Go to developers.facebook.com
2. Create New App → Business
3. Add WhatsApp product
4. Complete setup wizard
```

3. **Configure Phone Number:**
```
1. Test with your personal number initially
2. Later: Get dedicated business number
3. Verify phone number with SMS code
```

4. **Create Message Templates:**
```
1. Go to WhatsApp Manager
2. Create templates for:
   - Low inventory alert
   - Expired product alert
   - Forecast summary
   - Order notification
3. Submit for approval (15 min - 24 hours)
```

5. **Get API Credentials:**
```
Save these values:
- Phone Number ID: (from WhatsApp Manager)
- Access Token: (from App Dashboard)
- Business Account ID: (from WhatsApp Manager)
- Webhook Verify Token: (create your own secure string)
```

---

## Kubernetes Deployment

### Step 1: Prepare Container Images

#### Option A: Using Docker Hub (Recommended)

```bash
# On your local machine
docker login

# Build all images
docker-compose build

# Tag images for Docker Hub
# Replace YOUR_USERNAME with your Docker Hub username
export DOCKER_USERNAME="YOUR_USERNAME"

./scripts/tag-images.sh $DOCKER_USERNAME

# Push to Docker Hub
./scripts/push-images.sh $DOCKER_USERNAME

# Update prod kustomization with your username
# Edit: infrastructure/kubernetes/overlays/prod/kustomization.yaml
# Replace all "bakery/" with "$DOCKER_USERNAME/"
```

#### Option B: Using MicroK8s Registry

```bash
# On VPS
microk8s enable registry

# Get registry address (usually localhost:32000)
kubectl get service -n container-registry

# On local machine, configure insecure registry
# Edit /etc/docker/daemon.json so that it contains:
#   {
#     "insecure-registries": ["YOUR_VPS_IP:32000"]
#   }

# Restart Docker
sudo systemctl restart docker

# Tag and push images
docker tag bakery/auth-service YOUR_VPS_IP:32000/bakery/auth-service
docker push YOUR_VPS_IP:32000/bakery/auth-service
# Repeat for all services...
```

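The "repeat for all services" step in Option B is mechanical, so it can be wrapped in a loop. This is a sketch under assumptions: `push_images` is a hypothetical helper, and the service list passed to it must be replaced with the actual image names built by `docker-compose build` (only `auth-service` is confirmed by this guide).

```shell
# Tag and push one locally built image per service to the given registry.
# Usage: push_images <registry> <service>...
push_images() {
  registry="$1"; shift
  for svc in "$@"; do
    docker tag "bakery/$svc" "$registry/bakery/$svc" || return 1
    docker push "$registry/bakery/$svc" || return 1
  done
}

# Example (not run automatically; service names other than auth-service are placeholders):
#   push_images YOUR_VPS_IP:32000 auth-service tenant-service
```

The function stops at the first failing tag or push, so a partial upload is visible immediately rather than at deploy time.
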
### Step 2: Update Production Configuration

**⚠️ CRITICAL:** The default configuration uses **bakewise.ai** domain. You MUST update this before deployment if using a different domain.

#### Required Configuration Updates

**Step 2.1: Remove imagePullSecrets**

```bash
# On your local machine
cd bakery-ia

# Remove imagePullSecrets from all deployment files
find infrastructure/kubernetes/base -name "*.yaml" -type f -exec sed -i.bak '/imagePullSecrets:/,+1d' {} \;

# Verify removal
grep -r "imagePullSecrets" infrastructure/kubernetes/base/
# Should return NO results
```

**Step 2.2: Update Image Tags (Use Semantic Versions)**

```bash
# Edit kustomization.yaml to replace 'latest' with actual version
nano infrastructure/kubernetes/overlays/prod/kustomization.yaml

# Find the images section (lines 163-196) and update:
# BEFORE:
# - name: bakery/auth-service
#   newTag: latest
# AFTER:
# - name: bakery/auth-service
#   newTag: v1.0.0

# Do this for ALL 22 services, or use this helper:
export VERSION="1.0.0" # Your version

# Create a script to update all image tags
cat > /tmp/update-tags.sh <<'EOF'
#!/bin/bash
VERSION="${1:-1.0.0}"
sed -i "s/newTag: latest/newTag: v${VERSION}/g" infrastructure/kubernetes/overlays/prod/kustomization.yaml
EOF

chmod +x /tmp/update-tags.sh
/tmp/update-tags.sh ${VERSION}

# Verify no 'latest' tags remain
grep "newTag:" infrastructure/kubernetes/overlays/prod/kustomization.yaml | grep -c "latest"
# Should return: 0
```

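The helper above relies on `sed -i` without a suffix, which is GNU-specific and fails on macOS. As an alternative sketch, the same substitution can be done through a temporary file so it behaves the same on both; `pin_image_tags` is a hypothetical function name, not a script shipped in the repository.

```shell
# Replace every "newTag: latest" in a kustomization file with a pinned version,
# then fail if any "latest" tag survived.
# Usage: pin_image_tags <kustomization.yaml> <version>
pin_image_tags() {
  file="$1"
  version="$2"
  tmp="$(mktemp)"
  sed "s/newTag: latest/newTag: v${version}/g" "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
  cat "$tmp" > "$file"   # overwrite in place, keeping the file's permissions
  rm -f "$tmp"
  if grep -q 'newTag: latest' "$file"; then
    echo "latest tags remain in $file" >&2
    return 1
  fi
}

# Example (not run automatically):
#   pin_image_tags infrastructure/kubernetes/overlays/prod/kustomization.yaml 1.0.0
```
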
**Step 2.3: Fix SigNoz Namespace References**

```bash
# Update SigNoz patches to use bakery-ia namespace instead of signoz
sed -i 's/namespace: signoz/namespace: bakery-ia/g' infrastructure/kubernetes/overlays/prod/kustomization.yaml

# Verify changes (should show bakery-ia in all 3 patches)
grep -A 3 "name: signoz" infrastructure/kubernetes/overlays/prod/kustomization.yaml
```

**Step 2.4: Update Cert-Manager Email**

```bash
# Update Let's Encrypt notification email to your production email
sed -i "s/admin@bakery-ia.local/admin@bakewise.ai/g" \
  infrastructure/kubernetes/base/components/cert-manager/cluster-issuer-production.yaml
```

**Step 2.5: Verify Production Secrets (Already Configured) ✅**

```bash
# Production secrets have been pre-configured with strong cryptographic passwords
# No manual action required - secrets are already set in secrets.yaml

# Verify the secrets are configured (optional)
echo "Verifying production secrets configuration..."
grep "JWT_SECRET_KEY" infrastructure/kubernetes/base/secrets.yaml | head -1
grep "AUTH_DB_PASSWORD" infrastructure/kubernetes/base/secrets.yaml | head -1
grep "REDIS_PASSWORD" infrastructure/kubernetes/base/secrets.yaml | head -1

echo "✅ All production secrets are configured and ready for deployment"
```

**Production URLs:**
- **Main Application:** https://bakewise.ai
- **API Endpoints:** https://bakewise.ai/api/v1/...
- **SigNoz (Monitoring):** https://monitoring.bakewise.ai/signoz
- **AlertManager:** https://monitoring.bakewise.ai/alertmanager

---

## Configuration & Secrets

### Production Secrets Status ✅

**All core secrets have been pre-configured with strong cryptographic passwords:**
- ✅ **Database passwords** (19 databases) - 24-character random strings
- ✅ **JWT secrets** - 256-bit cryptographically secure tokens
- ✅ **Service API key** - 64-character hexadecimal string
- ✅ **Redis password** - 24-character random string
- ✅ **RabbitMQ password** - 24-character random string
- ✅ **RabbitMQ Erlang cookie** - 64-character hexadecimal string

### Step 1: Configure External Service Credentials (Email & WhatsApp)

You still need to update these external service credentials:

```bash
# Edit the secrets file
nano infrastructure/kubernetes/base/secrets.yaml

# Update ONLY these external service credentials:

# SMTP settings (from email setup):
SMTP_USER: <base64-encoded-username> # your email
SMTP_PASSWORD: <base64-encoded-password> # app password

# WhatsApp credentials (from WhatsApp setup - optional):
WHATSAPP_API_KEY: <base64-encoded-key>

# Payment processing (from Stripe setup):
STRIPE_SECRET_KEY: <base64-encoded-key>
STRIPE_WEBHOOK_SECRET: <base64-encoded-secret>
```

**To base64 encode:**
```bash
echo -n "your-value-here" | base64
```

**CRITICAL:** Never commit real secrets to git! The secrets.yaml file should be in `.gitignore`.

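Two details of the base64 step commonly cause broken secrets: a trailing newline sneaking into the value, and GNU `base64` wrapping long output at 76 columns. The sketch below, with the hypothetical helpers `encode_secret` and `decode_secret`, avoids both and lets you round-trip a value before pasting it into `secrets.yaml`.

```shell
# Encode a value as single-line base64 with no trailing newline in the input.
encode_secret() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Decode a base64 value, for checking what a secret actually contains.
decode_secret() {
  printf '%s' "$1" | base64 --decode
}

# Example (not run automatically):
#   encode_secret 'your-value-here'
#   decode_secret "$(encode_secret 'your-value-here')"   # prints the original value
```
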
### Step 2: CI/CD Secrets Configuration

**For production CI/CD setup, additional secrets are required:**

```bash
# Create Docker Hub credentials secret (for image pulls)
kubectl create secret docker-registry dockerhub-creds \
  --docker-server=docker.io \
  --docker-username=YOUR_DOCKERHUB_USERNAME \
  --docker-password=YOUR_DOCKERHUB_TOKEN \
  --docker-email=your-email@example.com \
  -n bakery-ia

# Create Gitea registry credentials (if using Gitea for CI/CD)
kubectl create secret docker-registry gitea-registry-credentials \
  -n tekton-pipelines \
  --docker-server=gitea.bakery-ia.local:5000 \
  --docker-username=your-username \
  --docker-password=your-password

# Create Git credentials for Flux (if using GitOps)
kubectl create secret generic gitea-credentials \
  -n flux-system \
  --from-literal=username=your-username \
  --from-literal=password=your-password
```

### Step 3: Apply Application Secrets

```bash
# Copy the infrastructure directory to the VPS (from local machine)
# This creates ~/infrastructure on the VPS, matching the paths used below
scp -r infrastructure root@YOUR_VPS_IP:~/

# SSH to VPS
ssh root@YOUR_VPS_IP

# Apply application secrets
kubectl apply -f ~/infrastructure/kubernetes/base/secrets.yaml -n bakery-ia

# Verify secrets created
kubectl get secrets -n bakery-ia
# Should show multiple secrets including postgres-tls, redis-tls, app-secrets, etc.
```

---

## Database Migrations

### Step 0: Deploy CI/CD Infrastructure (Optional but Recommended)

**For production environments, deploy CI/CD infrastructure components:**

```bash
# Deploy Tekton Pipelines for CI/CD (optional but recommended for production)
kubectl create namespace tekton-pipelines

# Install Tekton Pipelines
kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml

# Install Tekton Triggers
kubectl apply -f https://storage.googleapis.com/tekton-releases/triggers/latest/release.yaml

# Apply Tekton configurations
kubectl apply -f ~/infrastructure/cicd/tekton/tasks/
kubectl apply -f ~/infrastructure/cicd/tekton/pipelines/
kubectl apply -f ~/infrastructure/cicd/tekton/triggers/

# Verify Tekton deployment
kubectl get pods -n tekton-pipelines
```

### Step 1: Deploy SigNoz Monitoring (BEFORE Application)

**⚠️ CRITICAL:** SigNoz must be deployed BEFORE the application into the **bakery-ia namespace** because the production kustomization patches SigNoz resources.

```bash
# On VPS
# 1. Ensure bakery-ia namespace exists
kubectl get namespace bakery-ia || kubectl create namespace bakery-ia

# 2. Add Helm repo
helm repo add signoz https://charts.signoz.io
helm repo update

# 3. Install SigNoz into bakery-ia namespace (NOT separate signoz namespace)
helm install signoz signoz/signoz \
  -n bakery-ia \
  --set frontend.service.type=ClusterIP \
  --set clickhouse.persistence.size=20Gi \
  --set clickhouse.persistence.storageClass=microk8s-hostpath

# 4. Wait for SigNoz to be ready (this may take 10-15 minutes)
kubectl wait --for=condition=ready pod \
  -l app.kubernetes.io/instance=signoz \
  -n bakery-ia \
  --timeout=900s

# 5. Verify SigNoz components running in bakery-ia namespace
kubectl get pods -n bakery-ia -l app.kubernetes.io/instance=signoz
# Should show: signoz-0, signoz-otel-collector, signoz-clickhouse, signoz-zookeeper, signoz-alertmanager

# 6. Verify StatefulSets exist (kustomization will patch these)
kubectl get statefulset -n bakery-ia | grep signoz
# Should show: signoz, signoz-clickhouse
```

**⚠️ Important:** Do NOT create a separate `signoz` namespace. SigNoz must be in `bakery-ia` namespace for the overlays to work correctly.

### Step 2: Deploy Application and Databases

```bash
# On VPS
kubectl apply -k ~/infrastructure/kubernetes/overlays/prod

# Wait for databases to be ready (5-10 minutes)
kubectl wait --for=condition=ready pod \
  -l app.kubernetes.io/component=database \
  -n bakery-ia \
  --timeout=600s

# Check status
kubectl get pods -n bakery-ia -l app.kubernetes.io/component=database
```

### Step 3: Run Migrations

Migrations are automatically handled by init containers in each service. Verify they completed:

```bash
# Check migration job status
kubectl get jobs -n bakery-ia | grep migration

# All should show "COMPLETIONS = 1/1"

# Check logs if any failed
kubectl logs -n bakery-ia job/auth-migration
```

### Step 4: Verify Database Schemas

```bash
# Connect to a database to verify
kubectl exec -n bakery-ia deployment/auth-db -it -- psql -U auth_user -d auth_db

# Inside psql:
\dt        # List tables
\d users   # Describe users table
\q         # Quit
```

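The migration job check above can be turned into a command that prints only the jobs that still need attention. This is a sketch under assumptions: `incomplete_migrations` is a hypothetical helper, it assumes each migration Job has a single completion (matching the "1/1" expectation above), and it assumes the job names contain `migration`.

```shell
# List migration Jobs in a namespace that have not yet succeeded.
# Usage: incomplete_migrations [namespace]   (defaults to bakery-ia)
incomplete_migrations() {
  kubectl get jobs -n "${1:-bakery-ia}" --no-headers \
    -o custom-columns=NAME:.metadata.name,SUCCEEDED:.status.succeeded \
    | awk '$1 ~ /migration/ && $2 != "1" { print $1 }'
}

# Example (not run automatically):
#   for job in $(incomplete_migrations); do
#     kubectl logs -n bakery-ia "job/$job" --tail=50
#   done
```

An empty result means every migration job reported one successful completion.
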
---

## CI/CD Infrastructure Deployment

This section covers deploying the complete CI/CD stack: Gitea (Git server + container registry), Tekton (CI pipelines), and Flux CD (GitOps deployments).

### Overview

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                             CI/CD ARCHITECTURE                              │
│                                                                             │
│  Developer Push                                                             │
│       │                                                                     │
│       ▼                                                                     │
│  ┌─────────┐    Webhook      ┌─────────────┐    Build/Test    ┌──────────┐  │
│  │  Gitea  │ ──────────────► │   Tekton    │ ───────────────► │  Images  │  │
│  │  (Git)  │                 │ (Pipelines) │                  │(Registry)│  │
│  └─────────┘                 └─────────────┘                  └──────────┘  │
│       │                             │                              │        │
│       │                             │ Update manifests             │        │
│       │                             ▼                              │        │
│       │                      ┌─────────────┐                       │        │
│       └─────────────────────►│   Flux CD   │◄──────────────────────┘        │
│          Monitor changes     │  (GitOps)   │     Pull images                │
│                              └─────────────┘                                │
│                                     │                                       │
│                                     ▼                                       │
│                              ┌─────────────┐                                │
│                              │ Kubernetes  │                                │
│                              │   Cluster   │                                │
│                              └─────────────┘                                │
└─────────────────────────────────────────────────────────────────────────────┘
```

### Prerequisites

Before deploying CI/CD infrastructure:
- [ ] Kubernetes cluster is running
- [ ] Ingress controller is configured
- [ ] TLS certificates are available
- [ ] DNS records configured for `gitea.bakewise.ai`

### Step 1: Deploy Gitea (Git Server + Container Registry)

Gitea provides a self-hosted Git server with built-in container registry support. The setup is fully automated - admin user and initial repository are created automatically.

#### 1.1 Create Secrets and Init Job (One Command)

The setup script creates all necessary secrets and applies the initialization job.

**For Production Deployment:**

```bash
# Generate a secure password (minimum 16 characters required for production)
export GITEA_ADMIN_PASSWORD=$(openssl rand -base64 32)
echo "Gitea Admin Password: $GITEA_ADMIN_PASSWORD"
echo "⚠️ Save this password securely - you'll need it for Tekton setup!"

# Run the setup script with --production flag
# This enforces password requirements and uses production registry URL
./infrastructure/cicd/gitea/setup-admin-secret.sh --production
```

**What the `--production` flag does:**
- Requires `GITEA_ADMIN_PASSWORD` environment variable (won't use defaults)
- Validates password is at least 16 characters
- Uses production registry URL (`registry.bakewise.ai`)
- Hides password in output for security
- Shows production-specific next steps

This creates:
- `gitea-admin-secret` in `gitea` namespace - admin credentials for Gitea
- `gitea-registry-secret` in `bakery-ia` namespace - for imagePullSecrets
- `gitea-init-job` - Kubernetes Job that creates the `bakery-ia` repository automatically

> **For dev environments only:** Run without flags to use the default static password:
> ```bash
> ./infrastructure/cicd/gitea/setup-admin-secret.sh
> ```

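The password requirements described for `--production` can be checked up front, before the setup script is run. The following is only an illustration of that rule, not the script's actual implementation; `require_strong_password` is a hypothetical helper.

```shell
# Fail unless GITEA_ADMIN_PASSWORD is set and at least 16 characters long.
require_strong_password() {
  if [ -z "${GITEA_ADMIN_PASSWORD:-}" ]; then
    echo "GITEA_ADMIN_PASSWORD is not set" >&2
    return 1
  fi
  if [ "${#GITEA_ADMIN_PASSWORD}" -lt 16 ]; then
    echo "GITEA_ADMIN_PASSWORD must be at least 16 characters" >&2
    return 1
  fi
}

# Example (not run automatically):
#   require_strong_password && ./infrastructure/cicd/gitea/setup-admin-secret.sh --production
```
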
#### 1.2 Install Gitea via Helm

```bash
# Add Gitea Helm repository
helm repo add gitea https://dl.gitea.io/charts
helm repo update gitea

# Install Gitea with PRODUCTION values (includes TLS, proper domains, resources)
helm upgrade --install gitea gitea/gitea \
  -n gitea \
  -f infrastructure/cicd/gitea/values.yaml \
  -f infrastructure/cicd/gitea/values-prod.yaml \
  --timeout 10m \
  --wait

# Wait for Gitea to be ready
kubectl wait --for=condition=ready pod -n gitea -l app.kubernetes.io/name=gitea --timeout=300s

# Verify Gitea is running
kubectl get pods -n gitea
kubectl get svc -n gitea
```

**Production values (`values-prod.yaml`) include:**
- Domain: `gitea.bakewise.ai` and `registry.bakewise.ai`
- TLS via cert-manager with Let's Encrypt production issuer
- 50Gi storage (vs 10Gi in dev)
- Increased resource limits

#### 1.3 Verify Repository Initialization

The init job automatically creates the `bakery-ia` repository once Gitea is ready:

```bash
# Check init job completed successfully
kubectl logs -n gitea job/gitea-init-repo

# Expected output:
# === Gitea Repository Initialization ===
# Gitea is ready!
# Repository 'bakery-ia' created successfully!
```

If the job needs to be re-run:
```bash
kubectl delete job gitea-init-repo -n gitea
kubectl apply -f infrastructure/cicd/gitea/gitea-init-job.yaml
```

#### 1.4 Configure DNS for Gitea

Add DNS record pointing to your VPS:
```
Type   Name    Value         TTL
A      gitea   YOUR_VPS_IP   Auto
```

#### 1.5 Verify Gitea Access

```bash
# Check ingress is configured
kubectl get ingress -n gitea

# Test access (after DNS propagation)
curl -I https://gitea.bakewise.ai

# Access web interface
# URL: https://gitea.bakewise.ai
# Username: bakery-admin
# Password: (from step 1.1)

# Verify repository was created via API
# (quote the credentials so special characters in the password are preserved)
curl -u "bakery-admin:$GITEA_ADMIN_PASSWORD" \
  https://gitea.bakewise.ai/api/v1/repos/bakery-admin/bakery-ia
```

#### 1.6 Push Code to Repository

The `bakery-ia` repository is already created with a README. Push your code:

```bash
# Add Gitea as remote and push code
cd /path/to/bakery-ia
git remote add gitea https://gitea.bakewise.ai/bakery-admin/bakery-ia.git
git push gitea main
```

### Step 2: Deploy Tekton Pipelines

Tekton provides cloud-native CI/CD pipelines.

#### 2.1 Install Tekton Core Components

```bash
# Create Tekton namespace
kubectl create namespace tekton-pipelines

# Install Tekton Pipelines
kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml

# Wait for Tekton Pipelines to be ready
kubectl wait --for=condition=available --timeout=300s \
  deployment/tekton-pipelines-controller -n tekton-pipelines
kubectl wait --for=condition=available --timeout=300s \
  deployment/tekton-pipelines-webhook -n tekton-pipelines

# Install Tekton Triggers (for webhook-based automation)
kubectl apply -f https://storage.googleapis.com/tekton-releases/triggers/latest/release.yaml
kubectl apply -f https://storage.googleapis.com/tekton-releases/triggers/latest/interceptors.yaml

# Wait for Tekton Triggers to be ready
kubectl wait --for=condition=available --timeout=300s \
  deployment/tekton-triggers-controller -n tekton-pipelines
kubectl wait --for=condition=available --timeout=300s \
  deployment/tekton-triggers-webhook -n tekton-pipelines

# Verify installation
kubectl get pods -n tekton-pipelines
```

#### 2.2 Deploy Tekton CI/CD Configuration via Helm

**For Production Deployment:**

```bash
# Generate secure webhook token (save this for Gitea webhook configuration)
export TEKTON_WEBHOOK_TOKEN=$(openssl rand -hex 32)
echo "Webhook Token: $TEKTON_WEBHOOK_TOKEN"
echo "⚠️ Save this token - you'll need it for Gitea webhook setup!"

# Ensure GITEA_ADMIN_PASSWORD is still set from Step 1 (aborts with a message if not)
: "${GITEA_ADMIN_PASSWORD:?GITEA_ADMIN_PASSWORD is not set - export it again from Step 1}"

# Install Tekton CI/CD with PRODUCTION values
helm upgrade --install tekton-cicd infrastructure/cicd/tekton-helm \
  -n tekton-pipelines \
  -f infrastructure/cicd/tekton-helm/values.yaml \
  -f infrastructure/cicd/tekton-helm/values-prod.yaml \
  --set secrets.webhook.token="$TEKTON_WEBHOOK_TOKEN" \
  --set secrets.registry.password="$GITEA_ADMIN_PASSWORD" \
  --set secrets.git.password="$GITEA_ADMIN_PASSWORD" \
  --timeout 10m \
  --wait

# Verify resources created
kubectl get pipelines -n tekton-pipelines
kubectl get tasks -n tekton-pipelines
kubectl get eventlisteners -n tekton-pipelines
kubectl get triggerbindings -n tekton-pipelines
kubectl get triggertemplates -n tekton-pipelines
```

**What the production values (`values-prod.yaml`) provide:**
- Empty default secrets (must be provided via `--set` flags)
- Increased controller/webhook replicas (2 each)
- Higher resource limits for production workloads
- 10Gi workspace storage (vs 5Gi in dev)

> **⚠️ Security Note:** Never commit actual secrets to values files. Always pass them via `--set` flags or use external secret management.

#### 2.3 Configure Gitea Webhook
|
||
|
||
1. Go to Gitea repository settings → Webhooks
|
||
2. Add webhook:
|
||
- **Target URL:** `http://el-bakery-ia-listener.tekton-pipelines.svc.cluster.local:8080`
|
||
- **HTTP Method:** POST
|
||
- **Content Type:** application/json
|
||
- **Secret:** (same as `secrets.webhook.token` from Helm)
|
||
- **Trigger on:** Push events
|
||
3. Save webhook
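
When a delivery is rejected, it can help to reproduce the signature by hand. Gitea signs each payload with HMAC-SHA256 using the webhook secret and sends the hex digest in the `X-Gitea-Signature` header. The sketch below computes that digest with `openssl`; whether the EventListener's interceptor validates this particular header is an assumption about the chart, and `sign_payload` is a hypothetical helper name.

```bash
# Sketch: compute the HMAC-SHA256 hex digest Gitea sends in X-Gitea-Signature.
# $1 = webhook secret, $2 = raw request body (must match byte for byte)
sign_payload() {
  printf '%s' "$2" | openssl dgst -sha256 -hmac "$1" | awk '{print $NF}'
}

# Usage (not run here; execute from a pod inside the cluster):
#   BODY='{"ref":"refs/heads/main"}'
#   curl -X POST -H "Content-Type: application/json" \
#     -H "X-Gitea-Event: push" \
#     -H "X-Gitea-Signature: $(sign_payload "$TEKTON_WEBHOOK_TOKEN" "$BODY")" \
#     -d "$BODY" http://el-bakery-ia-listener.tekton-pipelines.svc.cluster.local:8080
```

If the digest computed here differs from the one shown under Recent Deliveries in Gitea, the secret stored in Gitea and the Helm value have probably diverged.
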
|
||
|
||
#### 2.4 Test Pipeline Manually
|
||
|
||
```bash
# Create a manual PipelineRun to test the CI pipeline
cat <<EOF | kubectl create -f -
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  generateName: manual-ci-run-
  namespace: tekton-pipelines
spec:
  pipelineRef:
    name: bakery-ia-ci
  workspaces:
    - name: shared-workspace
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 5Gi
    - name: docker-credentials
      secret:
        secretName: gitea-registry-credentials
  params:
    - name: git-url
      value: "http://gitea-http.gitea.svc.cluster.local:3000/bakery-admin/bakery-ia.git"
    - name: git-revision
      value: "main"
EOF

# Watch pipeline progress
kubectl get pipelineruns -n tekton-pipelines -w

# View logs of the most recent run
tkn pipelinerun logs --last -f -n tekton-pipelines
```
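
For scripted checks, the watch command can be replaced by a blocking wait. The sketch below assumes the PipelineRun manifest has been saved to a file and uses `generateName`; the helper names are hypothetical. Tekton reports completion through the `Succeeded` condition, so `kubectl wait` returns once that condition becomes true and times out if the run fails.

```bash
# Sketch: create a PipelineRun from a manifest and block until it succeeds.

# Strip the "kind.group/" prefix printed by `kubectl create -o name`.
run_name_from_ref() {
  printf '%s\n' "${1##*/}"
}

# $1 = path to a PipelineRun manifest that uses generateName
wait_for_pipelinerun() (
  ref=$(kubectl create -n tekton-pipelines -f "$1" -o name) || exit 1
  name=$(run_name_from_ref "$ref")
  echo "Waiting for $name"
  # Returns 0 when Succeeded=True; a failed run is reported as a timeout.
  kubectl wait -n tekton-pipelines --for=condition=Succeeded \
    "pipelinerun/$name" --timeout=30m
)

# Usage (not run here):
#   wait_for_pipelinerun /tmp/manual-ci-run.yaml || tkn pipelinerun logs --last -n tekton-pipelines
```
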
|
||
|
||
### Step 3: Deploy Flux CD (GitOps)
|
||
|
||
Flux CD provides GitOps-based continuous deployment.
|
||
|
||
#### 3.1 Install Flux CLI
|
||
|
||
```bash
|
||
# Install Flux CLI (if not already installed)
|
||
curl -s https://fluxcd.io/install.sh | sudo bash
|
||
|
||
# Verify installation
|
||
flux --version
|
||
```
|
||
|
||
#### 3.2 Install Flux Components
|
||
|
||
```bash
|
||
# Install Flux CRDs and controllers
|
||
flux install --namespace=flux-system --network-policy=false
|
||
|
||
# Verify Flux is running
|
||
kubectl get pods -n flux-system
|
||
flux check
|
||
```
|
||
|
||
#### 3.3 Deploy Flux Configuration via Helm
|
||
|
||
```bash
|
||
# Create Flux namespace if not exists
|
||
kubectl create namespace flux-system --dry-run=client -o yaml | kubectl apply -f -
|
||
|
||
# Create Git credentials secret for Flux
|
||
kubectl create secret generic gitea-credentials \
|
||
-n flux-system \
|
||
--from-literal=username=bakery-admin \
|
||
--from-literal=password="$GITEA_ADMIN_PASSWORD"
|
||
|
||
# Install Flux configuration
|
||
helm upgrade --install flux-cd infrastructure/cicd/flux \
|
||
-n flux-system \
|
||
--timeout 10m \
|
||
--wait
|
||
|
||
# Verify Flux resources
|
||
kubectl get gitrepositories -n flux-system
|
||
kubectl get kustomizations -n flux-system
|
||
```
|
||
|
||
#### 3.4 Verify GitOps Sync
|
||
|
||
```bash
|
||
# Check GitRepository status
|
||
flux get sources git -n flux-system
|
||
|
||
# Check Kustomization status
|
||
flux get kustomizations -n flux-system
|
||
|
||
# Force reconciliation
|
||
flux reconcile source git bakery-ia -n flux-system
|
||
flux reconcile kustomization bakery-ia-prod -n flux-system
|
||
```
|
||
|
||
### Step 4: Complete CI/CD Workflow Test
|
||
|
||
Test the entire CI/CD pipeline end-to-end:
|
||
|
||
```bash
|
||
# 1. Make a small change in your local repo
|
||
echo "# CI/CD Test $(date)" >> README.md
|
||
git add README.md
|
||
git commit -m "Test CI/CD pipeline"
|
||
|
||
# 2. Push to Gitea
|
||
git push gitea main
|
||
|
||
# 3. Watch Tekton pipeline triggered by webhook
|
||
kubectl get pipelineruns -n tekton-pipelines -w
|
||
|
||
# 4. After pipeline completes, watch Flux sync
|
||
flux get kustomizations -n flux-system -w
|
||
|
||
# 5. Verify deployment updated
|
||
kubectl get deployments -n bakery-ia -o wide
|
||
```
|
||
|
||
### CI/CD Troubleshooting
|
||
|
||
#### Tekton Pipeline Fails
|
||
|
||
```bash
|
||
# View pipeline run status
|
||
kubectl get pipelineruns -n tekton-pipelines
|
||
|
||
# Get detailed logs
|
||
tkn pipelinerun describe <pipelinerun-name> -n tekton-pipelines
|
||
tkn pipelinerun logs <pipelinerun-name> -n tekton-pipelines
|
||
|
||
# Check EventListener logs (for webhook issues)
|
||
kubectl logs -n tekton-pipelines -l eventlistener=bakery-ia-listener
|
||
```
|
||
|
||
#### Flux Not Syncing
|
||
|
||
```bash
|
||
# Check GitRepository status
|
||
kubectl describe gitrepository bakery-ia -n flux-system
|
||
|
||
# Check Kustomization status
|
||
kubectl describe kustomization bakery-ia-prod -n flux-system
|
||
|
||
# View Flux controller logs
|
||
kubectl logs -n flux-system deployment/source-controller
|
||
kubectl logs -n flux-system deployment/kustomize-controller
|
||
|
||
# Force reconciliation
|
||
flux reconcile kustomization bakery-ia-prod -n flux-system --with-source
|
||
```
|
||
|
||
#### Gitea Webhook Not Triggering
|
||
|
||
```bash
|
||
# Check webhook delivery in Gitea UI
|
||
# Settings → Webhooks → Recent Deliveries
|
||
|
||
# Verify EventListener is running
|
||
kubectl get eventlisteners -n tekton-pipelines
|
||
kubectl get svc -n tekton-pipelines | grep listener
|
||
|
||
# Check EventListener logs
|
||
kubectl logs -n tekton-pipelines -l eventlistener=bakery-ia-listener
|
||
```
|
||
|
||
### CI/CD URLs Summary
|
||
|
||
| Service | URL | Purpose |
|
||
|---------|-----|---------|
|
||
| Gitea | https://gitea.bakewise.ai | Git repository & container registry |
|
||
| Gitea Registry | https://gitea.bakewise.ai/v2/ | Docker registry API |
|
||
| Tekton Dashboard | (install separately if needed) | Pipeline visualization |
|
||
| Flux | CLI only | GitOps status via `flux` commands |
|
||
|
||
### CI/CD Security Considerations
|
||
|
||
The CI/CD infrastructure has been configured with production security in mind:
|
||
|
||
#### Secrets Management
|
||
|
||
| Secret | Purpose | How to Generate |
|
||
|--------|---------|-----------------|
|
||
| `GITEA_ADMIN_PASSWORD` | Gitea admin & registry auth | `openssl rand -base64 32` |
|
||
| `TEKTON_WEBHOOK_TOKEN` | Webhook signature validation | `openssl rand -hex 32` |
|
||
|
||
#### Security Features
|
||
|
||
1. **Production Mode Enforcement**
|
||
- The `--production` flag on `setup-admin-secret.sh` enforces:
|
||
- Mandatory `GITEA_ADMIN_PASSWORD` environment variable
|
||
- Minimum 16-character password requirement
|
||
- Password hidden from terminal output
|
||
|
||
2. **Internal Cluster Communication**
|
||
- All CI/CD components communicate via internal cluster DNS
|
||
- GitOps updates use `gitea-http.gitea.svc.cluster.local:3000`
|
||
- No hardcoded external URLs in pipeline tasks
|
||
|
||
3. **Credential Isolation**
|
||
- Secrets are passed via `--set` flags, never committed to git
|
||
- Registry credentials are scoped per-namespace
|
||
- Webhook tokens are unique per installation
|
||
|
||
#### Post-Deployment Security Checklist
|
||
|
||
```bash
|
||
# Verify no default passwords in use
|
||
kubectl get secret gitea-admin-secret -n gitea -o jsonpath='{.data.password}' | base64 -d | wc -c
|
||
# Should be 32+ characters for production
|
||
|
||
# Verify webhook secret is set
|
||
kubectl get secret gitea-webhook-secret -n tekton-pipelines -o jsonpath='{.data.secretToken}' | base64 -d | wc -c
|
||
# Should be 64 characters (hex-encoded 32 bytes)
|
||
|
||
# Verify no hardcoded URLs in tasks
|
||
kubectl get task update-gitops -n tekton-pipelines -o yaml | grep -c "bakery-ia.local"
|
||
# Should be 0
|
||
```
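
The length checks above can be wrapped into pass/fail helpers so they are easy to rerun after a credential rotation. This is a minimal sketch: the function names are hypothetical, and the secret names and keys are the ones used in the checklist.

```bash
# Sketch: turn the manual length checks into pass/fail results.

# check_min_len LABEL VALUE MIN: prints PASS/FAIL, returns non-zero on failure.
check_min_len() {
  if [ "${#2}" -ge "$3" ]; then
    printf 'PASS %s (%d chars)\n' "$1" "${#2}"
  else
    printf 'FAIL %s (%d chars, need %d)\n' "$1" "${#2}" "$3"
    return 1
  fi
}

# secret_value NAME NAMESPACE KEY: decode one key of a secret (needs cluster access).
secret_value() {
  kubectl get secret "$1" -n "$2" -o "jsonpath={.data.$3}" | base64 -d
}

# Usage (not run here):
#   check_min_len "gitea admin password" \
#     "$(secret_value gitea-admin-secret gitea password)" 32
#   check_min_len "webhook token" \
#     "$(secret_value gitea-webhook-secret tekton-pipelines secretToken)" 64
```

Unlike piping through `wc -c`, this never prints the secret itself, only its length.
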
|
||
|
||
---
|
||
|
||
## Mailu Email Server Deployment
|
||
|
||
Mailu is a full-featured, self-hosted email server with built-in antispam, webmail, and admin panel. **Outbound emails are relayed through Mailgun** for improved deliverability and to avoid IP reputation issues.
|
||
|
||
### Prerequisites
|
||
|
||
Before deploying Mailu:
|
||
- [ ] Unbound DNS resolver deployed (for DNSSEC validation)
|
||
- [ ] DNS records configured for mail domain
|
||
- [ ] TLS certificates available
|
||
- [ ] Mailgun account created and domain verified (for outbound email relay)
|
||
|
||
### Step 1: Deploy Unbound DNS Resolver
|
||
|
||
Mailu needs a DNSSEC-validating resolver: without one, the admin pod fails its startup check (see Mailu Troubleshooting). Unbound provides that validation inside the cluster.
|
||
|
||
```bash
|
||
# Deploy Unbound via Helm
|
||
helm upgrade --install unbound infrastructure/platform/networking/dns/unbound-helm \
|
||
-n bakery-ia \
|
||
--create-namespace \
|
||
-f infrastructure/platform/networking/dns/unbound-helm/values.yaml \
|
||
-f infrastructure/platform/networking/dns/unbound-helm/prod/values.yaml \
|
||
--timeout 5m \
|
||
--wait
|
||
|
||
# Verify Unbound is running
|
||
kubectl get pods -n bakery-ia | grep unbound
|
||
|
||
# Get Unbound service IP
|
||
UNBOUND_IP=$(kubectl get svc unbound-dns -n bakery-ia -o jsonpath='{.spec.clusterIP}')
|
||
echo "Unbound DNS IP: $UNBOUND_IP"
|
||
```
|
||
|
||
### Step 2: Configure CoreDNS for DNSSEC
|
||
|
||
```bash
# Get Unbound IP
UNBOUND_IP=$(kubectl get svc unbound-dns -n bakery-ia -o jsonpath='{.spec.clusterIP}')

# Create updated CoreDNS ConfigMap
cat > /tmp/coredns-config.yaml <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30
        }
        prometheus :9153
        forward . $UNBOUND_IP {
            max_concurrent 1000
        }
        cache 30 {
            disable success cluster.local
            disable denial cluster.local
        }
        loop
        reload
        loadbalance
    }
EOF

# Apply configuration
kubectl apply -f /tmp/coredns-config.yaml

# Restart CoreDNS
kubectl rollout restart deployment coredns -n kube-system
kubectl rollout status deployment coredns -n kube-system --timeout=60s

# Verify DNSSEC is working (the queried zone must itself be DNSSEC-signed)
kubectl run -it --rm dns-test --image=alpine --restart=Never -- \
  sh -c "apk add drill && drill -D ietf.org"
# Look for "ad" flag (authenticated data) in output
```
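
Spotting the `ad` flag by eye is easy to get wrong, so the check can be automated. The sketch below parses the `;; flags:` header line that `drill` and `dig` print; `has_ad_flag` is a hypothetical helper name, and `ietf.org` is used only as an example of a DNSSEC-signed zone.

```bash
# Sketch: succeed only if a drill/dig response header carries the "ad" flag.
# Reads the tool's output on stdin.
has_ad_flag() {
  # Keep the text between "flags:" and the next ";", then look for the word "ad".
  sed -n 's/^;; flags:\([^;]*\);.*/\1/p' | grep -qw ad
}

# Usage (not run here):
#   kubectl run -i --rm dns-test --image=alpine --restart=Never -- \
#     sh -c "apk add drill >/dev/null && drill -D ietf.org" \
#     | has_ad_flag && echo "resolver validates DNSSEC"
```
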
|
||
|
||
### Step 3: Configure Mailgun (External SMTP Relay)
|
||
|
||
Mailu uses Mailgun as an external SMTP relay for all outbound emails. This improves deliverability and avoids IP reputation issues common with self-hosted mail servers.
|
||
|
||
#### 3.1: Create Mailgun Account
|
||
|
||
1. Go to [https://www.mailgun.com](https://www.mailgun.com) and create an account
|
||
2. Add your domain (bakewise.ai) in the Mailgun dashboard
|
||
3. Verify domain ownership by adding the DNS records Mailgun provides
|
||
|
||
#### 3.2: Get SMTP Credentials
|
||
|
||
1. In Mailgun dashboard, go to **Domain Settings > SMTP credentials**
|
||
2. Note your credentials:
|
||
- **SMTP hostname:** `smtp.mailgun.org`
|
||
- **Port:** `587` (TLS/STARTTLS)
|
||
- **Username:** typically `postmaster@bakewise.ai`
|
||
- **Password:** your Mailgun SMTP password (NOT the API key)
|
||
|
||
#### 3.3: Create Kubernetes Secret for Mailgun
|
||
|
||
```bash
|
||
# Edit the secret template with your Mailgun credentials
|
||
nano infrastructure/platform/mail/mailu-helm/configs/mailgun-credentials-secret.yaml
|
||
|
||
# Replace the placeholder values:
|
||
# RELAY_USERNAME: "postmaster@bakewise.ai"
|
||
# RELAY_PASSWORD: "your-mailgun-smtp-password"
|
||
|
||
# Apply the secret
|
||
kubectl apply -f infrastructure/platform/mail/mailu-helm/configs/mailgun-credentials-secret.yaml -n bakery-ia
|
||
|
||
# Verify secret created
|
||
kubectl get secret mailu-mailgun-credentials -n bakery-ia
|
||
```
|
||
|
||
### Step 4: Configure DNS Records for Mail
|
||
|
||
Add these DNS records for your domain (e.g., bakewise.ai):
|
||
|
||
```
Type    Name      Value                                   TTL    Priority
A       mail      YOUR_VPS_IP                             Auto   -
MX      @         mail.bakewise.ai                        Auto   10
TXT     @         v=spf1 include:mailgun.org mx a ~all    Auto   -
TXT     _dmarc    v=DMARC1; p=quarantine; rua=...         Auto   -
```
|
||
|
||
**Mailgun-specific DNS records** (Mailgun will provide exact values):
|
||
```
Type    Name                     Value                      TTL
TXT     (provided by Mailgun)    (DKIM key from Mailgun)    Auto
TXT     (provided by Mailgun)    (DKIM key from Mailgun)    Auto
```
|
||
|
||
**Note:**
|
||
- The SPF record includes `mailgun.org` to authorize Mailgun to send on your behalf
|
||
- Add the DKIM records exactly as Mailgun provides them
|
||
- Mailu's own DKIM record will be added after deployment (Step 9)
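
Once the records have propagated, they can be checked from any machine. The sketch below is a minimal example, assuming `dig` is installed; `spf_includes` is a hypothetical helper that only tests whether an SPF string contains a given `include:` mechanism.

```bash
# Sketch: check that an SPF record authorizes a given include domain.
# $1 = SPF record text, $2 = domain expected in an include: mechanism
spf_includes() {
  case " $1 " in
    *" include:$2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (not run here; assumes `dig` is installed):
#   record=$(dig +short TXT bakewise.ai | tr -d '"' | grep '^v=spf1')
#   spf_includes "$record" mailgun.org && echo "SPF authorizes Mailgun"
#   dig +short MX bakewise.ai           # expect: 10 mail.bakewise.ai.
#   dig +short TXT _dmarc.bakewise.ai   # expect a record starting with v=DMARC1
```
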
|
||
|
||
### Step 5: Create TLS Certificate Secret
|
||
|
||
```bash
|
||
# Generate self-signed certificate for internal Mailu use
|
||
# (Ingress handles external TLS termination)
|
||
TEMP_DIR=$(mktemp -d)
|
||
cd "$TEMP_DIR"
|
||
|
||
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
|
||
-keyout tls.key -out tls.crt \
|
||
-subj "/CN=mail.bakewise.ai/O=bakewise"
|
||
|
||
kubectl create secret tls mailu-certificates \
|
||
--cert=tls.crt \
|
||
--key=tls.key \
|
||
-n bakery-ia
|
||
|
||
rm -rf "$TEMP_DIR"
|
||
|
||
# Verify secret created
|
||
kubectl get secret mailu-certificates -n bakery-ia
|
||
```
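
Because this certificate is self-signed with a 365-day lifetime, it has to be regenerated before it expires. The sketch below shows one way to check the remaining validity with `openssl x509 -checkend`; the helper name `cert_valid_for_days` and the `/tmp/mailu.crt` path are hypothetical.

```bash
# Sketch: succeed if the certificate in $1 is still valid $2 days from now.
cert_valid_for_days() {
  openssl x509 -in "$1" -noout -checkend $(( $2 * 86400 )) >/dev/null
}

# Usage (not run here):
#   kubectl get secret mailu-certificates -n bakery-ia \
#     -o jsonpath='{.data.tls\.crt}' | base64 -d > /tmp/mailu.crt
#   cert_valid_for_days /tmp/mailu.crt 30 || echo "Renew the Mailu certificate soon"
```
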
|
||
|
||
### Step 6: Create Admin Credentials Secret
|
||
|
||
The admin account is created automatically during Helm deployment using the `initialAccount` feature. Create a secret with the admin password before deploying.
|
||
|
||
```bash
|
||
# Generate a secure password (or use your own)
|
||
ADMIN_PASSWORD=$(openssl rand -base64 16 | tr -d '/+=' | head -c 16)
|
||
echo "Admin password: $ADMIN_PASSWORD"
|
||
echo "SAVE THIS PASSWORD SECURELY!"
|
||
|
||
# Create the admin credentials secret
|
||
kubectl create secret generic mailu-admin-credentials \
|
||
--from-literal=password="$ADMIN_PASSWORD" \
|
||
-n bakery-ia
|
||
|
||
# Verify secret created
|
||
kubectl get secret mailu-admin-credentials -n bakery-ia
|
||
```
|
||
|
||
**Alternative:** Use the provided template file:
|
||
```bash
|
||
# Edit the secret template with your password (base64 encoded)
|
||
nano infrastructure/platform/mail/mailu-helm/configs/mailu-admin-credentials-secret.yaml
|
||
|
||
# Apply the secret
|
||
kubectl apply -f infrastructure/platform/mail/mailu-helm/configs/mailu-admin-credentials-secret.yaml
|
||
```
|
||
|
||
### Step 7: Deploy Mailu via Helm
|
||
|
||
```bash
|
||
# Add Mailu Helm repository
|
||
helm repo add mailu https://mailu.github.io/helm-charts
|
||
helm repo update mailu
|
||
|
||
# Deploy Mailu with production values
|
||
# Note:
|
||
# - externalRelay uses Mailgun via the secret created in Step 3
|
||
# - initialAccount creates admin user automatically using the secret from Step 6
|
||
helm upgrade --install mailu mailu/mailu \
|
||
-n bakery-ia \
|
||
--create-namespace \
|
||
-f infrastructure/platform/mail/mailu-helm/values.yaml \
|
||
-f infrastructure/platform/mail/mailu-helm/prod/values.yaml \
|
||
--timeout 10m
|
||
|
||
# Wait for pods to be ready (ClamAV may take 5-10 minutes)
|
||
kubectl get pods -n bakery-ia -l app.kubernetes.io/instance=mailu -w
|
||
|
||
# The admin user (admin@bakewise.ai) is created automatically!
|
||
```
|
||
|
||
### Step 8: Apply Mailu Ingress
|
||
|
||
```bash
|
||
# Apply Mailu-specific ingress configuration
|
||
kubectl apply -f infrastructure/platform/mail/mailu-helm/mailu-ingress.yaml
|
||
|
||
# Verify ingress
|
||
kubectl get ingress -n bakery-ia | grep mailu
|
||
```
|
||
|
||
**Admin Credentials (created automatically in Step 7):**
|
||
- **Email:** `admin@bakewise.ai`
|
||
- **Password:** The password you set in Step 6 (stored in `mailu-admin-credentials` secret)
|
||
|
||
To retrieve the password later:
|
||
```bash
|
||
kubectl get secret mailu-admin-credentials -n bakery-ia -o jsonpath='{.data.password}' | base64 -d
|
||
```
|
||
|
||
### Step 9: Configure DKIM
|
||
|
||
```bash
|
||
# Get DKIM public key from Mailu
|
||
kubectl exec -n bakery-ia deployment/mailu-admin -- \
|
||
cat /dkim/bakewise.ai.dkim.pub
|
||
|
||
# Add DKIM record to DNS:
|
||
# Type: TXT
|
||
# Name: dkim._domainkey
|
||
# Value: (output from above command)
|
||
```
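
If the file contains a PEM-encoded public key rather than a ready-made TXT value, it has to be flattened before it can be pasted into DNS. The sketch below assumes a PEM file and an RSA key (both are assumptions about what Mailu writes to `/dkim`), and `dkim_txt_value` is a hypothetical helper name.

```bash
# Sketch: build a DKIM TXT value from a PEM public key read on stdin.
# Assumes PEM input (BEGIN/END lines around base64 text) and an RSA key.
dkim_txt_value() (
  key=$(grep -v -e '^-----' | tr -d ' \r\n')   # drop PEM armor and line breaks
  printf 'v=DKIM1; k=rsa; p=%s\n' "$key"
)

# Usage (not run here):
#   kubectl exec -n bakery-ia deployment/mailu-admin -- \
#     cat /dkim/bakewise.ai.dkim.pub | dkim_txt_value
```
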
|
||
|
||
### Step 10: Verify Email Setup
|
||
|
||
```bash
|
||
# Check all Mailu pods are running
|
||
kubectl get pods -n bakery-ia | grep mailu
|
||
# Expected: All pods in Running state
|
||
|
||
# Verify Mailgun secret is configured
|
||
kubectl get secret mailu-mailgun-credentials -n bakery-ia
|
||
kubectl get secret mailu-mailgun-credentials -n bakery-ia -o jsonpath='{.data.RELAY_USERNAME}' | base64 -d
|
||
# Should show: postmaster@bakewise.ai
|
||
|
||
# Test internal SMTP connectivity
|
||
kubectl run -it --rm smtp-test --image=alpine --restart=Never -- \
|
||
sh -c "apk add swaks && swaks --to test@example.com --from admin@bakewise.ai --server mailu-front.bakery-ia.svc.cluster.local:25"
|
||
|
||
# Test outbound email via Mailgun relay (send test email)
|
||
kubectl exec -it -n bakery-ia deployment/mailu-admin -- \
|
||
  flask mailu alias test bakewise.ai 'your-personal-email@gmail.com'
|
||
# Then send a test email from webmail to your personal email
|
||
|
||
# Access webmail (via port-forward for testing)
|
||
kubectl port-forward -n bakery-ia svc/mailu-front 8080:80
|
||
# Open: http://localhost:8080/webmail
|
||
```
|
||
|
||
### Mailu Endpoints
|
||
|
||
| Service | URL/Address |
|
||
|---------|-------------|
|
||
| Admin Panel | https://mail.bakewise.ai/admin |
|
||
| Webmail | https://mail.bakewise.ai/webmail |
|
||
| SMTP (STARTTLS) | mail.bakewise.ai:587 |
|
||
| SMTP (SSL) | mail.bakewise.ai:465 |
|
||
| IMAP (SSL) | mail.bakewise.ai:993 |
|
||
|
||
### Mailu Troubleshooting
|
||
|
||
#### Admin Pod CrashLoopBackOff with DNSSEC Error
|
||
|
||
```bash
|
||
# Verify CoreDNS is forwarding to Unbound
|
||
kubectl get configmap coredns -n kube-system -o yaml | grep forward
|
||
# Should show: forward . <unbound-ip>
|
||
|
||
# If not configured, re-run Step 2
|
||
```
|
||
|
||
#### Front Pod Stuck in ContainerCreating
|
||
|
||
```bash
|
||
# Check for missing certificate secret
|
||
kubectl describe pod -n bakery-ia -l app.kubernetes.io/component=front | grep -A5 Events
|
||
|
||
# If missing mailu-certificates, re-run Step 5
|
||
```
|
||
|
||
#### Cannot Connect to Redis
|
||
|
||
```bash
|
||
# Verify internal Redis is enabled (not external)
|
||
helm get values mailu -n bakery-ia | grep -A5 externalRedis
|
||
# Should show: enabled: false
|
||
|
||
# If enabled: true, upgrade with correct values
|
||
helm upgrade mailu mailu/mailu -n bakery-ia \
|
||
-f infrastructure/platform/mail/mailu-helm/values.yaml \
|
||
-f infrastructure/platform/mail/mailu-helm/prod/values.yaml
|
||
```
|
||
|
||
#### Outbound Emails Not Delivered (Mailgun Relay Issues)
|
||
|
||
```bash
|
||
# Check if Mailgun credentials secret exists
|
||
kubectl get secret mailu-mailgun-credentials -n bakery-ia
|
||
# If missing, create it (see Step 3)
|
||
|
||
# Verify credentials are set correctly
|
||
kubectl get secret mailu-mailgun-credentials -n bakery-ia -o jsonpath='{.data.RELAY_USERNAME}' | base64 -d
|
||
# Should show your Mailgun username (e.g., postmaster@bakewise.ai)
|
||
|
||
# Check Postfix logs for relay errors
|
||
kubectl logs -n bakery-ia deployment/mailu-postfix | grep -i "relay\|mailgun\|sasl"
|
||
# Look for authentication errors or connection failures
|
||
|
||
# Verify Mailgun domain is verified
|
||
# Go to Mailgun dashboard > Domain Settings > DNS Records
|
||
# All records should show "Verified" status
|
||
|
||
# Test Mailgun SMTP connectivity directly
|
||
kubectl run -it --rm mailgun-test --image=alpine --restart=Never -- \
|
||
sh -c "apk add swaks && swaks --to test@example.com --from postmaster@bakewise.ai \
|
||
--server smtp.mailgun.org:587 --tls \
|
||
--auth-user 'postmaster@bakewise.ai' \
|
||
--auth-password 'YOUR_MAILGUN_PASSWORD'"
|
||
```
|
||
|
||
#### Emails Going to Spam
|
||
|
||
1. Verify SPF record includes Mailgun: `v=spf1 include:mailgun.org mx a ~all`
|
||
2. Check DKIM records are properly configured in both Mailgun and Mailu
|
||
3. Verify DMARC record is set
|
||
4. Check your domain reputation at [mail-tester.com](https://www.mail-tester.com)
|
||
|
||
---
|
||
|
||
## Nominatim Geocoding Service
|
||
|
||
Nominatim provides geocoding (address to coordinates) and reverse geocoding for delivery and distribution features.
|
||
|
||
### When to Deploy
|
||
|
||
Deploy Nominatim if you need:
|
||
- Address autocomplete in the frontend
|
||
- Delivery route optimization
|
||
- Location-based analytics
|
||
|
||
### Step 1: Deploy Nominatim via Helm
|
||
|
||
```bash
|
||
# Deploy Nominatim with production values
|
||
helm upgrade --install nominatim infrastructure/platform/nominatim/nominatim-helm \
|
||
-n bakery-ia \
|
||
--create-namespace \
|
||
-f infrastructure/platform/nominatim/nominatim-helm/values.yaml \
|
||
-f infrastructure/platform/nominatim/nominatim-helm/prod/values.yaml \
|
||
--timeout 15m \
|
||
--wait
|
||
|
||
# Verify deployment
|
||
kubectl get pods -n bakery-ia | grep nominatim
|
||
```
|
||
|
||
**Note:** Initial deployment may take 10-15 minutes as Nominatim downloads and processes geographic data.
|
||
|
||
### Step 2: Verify Nominatim Service
|
||
|
||
```bash
|
||
# Check pod status
|
||
kubectl get pods -n bakery-ia -l app=nominatim
|
||
|
||
# Check service
|
||
kubectl get svc -n bakery-ia | grep nominatim
|
||
|
||
# Test geocoding endpoint
|
||
kubectl run -it --rm curl-test --image=curlimages/curl --restart=Never -- \
|
||
curl "http://nominatim-service.bakery-ia.svc.cluster.local:8080/search?q=Madrid&format=json"
|
||
```
|
||
|
||
### Step 3: Configure Application to Use Nominatim
|
||
|
||
Update the application ConfigMap to use the internal Nominatim service:
|
||
|
||
```bash
|
||
# Edit configmap
|
||
kubectl edit configmap bakery-ia-config -n bakery-ia
|
||
|
||
# Set:
|
||
# NOMINATIM_URL: "http://nominatim-service.bakery-ia.svc.cluster.local:8080"
|
||
```
|
||
|
||
### Nominatim Service Information
|
||
|
||
| Property | Value |
|
||
|----------|-------|
|
||
| Service Name | nominatim-service.bakery-ia.svc.cluster.local |
|
||
| Port | 8080 |
|
||
| Health Check | http://nominatim-service:8080/status |
|
||
| Search Endpoint | /search?q={query}&format=json |
|
||
| Reverse Endpoint | /reverse?lat={lat}&lon={lon}&format=json |
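
The endpoints in the table can be exercised from any pod in the cluster. The sketch below is a minimal example with hypothetical helper names; it lets `curl` URL-encode free-text queries, so addresses containing spaces or commas work unchanged.

```bash
# Sketch: small helpers around the Nominatim endpoints listed above.
NOMINATIM_BASE="http://nominatim-service.bakery-ia.svc.cluster.local:8080"

# Build a reverse-geocoding URL from latitude ($1) and longitude ($2).
reverse_url() {
  printf '%s/reverse?lat=%s&lon=%s&format=json\n' "$NOMINATIM_BASE" "$1" "$2"
}

# Forward search for a free-text address ($1); curl encodes the query string.
# Needs network access to the service (for example from a pod in the cluster).
search_address() {
  curl -sG "$NOMINATIM_BASE/search" \
    --data-urlencode "q=$1" \
    --data-urlencode "format=json"
}

# Usage (not run here):
#   search_address "Calle Mayor 1, Madrid"
#   curl -s "$(reverse_url 40.4168 -3.7038)"
```
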
|
||
|
||
---
|
||
|
||
## SigNoz Monitoring Deployment
|
||
|
||
SigNoz provides unified observability (traces, metrics, logs) for the entire platform.
|
||
|
||
### Step 1: Deploy SigNoz via Helm
|
||
|
||
```bash
|
||
# Add SigNoz Helm repository
|
||
helm repo add signoz https://charts.signoz.io
|
||
helm repo update signoz
|
||
|
||
# Deploy SigNoz into bakery-ia namespace
|
||
helm upgrade --install signoz signoz/signoz \
|
||
-n bakery-ia \
|
||
-f infrastructure/monitoring/signoz/signoz-values-prod.yaml \
|
||
--set frontend.service.type=ClusterIP \
|
||
--set clickhouse.persistence.size=20Gi \
|
||
--set clickhouse.persistence.storageClass=microk8s-hostpath \
|
||
--timeout 15m \
|
||
--wait
|
||
|
||
# Wait for all components (may take 10-15 minutes)
|
||
kubectl wait --for=condition=ready pod \
|
||
-l app.kubernetes.io/instance=signoz \
|
||
-n bakery-ia \
|
||
--timeout=900s
|
||
|
||
# Verify deployment
|
||
kubectl get pods -n bakery-ia -l app.kubernetes.io/instance=signoz
|
||
```
|
||
|
||
### Step 2: Configure Ingress for SigNoz
|
||
|
||
```bash
# Apply SigNoz ingress (if not already included in overlays)
cat <<EOF | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: signoz-ingress
  namespace: bakery-ia
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-production
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: public
  tls:
    - hosts:
        - monitoring.bakewise.ai
      secretName: signoz-tls-cert
  rules:
    - host: monitoring.bakewise.ai
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: signoz-frontend
                port:
                  number: 3301
EOF
```
|
||
|
||
### Step 3: Verify SigNoz Access
|
||
|
||
```bash
|
||
# Check ingress
|
||
kubectl get ingress -n bakery-ia | grep signoz
|
||
|
||
# Test access
|
||
curl -I https://monitoring.bakewise.ai
|
||
|
||
# Access SigNoz UI
|
||
# URL: https://monitoring.bakewise.ai
|
||
# First visit: SigNoz prompts you to register the initial admin account; store those credentials securely
|
||
```
|
||
|
||
### SigNoz Endpoints
|
||
|
||
| Service | URL |
|
||
|---------|-----|
|
||
| SigNoz UI | https://monitoring.bakewise.ai |
|
||
| AlertManager | https://monitoring.bakewise.ai/alertmanager |
|
||
| OTel Collector (gRPC) | signoz-otel-collector:4317 (internal) |
|
||
| OTel Collector (HTTP) | signoz-otel-collector:4318 (internal) |
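
Services reach the collector through these internal endpoints. As a rough sketch, the standard OpenTelemetry SDK environment variables below point a workload at the collector; whether the Bakery-IA services read these variables (rather than their own settings) is an assumption, and `otel_env` is a hypothetical helper name.

```bash
# Sketch: print standard OTel SDK environment variables for one service.
# $1 = service name, $2 = protocol: grpc (port 4317) or http/protobuf (port 4318)
otel_env() (
  case "$2" in
    grpc) port=4317 ;;
    http/protobuf) port=4318 ;;
    *) echo "unknown protocol: $2" >&2; exit 1 ;;
  esac
  printf 'OTEL_SERVICE_NAME=%s\n' "$1"
  printf 'OTEL_EXPORTER_OTLP_PROTOCOL=%s\n' "$2"
  printf 'OTEL_EXPORTER_OTLP_ENDPOINT=http://signoz-otel-collector.bakery-ia.svc.cluster.local:%s\n' "$port"
)

# Usage (not run here):
#   kubectl set env deployment/auth-service -n bakery-ia $(otel_env auth-service grpc)
```
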
|
||
|
||
---
|
||
|
||
## Verification & Testing
|
||
|
||
### Step 1: Check All Pods Running
|
||
|
||
```bash
|
||
# View all pods
|
||
kubectl get pods -n bakery-ia
|
||
|
||
# Expected: All pods in "Running" state, none in CrashLoopBackOff
|
||
|
||
# Check for issues
|
||
kubectl get pods -n bakery-ia | grep -vE "Running|Completed"
|
||
|
||
# View logs for any problematic pods
|
||
kubectl logs -n bakery-ia POD_NAME
|
||
```
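
The `grep -vE "Running|Completed"` filter also prints the header row and misses pods that are `Running` but whose containers are not ready (for example `0/1`). The sketch below is a stricter filter over the same `kubectl get pods` output; `unhealthy_pods` is a hypothetical helper name.

```bash
# Sketch: list unhealthy pods, reading `kubectl get pods` output on stdin.
# Flags pods whose STATUS is neither Running nor Completed, and Running pods
# whose READY column (e.g. 0/1) shows containers that are not ready.
unhealthy_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")
    if ($3 == "Completed") next
    if ($3 != "Running" || ready[1] != ready[2]) print $1, $2, $3
  }'
}

# Usage (not run here):
#   kubectl get pods -n bakery-ia | unhealthy_pods
```

An empty result means every pod is either fully ready or has completed.
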
|
||
|
||
### Step 2: Check Services and Ingress
|
||
|
||
```bash
|
||
# View services
|
||
kubectl get svc -n bakery-ia
|
||
|
||
# View ingress
|
||
kubectl get ingress -n bakery-ia
|
||
|
||
# View certificates (should auto-issue from Let's Encrypt)
|
||
kubectl get certificate -n bakery-ia
|
||
|
||
# Describe certificate to check status
|
||
kubectl describe certificate bakery-ia-prod-tls-cert -n bakery-ia
|
||
```
|
||
|
||
### Step 3: Test Database Connections
|
||
|
||
```bash
|
||
# Test PostgreSQL TLS
|
||
kubectl exec -n bakery-ia deployment/auth-db -- sh -c \
|
||
'psql -U auth_user -d auth_db -c "SHOW ssl;"'
|
||
# Expected output: on
|
||
|
||
# Test Redis TLS
|
||
kubectl exec -n bakery-ia deployment/redis -- redis-cli \
|
||
--tls \
|
||
--cert /tls/redis-cert.pem \
|
||
--key /tls/redis-key.pem \
|
||
--cacert /tls/ca-cert.pem \
|
||
  -a "$REDIS_PASSWORD" \
|
||
ping
|
||
# Expected output: PONG
|
||
```
|
||
|
||
### Step 4: Test Frontend Access
|
||
|
||
```bash
|
||
# Test frontend (replace with your domain)
|
||
curl -I https://bakery.yourdomain.com
|
||
|
||
# Expected: HTTP/2 200
|
||
|
||
# Test API health
|
||
curl https://api.yourdomain.com/health
|
||
|
||
# Expected: {"status": "healthy"}
|
||
```
|
||
|
||
### Step 5: Test Authentication
|
||
|
||
```bash
|
||
# Create a test user (using your frontend or API)
|
||
curl -X POST https://api.yourdomain.com/api/v1/auth/register \
|
||
-H "Content-Type: application/json" \
|
||
-d '{
|
||
"email": "test@yourdomain.com",
|
||
"password": "TestPassword123!",
|
||
"name": "Test User"
|
||
}'
|
||
|
||
# Login
|
||
curl -X POST https://api.yourdomain.com/api/v1/auth/login \
|
||
-H "Content-Type: application/json" \
|
||
-d '{
|
||
"email": "test@yourdomain.com",
|
||
"password": "TestPassword123!"
|
||
}'
|
||
|
||
# Expected: JWT token in response
|
||
```
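
To confirm that the returned token carries the expected claims, its payload can be decoded locally. The sketch below relies only on the standard JWT layout (three base64url segments separated by dots); the name of the response field that holds the token is not specified in this guide, so the helper takes the token itself, and `jwt_payload` is a hypothetical name.

```bash
# Sketch: print the decoded payload (claims) of a JWT passed as $1.
# This does not verify the signature; it only decodes the middle segment.
jwt_payload() (
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url omits padding; restore it so `base64 -d` accepts the input.
  case $(( ${#payload} % 4 )) in
    2) payload="$payload==" ;;
    3) payload="$payload=" ;;
  esac
  printf '%s' "$payload" | base64 -d
)

# Usage (not run here): paste the token from the login response
#   jwt_payload "$TOKEN"
```
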
|
||
|
||
### Step 6: Test Email Delivery
|
||
|
||
```bash
|
||
# Trigger a password reset to test email
|
||
curl -X POST https://api.yourdomain.com/api/v1/auth/forgot-password \
|
||
-H "Content-Type: application/json" \
|
||
-d '{"email": "test@yourdomain.com"}'
|
||
|
||
# Check your email inbox for the reset link
|
||
# Check service logs if email not received:
|
||
kubectl logs -n bakery-ia deployment/auth-service | grep -i "email\|smtp"
|
||
```
|
||
|
||
### Step 7: Test WhatsApp (Optional)
|
||
|
||
```bash
|
||
# Send a test WhatsApp message
|
||
# This requires creating a tenant and configuring WhatsApp in the UI
|
||
# Or test via API once authenticated
|
||
```
|
||
|
||
---
|
||
|
||
## Post-Deployment
|
||
|
||
### Step 1: Access SigNoz Monitoring Stack
|
||
|
||
Your production deployment includes **SigNoz**, a unified observability platform that provides complete visibility into your application.
|
||
|
||
#### What is SigNoz?
|
||
|
||
SigNoz is an **open-source, all-in-one observability platform** that provides:
|
||
- **📊 Distributed Tracing** - See end-to-end request flows across all 18 microservices
|
||
- **📈 Metrics Monitoring** - Application performance and infrastructure metrics
|
||
- **📝 Log Management** - Centralized logs from all services with trace correlation
|
||
- **🔍 Service Performance Monitoring (SPM)** - Automatic RED metrics (Rate, Error, Duration)
|
||
- **🗄️ Database Monitoring** - All 14 PostgreSQL databases + Redis + RabbitMQ
|
||
- **☸️ Kubernetes Monitoring** - Cluster, node, pod, and container metrics
|
||
|
||
**Why SigNoz instead of Prometheus/Grafana?**
|
||
- Single unified UI for traces, metrics, and logs (no context switching)
|
||
- Automatic service dependency mapping
|
||
- Built-in APM (Application Performance Monitoring)
|
||
- Log-trace correlation with one click
|
||
- Better query performance with ClickHouse backend
|
||
- Modern UI designed for microservices
|
||
|
||
#### Production Monitoring URLs
|
||
|
||
Access via domain:
|
||
```
|
||
https://monitoring.bakewise.ai/signoz # SigNoz - Main observability UI
|
||
https://monitoring.bakewise.ai/alertmanager # AlertManager - Alert management
|
||
```
|
||
|
||
Or via port forwarding (if needed):
|
||
```bash
|
||
# SigNoz Frontend (Main UI)
|
||
kubectl port-forward -n bakery-ia svc/signoz 8080:8080 &
|
||
# Open: http://localhost:8080
|
||
|
||
# SigNoz AlertManager
|
||
kubectl port-forward -n bakery-ia svc/signoz-alertmanager 9093:9093 &
|
||
# Open: http://localhost:9093
|
||
|
||
# OTel Collector (for debugging)
|
||
kubectl port-forward -n bakery-ia svc/signoz-otel-collector 4317:4317 & # gRPC
|
||
kubectl port-forward -n bakery-ia svc/signoz-otel-collector 4318:4318 & # HTTP
|
||
```
|
||
|
||
#### Key SigNoz Features to Explore
|
||
|
||
Once you open SigNoz (https://monitoring.bakewise.ai/signoz), explore these tabs:
|
||
|
||
**1. Services Tab - Application Performance**
|
||
- View all 18 microservices with live metrics
|
||
- See request rate, error rate, and latency (P50/P90/P99)
|
||
- Click on any service to drill down into operations
|
||
- Identify slow endpoints and error-prone operations
|
||
|
||
**2. Traces Tab - Request Flow Visualization**
|
||
- See complete request journeys across services
|
||
- Identify bottlenecks (slow database queries, API calls)
|
||
- Debug errors with full stack traces
|
||
- Correlate with logs for complete context
|
||
|
||
**3. Dashboards Tab - Infrastructure & Database Metrics**
|
||
- **PostgreSQL** - Monitor all 14 databases (connections, queries, cache hit ratio)
|
||
- **Redis** - Cache performance (memory, hit rate, commands/sec)
|
||
- **RabbitMQ** - Message queue health (depth, rates, consumers)
|
||
- **Kubernetes** - Cluster metrics (nodes, pods, containers)
|
||
|
||
**4. Logs Tab - Centralized Log Management**
|
||
- Search and filter logs from all services
|
||
- Click on trace ID in logs to see related request trace
|
||
- Auto-enriched with Kubernetes metadata (pod, namespace, container)
|
||
- Identify patterns and anomalies
|
||
|
||
**5. Alerts Tab - Proactive Monitoring**
|
||
- Configure alerts on metrics, traces, or logs
|
||
- Email/Slack/Webhook notifications
|
||
- View firing alerts and alert history
|
||
|
||
#### Quick Health Check
|
||
|
||
```bash
|
||
# Verify SigNoz components are running
|
||
kubectl get pods -n bakery-ia -l app.kubernetes.io/instance=signoz
|
||
|
||
# Expected output:
|
||
# signoz-0 READY 1/1
|
||
# signoz-otel-collector-xxx READY 1/1
|
||
# signoz-alertmanager-xxx READY 1/1
|
||
# signoz-clickhouse-xxx READY 1/1
|
||
# signoz-zookeeper-xxx READY 1/1
|
||
|
||
# Check OTel Collector health
|
||
kubectl exec -n bakery-ia deployment/signoz-otel-collector -- wget -qO- http://localhost:13133
|
||
|
||
# View recent telemetry in OTel Collector logs
|
||
kubectl logs -n bakery-ia deployment/signoz-otel-collector --tail=50 | grep -i "traces\|metrics\|logs"
|
||
```
|
||

#### Verify Telemetry is Working

1. **Check Services are Reporting:**
   ```bash
   # Open SigNoz and navigate to Services tab
   # You should see all 18 microservices listed

   # If services are missing, check if they're sending telemetry:
   kubectl logs -n bakery-ia deployment/auth-service | grep -i "telemetry\|otel"
   ```

2. **Check Database Metrics:**
   ```bash
   # Navigate to Dashboards → PostgreSQL in SigNoz
   # You should see metrics from all 14 databases

   # Verify OTel Collector is scraping databases:
   kubectl logs -n bakery-ia deployment/signoz-otel-collector | grep postgresql
   ```

3. **Check Traces are Being Collected:**
   ```bash
   # Make a test API request
   curl https://bakewise.ai/api/v1/health

   # Navigate to Traces tab in SigNoz
   # Search for "gateway" service
   # You should see the trace for your request
   ```

4. **Check Logs are Being Collected:**
   ```bash
   # Navigate to Logs tab in SigNoz
   # Filter by namespace: bakery-ia
   # You should see logs from all pods

   # Verify filelog receiver is working:
   kubectl logs -n bakery-ia deployment/signoz-otel-collector | grep filelog
   ```
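
If several services are missing from the Services tab, the per-service log check from item 1 can be repeated in a loop. This is a rough sketch under assumptions: the function name `deployments_without_telemetry`, the `NAMESPACE` variable, and the 500-line log window are illustrative choices, and a service that simply logs nothing about OpenTelemetry will also be reported.

```bash
# Print each deployment (one name per line on stdin) whose recent logs
# contain no line mentioning "telemetry" or "otel".
deployments_without_telemetry() {
  local ns="${NAMESPACE:-bakery-ia}" name
  while read -r name; do
    [ -n "$name" ] || continue
    # </dev/null keeps kubectl from consuming the list being read by the loop
    if ! kubectl logs -n "$ns" "deployment/$name" --tail=500 </dev/null 2>/dev/null \
        | grep -qiE 'telemetry|otel'; then
      echo "$name"
    fi
  done
}

# Usage:
# printf '%s\n' auth-service orders-service | deployments_without_telemetry
```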

### Step 2: Configure CI/CD Infrastructure (Optional but Recommended)

If you deployed the CI/CD infrastructure, configure it for your workflow:

#### Gitea Setup (Git Server + Registry)
```bash
# Access Gitea at: http://gitea.bakery-ia.local (for dev) or http://gitea.bakewise.ai (for prod)
# Make sure to add the appropriate hostname to /etc/hosts or configure DNS

# Create your repositories for each service
# Configure webhook to trigger Tekton pipelines
```

#### Tekton Pipeline Configuration
```bash
# Verify Tekton pipelines are running
kubectl get pods -n tekton-pipelines

# Create a PipelineRun manually to test:
kubectl create -f - <<EOF
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: manual-ci-run
  namespace: tekton-pipelines
spec:
  pipelineRef:
    name: bakery-ia-ci
  workspaces:
    - name: shared-workspace
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 5Gi
    - name: docker-credentials
      secret:
        secretName: gitea-registry-credentials
  params:
    - name: git-url
      value: "http://gitea.bakery-ia.local/bakery-admin/bakery-ia.git"
    - name: git-revision
      value: "main"
EOF
```
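
After creating the test PipelineRun, it helps to wait for a definite result instead of re-running `kubectl get` by hand. The sketch below polls the `Succeeded` condition that Tekton sets on a PipelineRun. The function name `wait_for_pipelinerun` and its default poll count and interval are assumptions made for this example.

```bash
# Poll a PipelineRun until its Succeeded condition is True or False.
# Args: name [namespace] [max_polls] [sleep_seconds]
# Returns 0 on success, 1 on failure, 2 on timeout.
wait_for_pipelinerun() {
  local name="$1" ns="${2:-tekton-pipelines}" max="${3:-60}" pause="${4:-10}"
  local i=0 status
  while [ "$i" -lt "$max" ]; do
    status=$(kubectl get pipelinerun "$name" -n "$ns" \
      -o jsonpath='{.status.conditions[?(@.type=="Succeeded")].status}')
    case "$status" in
      True)  echo "PipelineRun $name succeeded"; return 0 ;;
      False) echo "PipelineRun $name failed"; return 1 ;;
    esac
    i=$((i + 1))        # status is still Unknown or empty: keep waiting
    sleep "$pause"
  done
  echo "Timed out waiting for PipelineRun $name"
  return 2
}

# Usage:
# wait_for_pipelinerun manual-ci-run tekton-pipelines
```

On failure, `kubectl describe pipelinerun manual-ci-run -n tekton-pipelines` usually shows which task failed.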

#### Flux CD Configuration (GitOps)
```bash
# Verify Flux is running
kubectl get pods -n flux-system

# Set up GitRepository and Kustomization resources for GitOps deployment
# Example:
# (spec.validation is not part of the kustomize.toolkit.fluxcd.io/v1 API, so it is omitted)
cat <<EOF | kubectl apply -f -
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: bakery-ia
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/your-org/bakery-ia.git
  ref:
    branch: main
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: bakery-ia
  namespace: flux-system
spec:
  interval: 5m
  sourceRef:
    kind: GitRepository
    name: bakery-ia
  path: ./infrastructure/environments/prod/k8s-manifests
  prune: true
EOF
```
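
Once the two resources are applied, Flux reports reconciliation state through a `Ready` condition on the Kustomization. A minimal check, assuming the resource names from the example above (the helper name `kustomization_ready` is hypothetical), could look like this:

```bash
# Succeeds (exit 0) when the named Flux Kustomization reports Ready=True.
# Args: name [namespace]
kustomization_ready() {
  local name="$1" ns="${2:-flux-system}" status
  status=$(kubectl get kustomizations.kustomize.toolkit.fluxcd.io "$name" -n "$ns" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}')
  [ "$status" = "True" ]
}

# Usage:
# kustomization_ready bakery-ia && echo "GitOps sync healthy"
```

The fully qualified resource name avoids ambiguity if other CRDs in the cluster also use the kind `Kustomization`.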

### Step 2: Configure Alerting

SigNoz includes integrated alerting with AlertManager. Configure it for your team:

#### Update Email Notification Settings

The alerting configuration is in the SigNoz Helm values. To update:

```bash
# For production, edit the values file:
nano infrastructure/helm/signoz-values-prod.yaml

# Update the alertmanager.config section:
# 1. Update SMTP settings:
#    - smtp_from: 'your-alerts@bakewise.ai'
#    - smtp_auth_username: 'your-alerts@bakewise.ai'
#    - smtp_auth_password: (use Kubernetes secret)
#
# 2. Update receivers:
#    - critical-alerts email: critical-alerts@bakewise.ai
#    - warning-alerts email: oncall@bakewise.ai
#
# 3. (Optional) Add Slack webhook for critical alerts

# Apply the updated configuration:
helm upgrade signoz signoz/signoz \
  -n bakery-ia \
  -f infrastructure/helm/signoz-values-prod.yaml
```

#### Create Alerts in SigNoz UI

1. **Open SigNoz Alerts Tab:**
   ```
   https://monitoring.bakewise.ai/signoz → Alerts
   ```

2. **Create Common Alerts:**

   **Alert 1: High Error Rate**
   - Name: `HighErrorRate`
   - Query: `error_rate > 5` for `5 minutes`
   - Severity: `critical`
   - Description: "Service {{service_name}} has error rate >5%"

   **Alert 2: High Latency**
   - Name: `HighLatency`
   - Query: `P99_latency > 3000ms` for `5 minutes`
   - Severity: `warning`
   - Description: "Service {{service_name}} P99 latency >3s"

   **Alert 3: Service Down**
   - Name: `ServiceDown`
   - Query: `request_rate == 0` for `2 minutes`
   - Severity: `critical`
   - Description: "Service {{service_name}} not receiving requests"

   **Alert 4: Database Connection Issues**
   - Name: `DatabaseConnectionsHigh`
   - Query: `pg_active_connections > 80` for `5 minutes`
   - Severity: `warning`
   - Description: "Database {{database}} has more than 80 active connections"

   **Alert 5: High Memory Usage**
   - Name: `HighMemoryUsage`
   - Query: `container_memory_percent > 85` for `5 minutes`
   - Severity: `warning`
   - Description: "Pod {{pod_name}} using >85% memory"
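
To make the `HighErrorRate` threshold concrete: the rate is failed requests divided by total requests, expressed as a percentage. The sketch below works through that arithmetic in shell. It is only an illustration of the threshold, not how SigNoz evaluates the rule, and both function names are hypothetical.

```bash
# Percentage of failed requests, printed with two decimals.
# Args: error_count total_count
error_rate_percent() {
  awk -v e="$1" -v t="$2" \
    'BEGIN { if (t == 0) print "0.00"; else printf "%.2f\n", e * 100 / t }'
}

# Exit 0 when the rate is strictly above the threshold (default 5 percent).
# Args: error_count total_count [threshold_percent]
error_rate_exceeds() {
  awk -v r="$(error_rate_percent "$1" "$2")" -v th="${3:-5}" \
    'BEGIN { exit !(r > th) }'
}

# Usage: 12 errors out of 200 requests is 6.00%, which is above the 5% threshold
# error_rate_percent 12 200
# error_rate_exceeds 12 200 && echo "would fire"
```

Note that exactly 5% does not exceed the threshold, matching the strict `>` in the alert query.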

#### Test Alert Delivery

```bash
# Method 1: Create a test alert in SigNoz UI
# Go to Alerts → New Alert → Set a test condition that will fire

# Method 2: Fire a test alert via stress test
kubectl run memory-test --image=polinux/stress --restart=Never \
  --namespace=bakery-ia -- stress --vm 1 --vm-bytes 600M --timeout 300s

# Check alert appears in SigNoz Alerts tab
# https://monitoring.bakewise.ai/signoz → Alerts

# Also check AlertManager
# https://monitoring.bakewise.ai/alertmanager

# Verify email notification received

# Clean up test
kubectl delete pod memory-test -n bakery-ia
```

#### Configure Notification Channels

In SigNoz Alerts tab, configure channels:

1. **Email Channel:**
   - Already configured via AlertManager
   - Emails sent to addresses in signoz-values-prod.yaml

2. **Slack Channel (Optional):**
   ```bash
   # Add Slack webhook URL to signoz-values-prod.yaml
   # Under alertmanager.config.receivers.critical-alerts.slack_configs:
   #   - api_url: 'https://hooks.slack.com/services/YOUR/WEBHOOK/URL'
   #     channel: '#alerts-critical'
   ```

3. **Webhook Channel (Optional):**
   - Configure custom webhook for integration with PagerDuty, OpsGenie, etc.
   - Add to alertmanager.config.receivers

### Step 3: Configure Backups

```bash
# Create backup script on VPS
cat > ~/backup-databases.sh <<'EOF'
#!/bin/bash
# Stop on the first failed command so a broken dump is not silently archived
set -euo pipefail

BACKUP_ROOT="/backups"
BACKUP_NAME="$(date +%Y-%m-%d)"
BACKUP_DIR="$BACKUP_ROOT/$BACKUP_NAME"
mkdir -p "$BACKUP_DIR"

# Get all database pods
DBS=$(kubectl get pods -n bakery-ia -l app.kubernetes.io/component=database -o name)

for db in $DBS; do
  DB_NAME=$(echo "$db" | cut -d'/' -f2)
  echo "Backing up $DB_NAME..."

  # Without -d, pg_dump defaults to the database named after the connecting user.
  # Adjust -U (and add -d) if your services use per-database users and names.
  kubectl exec -n bakery-ia "$db" -- pg_dump -U postgres > "$BACKUP_DIR/${DB_NAME}.sql"
done

# Compress backups (paths stored relative to /backups)
tar -czf "$BACKUP_DIR.tar.gz" -C "$BACKUP_ROOT" "$BACKUP_NAME"
rm -rf "$BACKUP_DIR"

# Keep only last 7 days
find "$BACKUP_ROOT" -name "*.tar.gz" -mtime +7 -delete

echo "Backup completed: $BACKUP_DIR.tar.gz"
EOF

chmod +x ~/backup-databases.sh

# Test backup
~/backup-databases.sh

# Setup daily cron job (2 AM)
(crontab -l 2>/dev/null; echo "0 2 * * * ~/backup-databases.sh") | crontab -
```
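
A backup is only useful if it can be read back. As a first sanity check, short of a full restore, the following sketch unpacks an archive into a temporary directory and confirms it contains at least one `.sql` dump and no empty ones. The function name `verify_backup` is hypothetical, and the check assumes the archive layout produced by the script above. It does not prove that a dump is restorable.

```bash
# Verify a backup archive: it must be a readable gzip tarball, contain at
# least one .sql file, and contain no empty .sql file.
# Args: path to the .tar.gz archive. Returns 0 if all checks pass.
verify_backup() {
  local archive="$1" tmp rc=0
  tmp=$(mktemp -d) || return 1
  if ! tar -xzf "$archive" -C "$tmp" 2>/dev/null; then
    echo "unreadable archive: $archive"
    rm -rf "$tmp"
    return 1
  fi
  if [ -z "$(find "$tmp" -type f -name '*.sql')" ]; then
    echo "no .sql dumps in $archive"
    rc=1
  fi
  if [ -n "$(find "$tmp" -type f -name '*.sql' -size 0)" ]; then
    echo "empty dump(s) in $archive:"
    find "$tmp" -type f -name '*.sql' -size 0 -exec basename {} \;
    rc=1
  fi
  rm -rf "$tmp"
  return "$rc"
}

# Usage:
# verify_backup "/backups/$(date +%Y-%m-%d).tar.gz" && echo "backup looks sane"
```

Restoring one dump into a scratch database from time to time remains the only real proof that the backups work.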

### Step 3: Setup Alerting

```bash
# Update AlertManager configuration with your email
kubectl edit configmap -n monitoring alertmanager-config

# Update recipient emails in the routes section
```

### Step 4: Verify SigNoz Monitoring is Working

Before proceeding, ensure all monitoring components are operational:

```bash
# 1. Verify SigNoz pods are running
kubectl get pods -n bakery-ia -l app.kubernetes.io/instance=signoz

# Expected pods (all should be Running/Ready):
# - signoz-0 (or signoz-1, signoz-2 for HA)
# - signoz-otel-collector-xxx
# - signoz-alertmanager-xxx
# - signoz-clickhouse-xxx
# - signoz-zookeeper-xxx

# 2. Check SigNoz UI is accessible
curl -I https://monitoring.bakewise.ai/signoz
# Should return: HTTP/2 200

# 3. Verify OTel Collector is receiving data
kubectl logs -n bakery-ia deployment/signoz-otel-collector --tail=100 | grep -i "received"
# Should show: "Traces received: X" "Metrics received: Y" "Logs received: Z"

# 4. Check ClickHouse database is healthy
kubectl exec -n bakery-ia deployment/signoz-clickhouse -- clickhouse-client --query="SELECT count() FROM system.tables WHERE database LIKE 'signoz_%'"
# Should return a number > 0 (tables exist)
```
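
The URL check in item 2 can be scripted for both monitoring endpoints named in this guide. The sketch below is one possible way to do it. The function names `http_status` and `check_endpoints` are hypothetical, and treating any 2xx or 3xx response as healthy is an assumption (a login redirect, for instance, would count as OK).

```bash
# Print the HTTP status code for a URL (curl prints 000 if the connection fails).
# Args: url [timeout_seconds]
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time "${2:-10}" "$1"
}

# Report OK/FAIL for each URL given as an argument.
# Returns 1 if any URL answered with something other than 2xx/3xx.
check_endpoints() {
  local url code rc=0
  for url in "$@"; do
    code=$(http_status "$url")
    case "$code" in
      2??|3??) echo "OK   $code $url" ;;
      *)       echo "FAIL $code $url"; rc=1 ;;
    esac
  done
  return "$rc"
}

# Usage:
# check_endpoints https://monitoring.bakewise.ai/signoz https://monitoring.bakewise.ai/alertmanager
```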

**Complete Verification Checklist:**

- [ ] **SigNoz UI loads** at https://monitoring.bakewise.ai/signoz
- [ ] **Services tab shows all 18 microservices** with metrics
- [ ] **Traces tab has sample traces** from gateway and other services
- [ ] **Dashboards tab shows PostgreSQL metrics** from all 14 databases
- [ ] **Dashboards tab shows Redis metrics** (memory, commands, etc.)
- [ ] **Dashboards tab shows RabbitMQ metrics** (queues, messages)
- [ ] **Dashboards tab shows Kubernetes metrics** (nodes, pods)
- [ ] **Logs tab displays logs** from all services in bakery-ia namespace
- [ ] **Alerts tab is accessible** and can create new alerts
- [ ] **AlertManager** is reachable at https://monitoring.bakewise.ai/alertmanager

**If any checks fail, troubleshoot:**

```bash
# Check OTel Collector configuration
kubectl describe configmap -n bakery-ia signoz-otel-collector

# Check for errors in OTel Collector
kubectl logs -n bakery-ia deployment/signoz-otel-collector | grep -i error

# Check ClickHouse is accepting writes
kubectl logs -n bakery-ia deployment/signoz-clickhouse | grep -i error

# Restart OTel Collector if needed
kubectl rollout restart deployment/signoz-otel-collector -n bakery-ia
```

### Step 5: Document Everything

Create a secure runbook with all credentials and procedures:

**Essential Information to Document:**
- [ ] VPS login credentials (stored securely in password manager)
- [ ] Database passwords (in password manager)
- [ ] SigNoz admin password
- [ ] Domain registrar access (for bakewise.ai)
- [ ] Cloudflare access
- [ ] Email service credentials (SMTP)
- [ ] WhatsApp API credentials
- [ ] Docker Hub / Registry credentials
- [ ] Emergency contact information
- [ ] Rollback procedures
- [ ] Monitoring URLs and access procedures

### Step 6: Train Your Team

Conduct a training session covering SigNoz and operational procedures:

#### Part 1: SigNoz Navigation (30 minutes)

- [ ] **Login and Overview**
  - Show how to access https://monitoring.bakewise.ai/signoz
  - Navigate through main tabs: Services, Traces, Dashboards, Logs, Alerts
  - Explain the unified nature of SigNoz (all-in-one platform)

- [ ] **Services Tab - Application Performance Monitoring**
  - Show all 18 microservices
  - Explain RED metrics (Request rate, Error rate, Duration/latency)
  - Demo: Click on a service → Operations → See endpoint breakdown
  - Demo: Identify slow endpoints and high error rates

- [ ] **Traces Tab - Request Flow Debugging**
  - Show how to search for traces by service, operation, or time
  - Demo: Click on a trace → See full waterfall (service → database → cache)
  - Demo: Find slow database queries in trace spans
  - Demo: Click "View Logs" to correlate trace with logs

- [ ] **Dashboards Tab - Infrastructure Monitoring**
  - Navigate to PostgreSQL dashboard → Show all 14 databases
  - Navigate to Redis dashboard → Show cache metrics
  - Navigate to Kubernetes dashboard → Show node/pod metrics
  - Explain what metrics indicate issues (connection %, memory %, etc.)

- [ ] **Logs Tab - Log Search and Analysis**
  - Show how to filter by service, severity, time range
  - Demo: Search for "error" in last hour
  - Demo: Click on trace_id in log → Jump to related trace
  - Show Kubernetes metadata (pod, namespace, container)

- [ ] **Alerts Tab - Proactive Monitoring**
  - Show how to create alerts on metrics
  - Review pre-configured alerts
  - Show alert history and firing alerts
  - Explain how to acknowledge/silence alerts

#### Part 2: Operational Tasks (30 minutes)

- [ ] **Check application logs** (multiple ways)
  ```bash
  # Method 1: Via kubectl (for immediate debugging)
  kubectl logs -n bakery-ia deployment/orders-service --tail=100 -f

  # Method 2: Via SigNoz Logs tab (for analysis and correlation)
  # 1. Open https://monitoring.bakewise.ai/signoz → Logs
  # 2. Filter by k8s_deployment_name: orders-service
  # 3. Click on trace_id to see related request flow
  ```

- [ ] **Restart services when needed**
  ```bash
  # Restart a service (rolling update, no downtime)
  kubectl rollout restart deployment/orders-service -n bakery-ia

  # Verify restart in SigNoz:
  # 1. Check Services tab → orders-service → Should show brief dip then recovery
  # 2. Check Logs tab → Filter by orders-service → See restart logs
  ```

- [ ] **Investigate performance issues**
  ```bash
  # Scenario: "Orders API is slow"
  # 1. SigNoz → Services → orders-service → Check P99 latency
  # 2. SigNoz → Traces → Filter service:orders-service, duration:>1s
  # 3. Click on slow trace → Identify bottleneck (DB query? External API?)
  # 4. SigNoz → Dashboards → PostgreSQL → Check orders_db connections/queries
  # 5. Fix identified issue (add index, optimize query, scale service)
  ```

- [ ] **Respond to alerts**
  - Show how to access alerts in SigNoz → Alerts tab
  - Show AlertManager UI at https://monitoring.bakewise.ai/alertmanager
  - Review common alerts and their resolution steps
  - Reference the [Production Operations Guide](./PRODUCTION_OPERATIONS_GUIDE.md)
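
For the restart task in Part 2, `kubectl rollout restart` returns immediately, so it is worth pairing it with `kubectl rollout status` to confirm the new pods actually became ready. A small wrapper along these lines could be used during training. The function name `restart_and_wait` and the 180-second default timeout are assumptions.

```bash
# Restart a deployment and block until the rollout finishes or times out.
# Args: deployment_name [namespace] [timeout]
restart_and_wait() {
  local name="$1" ns="${2:-bakery-ia}" timeout="${3:-180s}"
  kubectl rollout restart "deployment/$name" -n "$ns" &&
    kubectl rollout status "deployment/$name" -n "$ns" --timeout="$timeout"
}

# Usage:
# restart_and_wait orders-service
```

A non-zero exit code means either the restart was rejected or the rollout did not complete within the timeout.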

#### Part 3: Documentation and Resources (10 minutes)

- [ ] **Share documentation**
  - [PILOT_LAUNCH_GUIDE.md](./PILOT_LAUNCH_GUIDE.md) - This guide (deployment)
  - [PRODUCTION_OPERATIONS_GUIDE.md](./PRODUCTION_OPERATIONS_GUIDE.md) - Daily operations with SigNoz
  - [security-checklist.md](./security-checklist.md) - Security procedures

- [ ] **Bookmark key URLs**
  - SigNoz: https://monitoring.bakewise.ai/signoz
  - AlertManager: https://monitoring.bakewise.ai/alertmanager
  - Production app: https://bakewise.ai

- [ ] **Setup on-call rotation** (if applicable)
  - Configure rotation schedule in AlertManager
  - Document escalation procedures
  - Test alert delivery to on-call phone/email

#### Part 4: Hands-On Exercise (15 minutes)

**Exercise: Investigate a Simulated Issue**

1. Create a load test to generate traffic
2. Use SigNoz to find the slowest endpoint
3. Identify the root cause using traces
4. Correlate with logs to confirm
5. Check infrastructure metrics (DB, memory, CPU)
6. Propose a fix based on findings

This trains the team to use SigNoz effectively for real incidents.
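
For step 1 of the exercise, a dedicated load-testing tool is the better choice, but a plain `curl` loop is enough to produce visible traffic in SigNoz. The following is a minimal sketch with hypothetical function names (`load_test`, `summarise`). It sends requests sequentially, so it generates light load only, and the request count should be kept modest when pointed at the production URL.

```bash
# Send N sequential GET requests and print "<status> <seconds>" per request.
# Args: url [request_count]
load_test() {
  local url="$1" n="${2:-50}" i=0
  while [ "$i" -lt "$n" ]; do
    curl -s -o /dev/null -w '%{http_code} %{time_total}\n' --max-time 10 "$url"
    i=$((i + 1))
  done
}

# Summarise load_test output: request count, non-2xx count, slowest request.
summarise() {
  awk '{ n++; if ($1 !~ /^2/) bad++; if ($2 > max) max = $2 }
       END { printf "requests=%d errors=%d max_seconds=%.3f\n", n, bad, max }'
}

# Usage:
# load_test https://bakewise.ai/api/v1/health 100 | summarise
```

The slowest request reported here gives a starting point for the trace search in step 2.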

---

## Troubleshooting

### Issue: Pods Not Starting

```bash
# Check pod status
kubectl describe pod POD_NAME -n bakery-ia

# Common causes:
# 1. Image pull errors
kubectl get events -n bakery-ia | grep -i "pull"

# 2. Resource limits
kubectl describe node

# 3. Volume mount issues
kubectl get pvc -n bakery-ia
```

### Issue: Certificate Not Issuing

```bash
# Check certificate status
kubectl describe certificate bakery-ia-prod-tls-cert -n bakery-ia

# Check cert-manager logs
kubectl logs -n cert-manager deployment/cert-manager

# Check challenges
kubectl get challenges -n bakery-ia

# Verify DNS is correct
nslookup bakewise.ai
```
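
Once a certificate has been issued, its expiry date can be checked from outside the cluster. The sketch below reads the `notAfter` date with `openssl` and converts it to days remaining. The function names are hypothetical, and `days_until` relies on GNU `date -d`, so it is assumed to run on a Linux host such as the VPS.

```bash
# Print the notAfter date of the certificate served on <host>:443.
cert_end_date() {
  openssl s_client -connect "$1:443" -servername "$1" </dev/null 2>/dev/null \
    | openssl x509 -noout -enddate | cut -d= -f2
}

# Whole days between now (or a supplied epoch) and an openssl-style date string.
# Args: date_string [now_epoch]. Requires GNU date.
days_until() {
  local end_epoch now="${2:-$(date +%s)}"
  end_epoch=$(date -u -d "$1" +%s) || return 1
  echo $(( (end_epoch - now) / 86400 ))
}

# Usage:
# days_until "$(cert_end_date bakewise.ai)"
```

cert-manager normally renews Let's Encrypt certificates well before they expire, so a low number here suggests renewal is failing.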

### Issue: Database Connection Errors

```bash
# Check database pod
kubectl get pods -n bakery-ia -l app.kubernetes.io/component=database

# Check database logs
kubectl logs -n bakery-ia deployment/auth-db

# Test connection from service pod
kubectl exec -n bakery-ia deployment/auth-service -- nc -zv auth-db 5432
```

### Issue: Services Can't Connect to Databases

```bash
# Check if SSL is enabled
kubectl exec -n bakery-ia deployment/auth-db -- sh -c \
  'psql -U auth_user -d auth_db -c "SHOW ssl;"'

# Check service logs for SSL errors
kubectl logs -n bakery-ia deployment/auth-service | grep -i "ssl\|tls"

# Restart service to pick up new SSL config
kubectl rollout restart deployment/auth-service -n bakery-ia
```

### Issue: Out of Resources

```bash
# Check node resources
kubectl top nodes

# Check pod resource usage
kubectl top pods -n bakery-ia

# Identify resource hogs
kubectl top pods -n bakery-ia --sort-by=memory

# Scale down non-critical services temporarily
kubectl scale deployment monitoring -n bakery-ia --replicas=0
```
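
When the node is under memory pressure, it can help to reduce the `kubectl top pods` listing to only the pods above a chosen size. This sketch parses the MEMORY column (values such as `900Mi` or `1Gi`). The function name `pods_over_memory` and the 512 MiB default are arbitrary assumptions for illustration.

```bash
# Print "<MiB> <pod>" for pods using more than a given number of MiB,
# largest first. Reads `kubectl top pods --no-headers` output on stdin.
# Args: [limit_in_MiB]
pods_over_memory() {
  awk -v limit="${1:-512}" '{
    mem = $3                                     # MEMORY column, e.g. "900Mi"
    if (mem ~ /Gi$/)      { sub(/Gi$/, "", mem); mem *= 1024 }
    else if (mem ~ /Mi$/) { sub(/Mi$/, "", mem) }
    else if (mem ~ /Ki$/) { sub(/Ki$/, "", mem); mem /= 1024 }
    else next                                    # unknown unit: skip the line
    if (mem + 0 > limit + 0) printf "%d %s\n", mem, $1
  }' | sort -rn
}

# Usage:
# kubectl top pods -n bakery-ia --no-headers | pods_over_memory 512
```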

---

## Next Steps After Successful Launch

1. **Monitor for 48 Hours**
   - Check dashboards daily
   - Review error logs
   - Monitor resource usage
   - Test all functionality

2. **Optimize Based on Metrics**
   - Adjust resource limits if needed
   - Fine-tune autoscaling thresholds
   - Optimize database queries if slow

3. **Onboard First Tenant**
   - Create test tenant
   - Upload sample data
   - Test all features
   - Gather feedback

4. **Scale Gradually**
   - Add 1-2 tenants at a time
   - Monitor resource usage
   - Upgrade VPS if needed (see scaling guide)

5. **Plan for Growth**
   - Review [PRODUCTION_OPERATIONS_GUIDE.md](./PRODUCTION_OPERATIONS_GUIDE.md)
   - Implement additional monitoring
   - Plan capacity upgrades
   - Consider managed services for scale

---

## Cost Scaling Path

| Tenants | RAM | CPU | Storage | Monthly Cost |
|---------|-----|-----|---------|--------------|
| 10 | 20 GB | 8 cores | 200 GB | €40-80 |
| 25 | 32 GB | 12 cores | 300 GB | €80-120 |
| 50 | 48 GB | 16 cores | 500 GB | €150-200 |
| 100+ | Consider multi-node cluster or managed K8s | - | - | €300+ |

---

## Support Resources

**Documentation:**
- **Operations Guide:** [PRODUCTION_OPERATIONS_GUIDE.md](./PRODUCTION_OPERATIONS_GUIDE.md) - Daily operations, monitoring, incident response
- **Security Guide:** [security-checklist.md](./security-checklist.md) - Security procedures and compliance
- **Database Security:** [database-security.md](./database-security.md) - Database operations and TLS configuration
- **TLS Configuration:** [tls-configuration.md](./tls-configuration.md) - Certificate management
- **RBAC Implementation:** [rbac-implementation.md](./rbac-implementation.md) - Access control

**Monitoring Access:**
- **SigNoz (Primary):** https://monitoring.bakewise.ai/signoz - All-in-one observability
  - Services: Application performance monitoring (APM)
  - Traces: Distributed tracing across all services
  - Dashboards: PostgreSQL, Redis, RabbitMQ, Kubernetes metrics
  - Logs: Centralized log management with trace correlation
  - Alerts: Alert configuration and management
- **AlertManager:** https://monitoring.bakewise.ai/alertmanager - Alert routing and notifications

**External Resources:**
- **MicroK8s Docs:** https://microk8s.io/docs
- **Kubernetes Docs:** https://kubernetes.io/docs
- **Let's Encrypt:** https://letsencrypt.org/docs
- **Cloudflare DNS:** https://developers.cloudflare.com/dns
- **SigNoz Documentation:** https://signoz.io/docs/
- **OpenTelemetry Documentation:** https://opentelemetry.io/docs/

**Monitoring Architecture:**
- **OpenTelemetry:** Industry-standard instrumentation framework
  - Auto-instruments FastAPI, HTTPX, SQLAlchemy, Redis
  - Collects traces, metrics, and logs from all services
  - Exports to SigNoz via OTLP protocol (gRPC port 4317, HTTP port 4318)
- **SigNoz Components:**
  - **Frontend:** Web UI for visualization and analysis
  - **OTel Collector:** Receives and processes telemetry data
  - **ClickHouse:** Time-series database for fast queries
  - **AlertManager:** Alert routing and notification delivery
  - **Zookeeper:** Coordination service for ClickHouse cluster
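
To confirm the OTLP HTTP path (port 4318) end to end without touching any application, a single hand-built span can be posted to the collector. The sketch below builds a minimal OTLP/JSON trace payload. The helper names, the `smoke-test` service name, and the Kubernetes service name `signoz-otel-collector` used in the port-forward are assumptions, so adjust them to match your release.

```bash
# Random lowercase hex string of N bytes (2N characters).
rand_hex() {
  od -An -tx1 -N"$1" /dev/urandom | tr -d ' \n'
}

# Build a minimal OTLP/JSON trace payload containing one half-second span.
# Args: [service_name]
otlp_test_payload() {
  local service="${1:-smoke-test}" now
  now=$(date +%s)
  cat <<EOF
{"resourceSpans":[{"resource":{"attributes":[
  {"key":"service.name","value":{"stringValue":"$service"}}]},
 "scopeSpans":[{"scope":{"name":"manual-smoke-test"},"spans":[{
  "traceId":"$(rand_hex 16)","spanId":"$(rand_hex 8)",
  "name":"smoke-test-span","kind":1,
  "startTimeUnixNano":"${now}000000000","endTimeUnixNano":"${now}500000000"}]}]}]}
EOF
}

# Usage (service name is an assumption; check `kubectl get svc -n bakery-ia`):
# kubectl port-forward -n bakery-ia svc/signoz-otel-collector 4318:4318 &
# otlp_test_payload | curl -s -X POST -H 'Content-Type: application/json' \
#   --data-binary @- http://localhost:4318/v1/traces
```

If the collector accepts the span, a `smoke-test` service should appear in the SigNoz Traces tab shortly afterwards.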

---

## Summary Checklist

### Pre-Deployment Configuration (LOCAL MACHINE)
- [x] **Production secrets configured** - ✅ JWT, database passwords, API keys (ALREADY DONE)
- [ ] **External service credentials** - Update SMTP, WhatsApp, Stripe in secrets.yaml
- [ ] **imagePullSecrets removed** - Delete from all 67 manifests
- [ ] **Image tags updated** - Change all 'latest' to v1.0.0 (semantic version)
- [x] **SigNoz namespace fixed** - ✅ Already done (bakery-ia namespace)
- [x] **Cert-manager email updated** - ✅ Already set to admin@bakewise.ai
- [ ] **Stripe publishable key updated** - Replace `pk_test_...` with production key in configmap.yaml
- [x] **Pilot mode verified** - ✅ VITE_PILOT_MODE_ENABLED=true (default is correct)
- [ ] **Manifests validated** - No 'latest' tags, no imagePullSecrets remaining

### Infrastructure Setup
- [ ] VPS provisioned and accessible
- [ ] MicroK8s (or another Kubernetes distribution) installed and configured
- [ ] nginx-ingress-controller installed
- [ ] metrics-server installed and working
- [ ] cert-manager installed
- [ ] local-path-provisioner installed
- [ ] Domain registered and DNS configured
- [ ] Cloudflare protection enabled (optional but recommended)

### Secrets and Configuration
- [ ] TLS certificates generated (postgres, redis)
- [ ] Email service configured and tested
- [ ] WhatsApp API setup (optional for launch)
- [ ] Container images built and pushed with version tags
- [ ] Production configs verified (domains, CORS, storage class)
- [ ] Strong passwords generated for all services
- [ ] Docker registry secret created (dockerhub-creds)
- [ ] Application secrets applied

### Monitoring
- [ ] SigNoz deployed via Helm
- [ ] SigNoz pods running and healthy
- [ ] SigNoz in bakery-ia namespace

### CI/CD Infrastructure (Optional)
- [ ] Gitea deployed and accessible
- [ ] Gitea admin user created
- [ ] Repository created and code pushed
- [ ] Tekton Pipelines installed
- [ ] Tekton Triggers configured
- [ ] Tekton Helm chart deployed
- [ ] Webhook configured in Gitea
- [ ] Flux CD installed
- [ ] GitRepository and Kustomization configured
- [ ] End-to-end pipeline test successful

### Email Infrastructure (Optional - Mailu)
- [ ] Unbound DNS resolver deployed
- [ ] CoreDNS configured for DNSSEC
- [ ] Mailu TLS certificate created
- [ ] Mailu deployed via Helm
- [ ] Admin user created
- [ ] DKIM record added to DNS
- [ ] Email sending/receiving tested

### Geocoding (Optional - Nominatim)
- [ ] Nominatim deployed
- [ ] Health check passing
- [ ] Application configured to use Nominatim

### Application Deployment
- [ ] All pods running successfully
- [ ] Databases accepting TLS connections
- [ ] Let's Encrypt certificates issued
- [ ] Frontend accessible via HTTPS
- [ ] API health check passing
- [ ] Test user can login
- [ ] Email delivery working
- [ ] SigNoz monitoring accessible
- [ ] Metrics flowing to SigNoz
- [ ] **Pilot coupon verified** - Check tenant-service logs for "Pilot coupon created successfully"

### Post-Deployment
- [ ] Backups configured and tested
- [ ] Team trained on operations
- [ ] Documentation complete
- [ ] Emergency procedures documented
- [ ] Monitoring alerts configured

---

**🎉 Congratulations! Your Bakery-IA platform is now live in production!**

*Estimated total time: 2-4 hours for first deployment*
*Subsequent updates: 15-30 minutes*

---

**Document Version:** 3.0
**Last Updated:** 2026-01-21
**Maintained By:** DevOps Team

**Changes in v3.0:**
- **NEW: Infrastructure Architecture Overview** - Added component layers diagram and deployment dependencies
- **NEW: CI/CD Infrastructure Deployment** - Complete guide for Gitea, Tekton, and Flux CD
  - Step-by-step Gitea installation with container registry
  - Tekton Pipelines and Triggers setup via Helm
  - Flux CD GitOps configuration
  - Webhook integration and end-to-end testing
  - Troubleshooting guide for CI/CD issues
- **NEW: Mailu Email Server Deployment** - Comprehensive self-hosted email setup
  - Unbound DNS resolver deployment for DNSSEC
  - CoreDNS configuration for mail authentication
  - Mailu Helm deployment with all components
  - DKIM/SPF/DMARC configuration
  - Troubleshooting common Mailu issues
- **NEW: Nominatim Geocoding Service** - Address lookup service deployment
- **NEW: SigNoz Monitoring Deployment** - Dedicated section (previously embedded)
- **UPDATED: Table of Contents** - Reorganized with new sections (18 sections total)
- **UPDATED: Summary Checklist** - Added CI/CD, Email, and Geocoding verification items
- **UPDATED: Infrastructure Components Summary** - Added all optional components with namespaces

**Changes in v2.1:**
- Updated DNS configuration for Namecheap (primary) with Cloudflare as optional
- Clarified MicroK8s ingress class is `public` (not `nginx`)
- Updated Let's Encrypt ClusterIssuer documentation to reference pre-configured files
- Added firewall requirements for clouding.io VPS
- Emphasized port 80/443 requirements for HTTP-01 challenges

**Changes in v2.0:**
- Added critical pre-deployment fixes section
- Updated infrastructure setup for MicroK8s
- Added required component installation (nginx-ingress, metrics-server, etc.)
- Updated configuration steps with domain replacement
- Added Docker registry secret creation
- Added SigNoz Helm deployment before application
- Updated storage class configuration
- Added image tag version requirements
- Expanded verification checklist